Generic Document Scanner UI Components | Android Document Scanner

Introduction

The Scanbot SDK provides the ability to detect various types of documents in an image, crop them and recognize data fields via the Generic Document Recognizer.

Currently, the Generic Document Recognizer supports the following types of documents:

German ID Card
German Passport
German Driver's License
German Residence Permit
European Health Insurance Card (EHIC)

The Generic Document scanner is available both as an RTU UI and a classic component (types of components are explained here).

Integration

Take a look at our Example Apps to see how to integrate the Generic Document Scanner.

Ready-To-Use UI: ready-to-use-ui-demo
Classic UI Components: Generic Document Recognizer Example For Live Detection or Generic Document Recognizer Example For Auto Snapping

Add Feature as a Dependency

GenericDocumentRecognizer is available with SDK Package 3 (Data Capture Modules). You have to add the following dependencies for it:

implementation("io.scanbot:sdk-package-3:$latestSdkVersion")
implementation("io.scanbot:sdk-genericdocument-assets:$latestSdkVersion")

caution

Do not run multiple scanners at the same time. For example, do not combine the generic document scanner, the health insurance scanner, and the text data scanner. Each scanner instance requires a lot of memory, GPU, and processor resources, so running several at once will degrade the performance of the entire application.

Initialize the SDK

The Scanbot SDK must be initialized before use. Add the following code snippet to your Application class:

import io.scanbot.sdk.ScanbotSDKInitializer

...

ScanbotSDKInitializer()
    ...
    .initialize(this)

caution

We have noticed that devices using a Cortex A53 processor do not support GPU acceleration. If you encounter any problems, please disable GPU acceleration for these devices:

ScanbotSDKInitializer()
        .allowGpuAcceleration(false)
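
One way to apply this workaround only where it is needed is to check the CPU model at startup. The following is a minimal sketch and not part of the Scanbot SDK: the helper name reportsCortexA53 is hypothetical, and it assumes the device lists ARM "CPU part" entries in /proc/cpuinfo, where the Cortex-A53 part number is 0xd03.

```kotlin
// Hypothetical helper (not an SDK API): returns true if the given
// /proc/cpuinfo text lists at least one core whose "CPU part" value is
// 0xd03, the ARM part number of the Cortex-A53.
fun reportsCortexA53(cpuInfo: String): Boolean =
    cpuInfo.lineSequence()
        .map { it.trim() }
        .filter { it.startsWith("CPU part", ignoreCase = true) }
        .map { it.substringAfter(':').trim().lowercase() }
        .any { it == "0xd03" }

fun main() {
    // Shortened sample of what /proc/cpuinfo may contain on such a device.
    val sample = """
        processor       : 0
        CPU implementer : 0x41
        CPU part        : 0xd03
    """.trimIndent()
    println(reportsCortexA53(sample))
}
```

On a device, the input could be read with File("/proc/cpuinfo").readText(), and the negated result could then be passed to allowGpuAcceleration. Treat this as a starting point: the file format can vary between devices.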

Ready-To-Use UI Component

The Ready-To-Use UI component (activity) responsible for scanning documents supported by the Generic Document Recognizer is GenericDocumentRecognizerActivity.

Have a look at our end-to-end working example of the RTU components usage here.

Starting and configuring RTU Generic Document scanner

First of all, you have to add the SDK package and feature dependencies as described here.

Initialize the SDK as described here. More information about the SDK license initialization can be found here.

To use any of the RTU UI components you need to include the corresponding dependency in your build.gradle file:

implementation("io.scanbot:sdk-package-ui:$scanbotSdkVersion")

Get the latest $scanbotSdkVersion from the Changelog.

To start the RTU Generic Document scanner you only have to start a new activity and be ready to process its result later.

info

Starting from version 1.90.0, the SDK RTU components contain predefined AndroidX Result API contracts. They handle part of the boilerplate for starting the RTU activity component and mapping the result once it finishes.

If your code still relies on Android's deprecated startActivityForResult API, use the other approach we offer for this case.

The AndroidX Result API approach is shown first, followed by the old 'startActivityForResult' approach:

val genericDocumentResult: ActivityResultLauncher<GenericDocumentRecognizerConfiguration>

...

genericDocumentResult =
    activity.registerForActivityResultOk(GenericDocumentRecognizerActivity.ResultContract()) { result ->
        val resultWrappers = result.result!!
        val firstWrapper = resultWrappers.first()
        val document = scanbotSDK.resultRepositoryForClass(firstWrapper.clazz).getResultAndErase(firstWrapper.resultId)

        Toast.makeText(
            activity,
            document?.fields?.map { "${it.type.name} = ${it.value?.text}" }.toString(),
            Toast.LENGTH_LONG
        ).show()
    }

...

myButton.setOnClickListener {
    val genericDocumentConfiguration = GenericDocumentRecognizerConfiguration()
    genericDocumentResult.launch(genericDocumentConfiguration)
}

myButton.setOnClickListener {
    val genericDocumentConfiguration = GenericDocumentRecognizerConfiguration()
    val intent = GenericDocumentRecognizerActivity.newIntent(this, genericDocumentConfiguration)
    startActivityForResult(intent, GENERIC_DOCUMENT_RECOGNIZER_DEFAULT_UI)
}

override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
    super.onActivityResult(requestCode, resultCode, data)

    if (requestCode == GENERIC_DOCUMENT_RECOGNIZER_DEFAULT_UI) {
        val result: Result = GenericDocumentRecognizerActivity.extractResult(resultCode, data)
        if (!result.resultOk) {
            return
        }

        // Get the list of ResultWrapper objects from the result
        val resultWrappers = result.result ?: return

        val firstWrapper = resultWrappers.first()
        val document = scanbotSDK.resultRepositoryForClass(firstWrapper.clazz).getResultAndErase(firstWrapper.resultId)

        Toast.makeText(
            this,
            document?.fields?.map { "${it.type.name} = ${it.value?.text}" }.toString(),
            Toast.LENGTH_LONG
        ).show()
    }
}

info

We offer some syntactic sugar for handling the result from RTU components via AndroidX Result API:

Every RTU component's activity contains a Result class which, along with the resultCode value, exposes a Boolean resultOk property. It is true if resultCode equals Activity.RESULT_OK.
When you only expect the Activity.RESULT_OK result code, you can use the AppCompatActivity.registerForActivityResultOk extension method instead of registerForActivityResult. It is triggered only when a non-null result entity is present.
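
To make these two conveniences concrete, here is a minimal plain-Kotlin model of the idea. It is not the SDK's code: RtuResult and onResultOk are hypothetical stand-ins, and the constants only mirror the values of Activity.RESULT_OK and Activity.RESULT_CANCELED so that the sketch runs without Android classes.

```kotlin
// Mirror android.app.Activity.RESULT_OK and RESULT_CANCELED.
const val RESULT_OK = -1
const val RESULT_CANCELED = 0

// Hypothetical stand-in for an RTU component's Result class.
data class RtuResult<T>(val resultCode: Int, val result: T?) {
    // True only when the activity finished with RESULT_OK.
    val resultOk: Boolean get() = resultCode == RESULT_OK
}

// Hypothetical analogue of registerForActivityResultOk: the callback runs
// only for an OK result that carries a non-null payload.
fun <T : Any> RtuResult<T>.onResultOk(callback: (T) -> Unit) {
    val payload = result
    if (resultOk && payload != null) callback(payload)
}

fun main() {
    // The callback runs: the result is OK and a payload is present.
    RtuResult(RESULT_OK, listOf("document")).onResultOk { println("Got $it") }
    // The callback is skipped: the user cancelled, so there is no payload.
    RtuResult<List<String>>(RESULT_CANCELED, null).onResultOk { println("never printed") }
}
```

The benefit of this pattern is that the callback body never has to check the result code or handle a missing payload.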

caution

Always use the corresponding activity's static newIntent method to create the intent when starting the RTU UI activity with the deprecated startActivityForResult approach. Creating an android.content.Intent object with its constructor (passing the activity's class as a parameter) will cause the RTU UI component to malfunction.

An instance of GenericDocumentRecognizerConfiguration is required for starting the RTU UI activity. It allows configuration changes through methods it exposes:

val genericDocumentConfiguration = GenericDocumentRecognizerConfiguration()

// Apply the color configuration
genericDocumentConfiguration.setTopBarButtonsInactiveColor(ContextCompat.getColor(this, R.color.white))
genericDocumentConfiguration.setTopBarBackgroundColor(ContextCompat.getColor(this, R.color.colorPrimaryDark))
...

// Apply the text configuration
genericDocumentConfiguration.setClearButtonTitle(context.getString(R.string.clear_button))
genericDocumentConfiguration.setSubmitButtonTitle(context.getString(R.string.submit_button))
...

// Apply the parameters for fields
genericDocumentConfiguration.setFieldsDisplayConfiguration(
    hashMapOf(
        // Use constants from NormalizedFieldNames objects from the corresponding document type
        DePassport.NormalizedFieldNames.PHOTO to FieldProperties(
            "My passport photo",
            FieldProperties.DisplayState.AlwaysVisible
        ),
        MRZ.NormalizedFieldNames.CHECK_DIGIT to FieldProperties(
            "Check digit",
            FieldProperties.DisplayState.AlwaysVisible
        )
        ...
    )
)

Excluding fields from scanning in RTU UI

It is also possible to exclude certain fields from the scanning process altogether. The recognizer will not attempt to recognize excluded fields at all, which is useful for security and/or privacy reasons. All other fields are scanned as usual. Specify the fields to exclude using normalized field names only.

// Exclude some document fields from being recognized
genericDocumentConfiguration.setExcludedFieldTypes(
    hashSetOf(
        DeIdCardFront.NormalizedFieldNames.PHOTO,
        DeIdCardFront.NormalizedFieldNames.CARD_ACCESS_NUMBER,
        DePassport.NormalizedFieldNames.PHOTO,
        DePassport.NormalizedFieldNames.SIGNATURE,
        DeIdCardBack.NormalizedFieldNames.EYE_COLOR
    )
)

info

All parameters in GenericDocumentRecognizerConfiguration are optional.

Full API references for these methods can be found on this page.

Handling the result

Below is a simple example of handling the result: List<ResultWrapper<GenericDocument>>.

For simplicity, we take only the first document:

val firstResultWrapper = resultWrappers.first()

Get the ResultRepository from the ScanbotSDK instance. scanbotSDK was created in onCreate via ScanbotSDK(context).

val resultRepository = scanbotSDK.resultRepositoryForClass(firstResultWrapper.clazz)

Receive an instance of GenericDocument class from the repository. This call will also remove the result from the repository (to optimize the memory usage):

val genericDocument = resultRepository.getResultAndErase(firstResultWrapper.resultId)

If you do not need to remove the result, use resultRepository.getResult(firstResultWrapper.resultId). Please note that this repository does not use persistent storage and is based on an LRU cache. The repository will be cleared if the app process is terminated or the memory consumption is too high.
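
The difference between the two calls, and the effect of an LRU cache, can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the SDK's implementation: LruResultStore, its put method, and the fixed capacity are hypothetical, and only the method names getResult and getResultAndErase are taken from the text above.

```kotlin
import java.util.LinkedHashMap

// Hypothetical in-memory stand-in for the result repository (not SDK code).
// A LinkedHashMap with accessOrder = true keeps entries ordered from least
// to most recently used; removeEldestEntry evicts once capacity is exceeded.
class LruResultStore<T>(private val capacity: Int) {
    private val cache = object : LinkedHashMap<String, T>(16, 0.75f, true) {
        override fun removeEldestEntry(eldest: MutableMap.MutableEntry<String, T>?): Boolean =
            size > capacity
    }

    fun put(id: String, value: T) {
        cache[id] = value
    }

    // Returns the value and keeps it cached.
    fun getResult(id: String): T? = cache[id]

    // Returns the value and removes it, freeing the memory immediately.
    fun getResultAndErase(id: String): T? = cache.remove(id)
}

fun main() {
    val store = LruResultStore<String>(capacity = 2)
    store.put("a", "first")
    store.put("b", "second")
    store.getResult("a")                  // "a" becomes the most recently used entry
    store.put("c", "third")               // evicts "b", the least recently used entry
    println(store.getResult("b"))         // null
    println(store.getResultAndErase("a")) // first
    println(store.getResult("a"))         // null
}
```

The practical consequence is the same as described above: read a result promptly after the activity returns, and do not rely on it still being available later.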

Now you can simply show the detected document fields in a Toast notification:

Toast.makeText(
    activity,
    genericDocument?.fields?.joinToString("\n") { "${it.type.name} = ${it.value?.text}" } ?: "",
    Toast.LENGTH_LONG
).show()

It is also possible to use the GenericDocumentWrapper subclasses to work with strongly typed objects and conveniently access the fields of the corresponding document.

To receive an instance of the scanned document wrapper, use GenericDocumentLibrary or the wrap() extension function.

See full example on GitHub

Full API references for the result wrapper class are available here and for GenericDocument class here.

Classic component

To integrate the classic component of the Generic Document Scanner you can take a look at our Generic Document Recognizer Example For Live Detection, Generic Document Recognizer Example For Auto Snapping or check the following step-by-step integration instructions.

GenericDocumentRecognizer can be used both in conjunction with ScanbotCameraXView (e.g. live detection for preview) and by itself for detection on a Bitmap or JPEG byte array. Let's have a look at an example with ScanbotCameraXView.

Add feature dependencies and initialize the SDK

First of all, you have to add the SDK package and feature dependencies as described here.

Initialize the SDK as described here. More information about the SDK license initialization can be found here.

Add `ScanbotCameraXView` to layout

<io.scanbot.sdk.ui.camera.ScanbotCameraXView
    android:id="@+id/camera_view"
    android:layout_width="match_parent"
    android:layout_height="match_parent" />

Get `GenericDocumentRecognizer` instance from `ScanbotSDK`, set the required document types and blurriness acceptance score, then attach it to `ScanbotCameraXView`

val scanbotSdk = ScanbotSDK(this)

// Please note that each call to this method will create a new instance of GenericDocumentRecognizer
// It should be used on a "single instance per screen" basis
val genericDocumentRecognizer = scanbotSdk.createGenericDocumentRecognizer()
genericDocumentRecognizer.acceptedSharpnessScore = 80f

// Uncomment to scan only ID cards and passports
// genericDocumentRecognizer.acceptedDocumentTypes = listOf(
//     RootDocumentType.DePassport,
//     RootDocumentType.DeIdCardFront,
//     RootDocumentType.DeIdCardBack
// )

// Uncomment to scan only Driver's licenses
// genericDocumentRecognizer.acceptedDocumentTypes = listOf(
//     RootDocumentType.DeDriverLicenseFront,
//     RootDocumentType.DeDriverLicenseBack
// )

// Uncomment to scan only Residence permit cards
// genericDocumentRecognizer.acceptedDocumentTypes = listOf(
//     RootDocumentType.DeResidencePermitFront,
//     RootDocumentType.DeResidencePermitBack
// )

// Uncomment to scan only back side of European health insurance cards
// genericDocumentRecognizer.acceptedDocumentTypes = listOf(
//     RootDocumentType.EuropeanHealthInsuranceCard
// )

// Uncomment to scan only front side of German health insurance cards
// genericDocumentRecognizer.acceptedDocumentTypes = listOf(
//     RootDocumentType.DeHealthInsuranceCardFront
// )

// To scan all the supported document types (default value)
genericDocumentRecognizer.acceptedDocumentTypes = RootDocumentType.ALL_TYPES

val frameHandler = GenericDocumentRecognizerFrameHandler.attach(cameraView, genericDocumentRecognizer)

Excluding fields from scanning for genericDocumentRecognizer

// Exclude some document fields from being recognized
genericDocumentRecognizer.excludedFieldTypes = setOf(
    DeIdCardFront.NormalizedFieldNames.PHOTO,
    DeIdCardFront.NormalizedFieldNames.CARD_ACCESS_NUMBER,
    DePassport.NormalizedFieldNames.PHOTO,
    DePassport.NormalizedFieldNames.SIGNATURE,
    DeIdCardBack.NormalizedFieldNames.EYE_COLOR
)

Add a result handler for `GenericDocumentRecognizerFrameHandler`

Add a frame handler which, for example, observes consecutive successful recognition statuses and shows a toast notification whenever two or more such statuses are received.

See full example on GitHub

The handle(result: FrameHandlerResult<GenericDocumentRecognitionResult, SdkLicenseError>) method is triggered every time GenericDocumentRecognizer detects a document in the camera preview frame or when a license error occurs.

If the result of the scanning was successful, the user gets the GenericDocumentRecognitionResult object which contains a cropped document image and a GenericDocument object. Each field is represented by the Field class, holding the field's type, cropped visual source, recognized text and confidence level value.
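
The "consecutive successful recognition statuses" logic mentioned for the result handler can be sketched independently of the SDK. The class below is a hypothetical helper, not an SDK class: it only counts successes in a row and reports when a threshold (two by default) is reached. How a success is read from the frame handler result is left out, since that depends on the SDK types.

```kotlin
// Hypothetical helper (not an SDK class): tracks how many successful
// recognitions arrived in a row and reports when a threshold is reached.
class ConsecutiveSuccessCounter(private val threshold: Int = 2) {
    private var streak = 0

    // Call once per frame result. Returns true when the current streak
    // of successes has reached the threshold; any failure resets the streak.
    fun onFrame(success: Boolean): Boolean {
        streak = if (success) streak + 1 else 0
        return streak >= threshold
    }
}

fun main() {
    val counter = ConsecutiveSuccessCounter(threshold = 2)
    val frames = listOf(true, false, true, true, true)
    // Prints [false, false, false, true, true]
    println(frames.map { counter.onFrame(it) })
}
```

Inside the handle method, onFrame would be called with whether the current frame was recognized successfully, and the toast would be shown only when it returns true. Requiring a streak filters out one-off detections on blurry or partial frames.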

You can now run your app and should see a simple camera preview that can scan your documents.

Pass snapped picture to `GenericDocumentRecognizer`, process results

First, decode the image ByteArray obtained from the camera's callback, taking into account the image orientation. Our ImageProcessor component can be used for this:

val resultBitmap = ImageProcessor(image).rotate(imageOrientation).processedBitmap()

Next, we perform a recognition:

val recognitionResult = genericDocumentRecognizer.scanBitmap(resultBitmap)

As an example of further application, set the obtained scan parameters to a TextView:

val myTextView = findViewById<TextView>(R.id.my_text_view)

val resultsMessage = "Recognition results:\n" +
    "Recognition status: ${recognitionResult.status}\n" +
    "Card type: ${recognitionResult.document?.type}\n" +
    "Number of fields scanned: ${recognitionResult.document?.fields?.size ?: 0}"

myTextView.text = resultsMessage

It is also possible to use the GenericDocumentWrapper subclasses to work with strongly typed objects and conveniently access the fields of the corresponding document.

To receive an instance of the scanned document wrapper, use GenericDocumentLibrary or the wrap() extension function as follows:

val recognitionResult = genericDocumentRecognizer.scanBitmap(resultBitmap)

val genericDocument = recognitionResult.document
if (genericDocument != null) {
    // Alternatively, use GenericDocumentLibrary.wrapperFromGenericDocument(genericDocument)
    when (val wrapper = genericDocument.wrap()) {
        is DeIdCardFront -> {
            val id = wrapper.id
            val name = wrapper.givenNames
            val surname = wrapper.surname
            val cardAccessNumber = wrapper.cardAccessNumber
        }
        is DeDriverLicenseFront -> {
            val id = wrapper.id
            val name = wrapper.givenNames
            val surname = wrapper.surname
            val categories = wrapper.licenseCategories
        }
        is DeResidencePermitFront -> {
            val id = wrapper.id
            val name = wrapper.givenNames
            val surname = wrapper.surname
            val cardAccessNumber = wrapper.cardAccessNumber
        }
        else -> {
            // Handle other document types
        }
    }
}

Add a Finder Overlay

In addition, we recommend adding a "Finder Overlay". This feature lets you predefine the document area on the ScanbotCameraXView screen. With the overlay in place, the Generic Document scanner can skip the time-consuming "Search for the document area" step and perform the recognition directly within the specified "Finder Overlay" area, so it recognizes and extracts the document content much faster.
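
To give a feel for what such a predefined area looks like, the sketch below computes a centered rectangle with a card-like aspect ratio inside a view. It is purely illustrative and does not use the SDK's finder API: Rect and centeredFinderRect are hypothetical names, and the ratio is that of an ID-1 card (85.60 mm x 53.98 mm, ISO/IEC 7810).

```kotlin
// Plain rectangle type so the sketch runs without Android classes.
data class Rect(val left: Int, val top: Int, val right: Int, val bottom: Int)

// Hypothetical helper: the largest centered rectangle with the given
// width/height ratio that fits inside the view after applying a margin.
fun centeredFinderRect(viewWidth: Int, viewHeight: Int, aspectRatio: Double, margin: Int): Rect {
    require(aspectRatio > 0.0) { "aspectRatio must be positive" }
    val maxWidth = viewWidth - 2 * margin
    val maxHeight = viewHeight - 2 * margin
    require(maxWidth > 0 && maxHeight > 0) { "margin leaves no room for the finder" }

    // Start from the full available width; shrink if that makes it too tall.
    var width = maxWidth
    var height = (width / aspectRatio).toInt()
    if (height > maxHeight) {
        height = maxHeight
        width = (height * aspectRatio).toInt()
    }

    val left = (viewWidth - width) / 2
    val top = (viewHeight - height) / 2
    return Rect(left, top, left + width, top + height)
}

fun main() {
    val id1Ratio = 85.60 / 53.98 // ID-1 card format
    println(centeredFinderRect(1080, 1920, id1Ratio, margin = 40))
}
```

Matching the overlay's aspect ratio to the expected document helps the user align the card, which is what lets the scanner skip the document search step.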

Details about applying finder view logic in the layout and in the code can be found here.
