Generic Document Scanner UI Components | Android Document Scanner
Introduction
The Scanbot SDK provides the ability to detect various types of documents in an image, crop them and recognize data fields via the Generic Document Recognizer.
Currently, the Generic Document Recognizer supports the following types of documents:
- German ID Card
- German Passport
- German Driver's license
- German Residence permit
- European Health Insurance Card (EHIC)
The Generic Document scanner is available both as an RTU UI and a classic component (types of components are explained here).
Integration
Take a look at our Example Apps to see how to integrate the Generic Document Scanner.
- Ready-To-Use UI: ready-to-use-ui-demo
- Classic UI Components: Generic Document Recognizer Example For Live Detection or Generic Document Recognizer Example For Auto Snapping
Add Feature as a Dependency
GenericDocumentRecognizer is available with SDK Package 3 (Data Capture Modules). You have to add the following dependencies for it:
implementation("io.scanbot:sdk-package-3:$latestSdkVersion")
implementation("io.scanbot:sdk-genericdocument-assets:$latestSdkVersion")
Please do not run multiple scanners at the same time. For example, do not combine the generic document scanner, health insurance scanner, text data scanner, etc. Each scanner instance requires a lot of memory, GPU, and processor resources, so running several at once will lead to performance issues for the entire application.
Initialize the SDK
The Scanbot SDK must be initialized before use. Add the following code snippet to your Application class:
import io.scanbot.sdk.ScanbotSDKInitializer
...
ScanbotSDKInitializer()
    ...
    .initialize(this)
Unfortunately, we have noticed that devices using a Cortex A53 processor do not support GPU acceleration. If you encounter any problems, please disable GPU acceleration for these devices:
ScanbotSDKInitializer()
    .allowGpuAcceleration(false)
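To apply the setting above only on affected devices, the app needs some way to tell whether a Cortex A53 core is present. The following is a minimal sketch, not part of the Scanbot SDK: it assumes the text of /proc/cpuinfo is available and uses the usual ARM layout, where ARM Ltd. is implementer 0x41 and Cortex-A53 is part 0xd03. The function name hasCortexA53 is hypothetical.

```kotlin
// Illustrative helper, not an SDK API.
// ARM Ltd. implementer code and the Cortex-A53 part number as reported in /proc/cpuinfo.
const val ARM_IMPLEMENTER = 0x41
const val CORTEX_A53_PART = 0xd03

// Parses values such as "0xd03" into an Int; returns null for missing or malformed input.
private fun String?.parseHex(): Int? = this?.removePrefix("0x")?.toIntOrNull(16)

/** Returns true if at least one processor block in the cpuinfo text describes a Cortex-A53 core. */
fun hasCortexA53(cpuInfo: String): Boolean =
    // Processor blocks are separated by blank lines.
    cpuInfo.split(Regex("\\n\\s*\\n")).any { block ->
        val fields = block.lines()
            .filter { ':' in it }
            .associate { it.substringBefore(':').trim() to it.substringAfter(':').trim() }
        fields["CPU implementer"].parseHex() == ARM_IMPLEMENTER &&
            fields["CPU part"].parseHex() == CORTEX_A53_PART
    }
```

The result could then feed the initializer, for example allowGpuAcceleration(!hasCortexA53(cpuInfoText)). Whether /proc/cpuinfo is readable and follows this layout on a given device is an assumption to verify.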
Ready-To-Use UI Component
The Ready-To-Use UI component (activity) responsible for scanning documents supported by the Generic Document Recognizer is GenericDocumentRecognizerActivity.
Have a look at our end-to-end working example of the RTU components usage here.
Starting and configuring RTU Generic Document scanner
First of all, you have to add the SDK package and feature dependencies as described here.
Initialize the SDK as described here. More information about the SDK license initialization can be found here.
To use any of the RTU UI components you need to include the corresponding dependency in your build.gradle file:
implementation("io.scanbot:sdk-package-ui:$scanbotSdkVersion")
Get the latest $scanbotSdkVersion from the Changelog.
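The dependency snippets on this page use two placeholder names, $latestSdkVersion and $scanbotSdkVersion. The sketch below shows how a module-level build.gradle.kts could define the version once and reuse it. It assumes, which the text above does not state explicitly, that all Scanbot artifacts are meant to share the same version. The version string is a deliberate placeholder, not a real release number.

```kotlin
// build.gradle.kts (module level), illustrative sketch only.
// Placeholder: replace with the latest version listed in the Changelog.
val scanbotSdkVersion = "x.y.z"

dependencies {
    // Data Capture Modules package and the generic document assets
    implementation("io.scanbot:sdk-package-3:$scanbotSdkVersion")
    implementation("io.scanbot:sdk-genericdocument-assets:$scanbotSdkVersion")
    // Only needed when using the Ready-To-Use UI components
    implementation("io.scanbot:sdk-package-ui:$scanbotSdkVersion")
}
```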
To start the RTU Generic Document scanner you only have to start a new activity and be ready to process its result later.
Starting from version 1.90.0, the SDK RTU components contain predefined AndroidX Result API contracts. They handle part of the boilerplate for starting the RTU activity component and mapping the result once it finishes.
If your code still relies on Android's deprecated startActivityForResult API, see the other approach we offer for this case. Both approaches are shown below: first the AndroidX Result API, then the old 'startActivityForResult' approach.
val genericDocumentResult: ActivityResultLauncher<GenericDocumentRecognizerConfiguration>
...
genericDocumentResult =
    activity.registerForActivityResultOk(GenericDocumentRecognizerActivity.ResultContract()) { result ->
        val resultWrappers = result.result!!
        val firstWrapper = resultWrappers.first()
        val document = scanbotSDK.resultRepositoryForClass(firstWrapper.clazz).getResultAndErase(firstWrapper.resultId)
        Toast.makeText(
            activity,
            document?.fields?.map { "${it.type.name} = ${it.value?.text}" }.toString(),
            Toast.LENGTH_LONG
        ).show()
    }
...
myButton.setOnClickListener {
    val genericDocumentConfiguration = GenericDocumentRecognizerConfiguration()
    genericDocumentResult.launch(genericDocumentConfiguration)
}
myButton.setOnClickListener {
    val genericDocumentConfiguration = GenericDocumentRecognizerConfiguration()
    val intent = GenericDocumentRecognizerActivity.newIntent(this, genericDocumentConfiguration)
    startActivityForResult(intent, GENERIC_DOCUMENT_RECOGNIZER_DEFAULT_UI)
}
override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
    super.onActivityResult(requestCode, resultCode, data)
    if (requestCode == GENERIC_DOCUMENT_RECOGNIZER_DEFAULT_UI) {
        // Result here is the class nested in the RTU activity, not kotlin.Result
        val result: GenericDocumentRecognizerActivity.Result = GenericDocumentRecognizerActivity.extractResult(resultCode, data)
        if (!result.resultOk) {
            return
        }
        // Get the list of ResultWrapper objects; stop if it is missing
        val resultWrappers = result.result ?: return
        val firstWrapper = resultWrappers.first()
        val document = scanbotSDK.resultRepositoryForClass(firstWrapper.clazz).getResultAndErase(firstWrapper.resultId)
        Toast.makeText(
            activity,
            document?.fields?.map { "${it.type.name} = ${it.value?.text}" }.toString(),
            Toast.LENGTH_LONG
        ).show()
    }
}
We offer some syntactic sugar for handling the result from RTU components via the AndroidX Result API:
- every RTU component's activity contains a Result class which, along with the resultCode value, exposes a Boolean resultOk property. This will be true if resultCode equals Activity.RESULT_OK;
- when you only expect the Activity.RESULT_OK result code, you can use the AppCompatActivity.registerForActivityResultOk extension method instead of registerForActivityResult. It will be triggered only when there is a non-nullable result entity present.
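The two points above can be illustrated with a small, framework-free model. This is a sketch of the idea only: RtuResult and onResultOk are hypothetical stand-ins, not the SDK's Result class or its registerForActivityResultOk extension. The constants mirror the values of Activity.RESULT_OK (-1) and Activity.RESULT_CANCELED (0).

```kotlin
// Stand-ins for android.app.Activity.RESULT_OK and RESULT_CANCELED.
const val RESULT_OK = -1
const val RESULT_CANCELED = 0

// Hypothetical, simplified model of an RTU activity's Result class.
data class RtuResult<T>(val resultCode: Int, val result: T?) {
    // True only when the activity finished with RESULT_OK.
    val resultOk: Boolean get() = resultCode == RESULT_OK
}

// Hypothetical helper: forwards the payload only for successful, non-null results,
// so the callback never has to check the result code or handle null.
fun <T : Any> onResultOk(result: RtuResult<T>, callback: (T) -> Unit) {
    val payload = result.result
    if (result.resultOk && payload != null) callback(payload)
}
```

With this shape, a cancelled scan or an empty result never reaches the callback, which keeps the handler free of result-code checks.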
Always use the corresponding activity's static newIntent method to create the intent when starting an RTU UI activity with the deprecated startActivityForResult approach. Creating an android.content.Intent object via its constructor (passing the activity's class as a parameter) will cause the RTU UI component to malfunction.
An instance of GenericDocumentRecognizerConfiguration is required for starting the RTU UI activity. It allows configuration changes through the methods it exposes:
val genericDocumentConfiguration = GenericDocumentRecognizerConfiguration()
// Apply the color configuration
// (ContextCompat.getColor takes the context as its first argument)
genericDocumentConfiguration.setTopBarButtonsInactiveColor(ContextCompat.getColor(context, R.color.white))
genericDocumentConfiguration.setTopBarBackgroundColor(ContextCompat.getColor(context, R.color.colorPrimaryDark))
...
// Apply the text configuration
genericDocumentConfiguration.setClearButtonTitle(context.getString(R.string.clear_button))
genericDocumentConfiguration.setSubmitButtonTitle(context.getString(R.string.submit_button))
...
// Apply the parameters for fields
genericDocumentConfiguration.setFieldsDisplayConfiguration(
    hashMapOf(
        // Use constants from NormalizedFieldNames objects from the corresponding document type
        DePassport.NormalizedFieldNames.PHOTO to FieldProperties(
            "My passport photo",
            FieldProperties.DisplayState.AlwaysVisible
        ),
        MRZ.NormalizedFieldNames.CHECK_DIGIT to FieldProperties(
            "Check digit",
            FieldProperties.DisplayState.AlwaysVisible
        )
        ...
    )
)
Excluding fields from scanning in RTU UI
It is also possible to exclude certain fields from the scanning process altogether. The recognizer will not even attempt to recognize excluded fields, which is useful for security and/or privacy reasons. All other fields will be scanned as usual. Specify excluded fields ONLY by their normalized field names.
// Exclude some document fields from being recognized
genericDocumentConfiguration.setExcludedFieldTypes(hashSetOf(
    DeIdCardFront.NormalizedFieldNames.PHOTO,
    DeIdCardFront.NormalizedFieldNames.CARD_ACCESS_NUMBER,
    DePassport.NormalizedFieldNames.PHOTO,
    DePassport.NormalizedFieldNames.SIGNATURE,
    DeIdCardBack.NormalizedFieldNames.EYE_COLOR
))
All parameters in GenericDocumentRecognizerConfiguration are optional.
Full API references for these methods can be found on this page.
Handling the result
Below is a simple example of handling the result, which is a List<ResultWrapper<GenericDocument>>.
For simplicity we will take only the first document:
val firstResultWrapper = resultWrappers.first()
Get the ResultRepository from the ScanbotSDK instance. Here, scanbotSDK was created in onCreate via ScanbotSDK(context).
val resultRepository = scanbotSDK.resultRepositoryForClass(firstResultWrapper.clazz)
Receive an instance of the GenericDocument class from the repository. This call will also remove the result from the repository (to optimize the memory usage):
val genericDocument = resultRepository.getResultAndErase(firstResultWrapper.resultId)
If you do not need to remove the result, use resultRepository.getResult(firstResultWrapper.resultId). Please note that this repository does not use persistent storage and is based on an LRU Cache.
The repository will be cleared if the app process is terminated or the memory consumption is too high.
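To make the LRU behaviour described above more concrete, here is a minimal, self-contained sketch of a least-recently-used cache with the same two access styles. It is not the SDK's ResultRepository implementation: the class name LruResultCache, its put method, and the maxEntries limit are hypothetical, and the real repository's eviction policy may differ.

```kotlin
// Illustrative only: a tiny in-memory LRU cache, not the SDK's ResultRepository.
class LruResultCache<T>(private val maxEntries: Int) {
    // accessOrder = true makes the map keep entries ordered from least to most recently used.
    private val cache = object : LinkedHashMap<String, T>(16, 0.75f, true) {
        // Called by LinkedHashMap after every insertion; returning true evicts the eldest entry.
        override fun removeEldestEntry(eldest: MutableMap.MutableEntry<String, T>?): Boolean =
            size > maxEntries
    }

    fun put(resultId: String, value: T) { cache[resultId] = value }

    // Reads the value and keeps it in the cache (it may still be evicted later).
    fun getResult(resultId: String): T? = cache[resultId]

    // Reads the value and removes it immediately, freeing the memory.
    fun getResultAndErase(resultId: String): T? = cache.remove(resultId)
}
```

The practical consequence is the same as in the text: a result that is only read with getResult can disappear later, so code should handle a null return value.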
Now you can simply show the detected document fields in a Toast notification:
Toast.makeText(
    activity,
    genericDocument?.fields?.joinToString("\n") { "${it.type.name} = ${it.value?.text}" } ?: "",
    Toast.LENGTH_LONG
).show()
It is also possible to use the GenericDocumentWrapper successors to work with strongly typed objects and conveniently access the fields of the corresponding document.
To receive an instance of the scanned document wrapper, use GenericDocumentLibrary or the wrap() extension function on the GenericDocument instance, in the same way as in the classic component example further below.
Full API references for the result wrapper class are available here and for GenericDocument class here.
Classic component
To integrate the classic component of the Generic Document Scanner you can take a look at our Generic Document Recognizer Example For Live Detection, Generic Document Recognizer Example For Auto Snapping or check the following step-by-step integration instructions.
GenericDocumentRecognizer can be used both in conjunction with ScanbotCameraXView (e.g. live detection for preview) and by itself for detection on a Bitmap or JPEG byte array. Let's have a look at an example with ScanbotCameraXView.
Add feature dependencies and initialize the SDK
First of all, you have to add the SDK package and feature dependencies as described here.
Initialize the SDK as described here. More information about the SDK license initialization can be found here.
Add ScanbotCameraXView to layout
<io.scanbot.sdk.ui.camera.ScanbotCameraXView
    android:id="@+id/camera_view"
    android:layout_width="match_parent"
    android:layout_height="match_parent" />
Get GenericDocumentRecognizer instance from ScanbotSDK, set the required document types and blurriness acceptance score, then attach it to ScanbotCameraXView
val scanbotSdk = ScanbotSDK(this)
// Please note that each call to this method will create a new instance of GenericDocumentRecognizer
// It should be used on a "single instance per screen" basis
val genericDocumentRecognizer = scanbotSdk.createGenericDocumentRecognizer()
genericDocumentRecognizer.acceptedSharpnessScore = 80f
// Uncomment to scan only ID cards and passports
// genericDocumentRecognizer.acceptedDocumentTypes = listOf(
//     RootDocumentType.DePassport,
//     RootDocumentType.DeIdCardFront,
//     RootDocumentType.DeIdCardBack
// )
// Uncomment to scan only Driver's licenses
// genericDocumentRecognizer.acceptedDocumentTypes = listOf(
//     RootDocumentType.DeDriverLicenseFront,
//     RootDocumentType.DeDriverLicenseBack
// )
// Uncomment to scan only Residence permit cards
// genericDocumentRecognizer.acceptedDocumentTypes = listOf(
//     RootDocumentType.DeResidencePermitFront,
//     RootDocumentType.DeResidencePermitBack
// )
// Uncomment to scan only back side of European health insurance cards
// genericDocumentRecognizer.acceptedDocumentTypes = listOf(
//     RootDocumentType.EuropeanHealthInsuranceCard
// )
// Uncomment to scan only front side of German health insurance cards
// genericDocumentRecognizer.acceptedDocumentTypes = listOf(
//     RootDocumentType.DeHealthInsuranceCardFront
// )
// To scan all the supported document types (default value)
genericDocumentRecognizer.acceptedDocumentTypes = RootDocumentType.ALL_TYPES
val frameHandler = GenericDocumentRecognizerFrameHandler.attach(cameraView, genericDocumentRecognizer)
Excluding fields from scanning for genericDocumentRecognizer
It is also possible to exclude certain fields from the scanning process altogether. The recognizer will not even attempt to recognize excluded fields, which is useful for security and/or privacy reasons. All other fields will be scanned as usual. Specify excluded fields ONLY by their normalized field names.
// Exclude some document fields from being recognized
genericDocumentRecognizer.excludedFieldTypes = setOf(
    DeIdCardFront.NormalizedFieldNames.PHOTO,
    DeIdCardFront.NormalizedFieldNames.CARD_ACCESS_NUMBER,
    DePassport.NormalizedFieldNames.PHOTO,
    DePassport.NormalizedFieldNames.SIGNATURE,
    DeIdCardBack.NormalizedFieldNames.EYE_COLOR
)
Add a result handler for GenericDocumentRecognizerFrameHandler
Add a result handler to the frame handler which, for example, tracks consecutive successful recognition statuses and shows a toast notification whenever two or more such statuses are received in a row.
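The counting logic behind such a handler can be sketched independently of the SDK. The snippet below is an assumption-laden illustration, not the original example from this page: FrameStatus and ConsecutiveSuccessCounter are hypothetical names, and in a real handler the status would be derived from the GenericDocumentRecognitionResult passed to handle(...).

```kotlin
// Hypothetical stand-in for the per-frame recognition outcome.
enum class FrameStatus { SUCCESS, FAILURE }

// Tracks the current run of successful frames and reports when it is long enough.
class ConsecutiveSuccessCounter(private val threshold: Int = 2) {
    private var streak = 0

    /** Returns true when the current run of successes has reached the threshold. */
    fun onFrame(status: FrameStatus): Boolean {
        // Any non-successful frame resets the run.
        streak = if (status == FrameStatus.SUCCESS) streak + 1 else 0
        return streak >= threshold
    }
}
```

Inside the result handler, a true return value would be the point at which to show the toast. Requiring two frames in a row is a simple way to reduce reactions to a single spurious detection.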
The method handle(result: FrameHandlerResult<GenericDocumentRecognitionResult, SdkLicenseError>) will be triggered every time GenericDocumentRecognizer detects a document in the camera preview frame or if a license error has occurred.
If the result of the scanning was successful, the user gets the GenericDocumentRecognitionResult object which contains a cropped document image and a GenericDocument object.
Each field is represented by the Field class, holding the field's type, cropped visual source, recognized text and confidence level value.
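Since each field carries both recognized text and a confidence value, one common follow-up step is to keep only the fields that were read reliably. The sketch below uses a hypothetical, simplified ScannedField model rather than the SDK's Field class. The 0.0 to 1.0 confidence scale and the threshold are assumptions to check against the API reference, and the sample values used to try it out are made up.

```kotlin
// Hypothetical, simplified model of a recognized field; the SDK's Field class is richer.
data class ScannedField(val typeName: String, val text: String?, val confidence: Double)

// Returns a map of field type name to text for fields that have text
// and whose confidence meets the given threshold.
fun confidentFields(fields: List<ScannedField>, minConfidence: Double): Map<String, String> =
    fields.mapNotNull { field ->
        field.text
            ?.takeIf { field.confidence >= minConfidence }
            ?.let { field.typeName to it }
    }.toMap()
```

Fields that fall below the threshold could, for instance, be shown to the user for manual confirmation instead of being accepted automatically.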
You can now run your app and should see a simple camera preview that can scan your documents.
Pass snapped picture to GenericDocumentRecognizer, process results
First, decode the image ByteArray obtained from the camera's callback, taking into account the image orientation. Our ImageProcessor component can be used for this:
val resultBitmap = ImageProcessor(image).rotate(imageOrientation).processedBitmap()
Next, we perform a recognition:
val recognitionResult = genericDocumentRecognizer.scanBitmap(resultBitmap)
As an example of further use, show the obtained scan details in a TextView:
val myTextView = findViewById<TextView>(R.id.my_text_view)
// document may be null when nothing was recognized, so access it with safe calls
val resultsMessage = "Recognition results:\n" +
    "Recognition status: ${recognitionResult.status}\n" +
    "Card type: ${recognitionResult.document?.type}\n" +
    "Number of fields scanned: ${recognitionResult.document?.fields?.size ?: 0}"
myTextView.text = resultsMessage
It is also possible to use the GenericDocumentWrapper successors to work with strongly typed objects and conveniently access the fields of the corresponding document.
To receive an instance of the scanned document wrapper, use GenericDocumentLibrary or the wrap() extension function as follows:
val recognitionResult = genericDocumentRecognizer.scanBitmap(resultBitmap)
val genericDocument = recognitionResult.document
if (genericDocument != null) {
    // Alternatively, use GenericDocumentLibrary.wrapperFromGenericDocument(genericDocument)
    when (val wrapper = genericDocument.wrap()) {
        is DeIdCardFront -> {
            val id = wrapper.id
            val name = wrapper.givenNames
            val surname = wrapper.surname
            val cardAccessNumber = wrapper.cardAccessNumber
        }
        is DeDriverLicenseFront -> {
            val id = wrapper.id
            val name = wrapper.givenNames
            val surname = wrapper.surname
            val categories = wrapper.licenseCategories
        }
        is DeResidencePermitFront -> {
            val id = wrapper.id
            val name = wrapper.givenNames
            val surname = wrapper.surname
            val cardAccessNumber = wrapper.cardAccessNumber
        }
        else -> {
            // Handle other document types
        }
    }
}
Add a Finder Overlay
In addition, it is recommended to add a "Finder Overlay". This feature allows you to predefine a document area on top of the ScanbotCameraXView preview. With this overlay, the Generic Document scanner can skip the time-consuming "Search for the document area" step and perform the recognition directly in the specified "Finder Overlay" area. As a result, it recognizes and extracts the document content much faster.
Details about applying finder view logic in the layout and in the code can be found here.