Data Scanner | Android Document Scanner
Introduction
The Scanbot SDK can perform text recognition directly on the camera frames.
As the result of scanning, the user gets a GenericTextRecognitionResult instance, which contains the raw text extracted from the frame and the symbol boxes.
The Data Scanner is available both as an RTU UI and as a classic component (types of components are explained here).
Integration
Take a look at our Example Apps to see how to integrate the Data Scanner.
- Ready-To-Use UI: ready-to-use-ui-demo
- Classic UI Components: generic-text-recognizer
Add Feature as a Dependency
The Data Scanner SDK is available with SDK Package 2 (Data Capture Modules). You have to add the following dependencies for it:
implementation("io.scanbot:sdk-package-2:$latestSdkVersion")
implementation("io.scanbot:sdk-generictext-assets:$latestSdkVersion")
Please do not use multiple scanners at the same time. For example, do not combine the generic document scanner, health insurance scanner, text data scanner, etc. Each scanner instance requires a lot of memory, GPU, and processor resources, so running several at once leads to performance issues for the entire application.
Initialize the SDK
Add the OCR training data file (.traineddata) for the desired language to the assets. See Optical Character Recognition.
Add a call to the .prepareOCRLanguagesBlobs(true) method in ScanbotSDKInitializer.
override fun onCreate() {
    super.onCreate()
    ScanbotSDKInitializer()
        .license(this, licenseKey)
        // TODO: other configuration calls
        .prepareOCRLanguagesBlobs(true)
        .initialize(this)
}
Ready-To-Use UI Component
The Ready-To-Use UI component (activity) responsible for scanning raw text is TextDataScannerActivity.
Have a look at our end-to-end working example of the RTU components usage here.
Starting and configuring RTU UI Data scanner
First of all, you have to add the SDK package and feature dependencies as described here.
Initialize the SDK as described here. More information about the SDK license initialization can be found here.
To use any of the RTU UI components, you need to include the corresponding dependency in your build.gradle file:
implementation("io.scanbot:sdk-package-ui:$scanbotSdkVersion")
Get the latest $scanbotSdkVersion from the Changelog.
To start the RTU UI Data scanner you only have to start a new activity and be ready to process its result later.
Starting from version 1.90.0, the SDK RTU components contain predefined AndroidX Result API contracts. They handle part of the boilerplate for starting the RTU activity component and mapping the result once it finishes.
If your code still relies on Android's deprecated startActivityForResult API, check the other approach we offer for this case.
AndroidX Result API:
val gtrResult: ActivityResultLauncher<TextDataScannerActivity.InputParams>
...
gtrResult = activity.registerForActivityResult(TextDataScannerActivity.ResultContract()) { result ->
    if (result.resultOk) {
        // TODO: here you can add the result handling
    }
}
...
myButton.setOnClickListener {
    val configuration = TextDataScannerConfiguration()
    val step = TextDataScannerStep("Step1Tag", "Step1Title", "Scan a number")
    val input = TextDataScannerActivity.InputParams(configuration, step)
    gtrResult.launch(input)
}
Old 'startActivityForResult' approach:
myButton.setOnClickListener {
    val configuration = TextDataScannerConfiguration()
    val step = TextDataScannerStep("Step1Tag", "Step1Title", "Scan a number")
    val intent = TextDataScannerActivity.newIntent(context, configuration, step)
    startActivityForResult(intent, GTR_REQUEST_CODE_CONSTANT)
}
override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
    super.onActivityResult(requestCode, resultCode, data)
    if (requestCode == GTR_REQUEST_CODE_CONSTANT) {
        val resultEntity: TextDataScannerActivity.Result = TextDataScannerActivity.extractResult(resultCode, data)
        if (resultEntity.resultOk) {
            // TODO: here you can add the result handling
        }
    }
}
We offer some syntactic sugar for handling the result from RTU components via the AndroidX Result API:
- every RTU component's activity contains a Result class which, along with the resultCode value, exposes a Boolean resultOk property. This will be true if resultCode equals Activity.RESULT_OK;
- when you only expect the Activity.RESULT_OK result code, you can use the AppCompatActivity.registerForActivityResultOk extension method instead of registerForActivityResult. It will be triggered only when there is a non-nullable result entity present.
Always use the corresponding activity's static newIntent method to create the intent when starting the RTU UI activity using the deprecated startActivityForResult approach. Creating an android.content.Intent object using its constructor (passing the activity's class as a parameter) will lead to the RTU UI component malfunctioning.
As its first input parameter, the activity takes an instance of TextDataScannerConfiguration. It is required for starting the RTU UI activity and exposes methods for changing the configuration:
val configuration = TextDataScannerConfiguration()
configuration.setTopBarBackgroundColor(ContextCompat.getColor(context, R.color.colorPrimaryDark))
configuration.setTopBarButtonsColor(ContextCompat.getColor(context, R.color.greyColor))
All parameters in TextDataScannerConfiguration are optional.
Full API references for these methods can be found on this page.
As its second input parameter, the activity takes an instance of TextDataScannerStep. It is required for defining the data scanning flow.
TextDataScannerStep has mandatory parameters!
TextDataScannerStep contains a set of parameters that specify the scanning and result validation processes and the UI user guidance:
- stepTag - mandatory field. The tag of the scanning step. It will be indicated in the result.
- title - mandatory field. The title for a value.
- guidanceText - mandatory field. User guidance for the step.
- pattern - sets a validation pattern (supports ? - any character, # - any digit; all other characters represent themselves). An empty string or null will disable the validation.
- shouldMatchSubstring - find and match only part of the whole string if the pattern is used.
- validationCallback - a callback for text validation.
- cleanRecognitionResultCallback - a callback to clean the recognized string prior to validation.
- preferredZoom - the digital zoom level required for this step.
- aspectRatio - the aspect ratio of the finder view for the step.
- unzoomedFinderHeight - the default height of the finder (for zoom level 1.0).
- allowedSymbols - the symbols allowed to be passed to the result. "My scanned string" with allowedSymbols = setOf('M', 'y', 'c', 'a', 'n', 'd', 't', 'r', 'i', 'g') will result in "Mycanndtring".
- textFilterStrategy - an additional parameter to set the type of scanned object. The default is TextFilterStrategy.Document.
- significantShakeDelay - specify this value in milliseconds. Detection will be paused after significant movement. -1 is the default value (disabled). Default = 0 ms for TextFilterStrategy.Document or 1000 ms for TextFilterStrategy.LcdDotMatrixDisplay.
Full API references for these parameters can be found on this page.
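The pattern, shouldMatchSubstring and allowedSymbols rules listed above can be tried out in plain Kotlin. The following is only a sketch of the documented behavior, not the SDK's implementation: the helper names patternToRegex, matchesPattern and keepAllowedSymbols are hypothetical and do not exist in the Scanbot SDK.

```kotlin
// Illustration of the documented rules only; none of these helpers are part of the Scanbot SDK.

/** Builds a Regex from a step pattern: '?' = any character, '#' = any digit, anything else is literal. */
fun patternToRegex(pattern: String): Regex {
    val body = buildString {
        for (ch in pattern) {
            when (ch) {
                '?' -> append(".")
                '#' -> append("\\d")
                else -> append(Regex.escape(ch.toString()))
            }
        }
    }
    return Regex(body)
}

/**
 * Returns true if [text] satisfies [pattern].
 * A null or empty pattern disables validation, so every text passes.
 * With [matchSubstring] = true, a match anywhere inside [text] is enough.
 */
fun matchesPattern(text: String, pattern: String?, matchSubstring: Boolean = false): Boolean {
    if (pattern.isNullOrEmpty()) return true
    val regex = patternToRegex(pattern)
    return if (matchSubstring) regex.containsMatchIn(text) else regex.matches(text)
}

/** Drops every character of [text] that is not contained in [allowed]. */
fun keepAllowedSymbols(text: String, allowed: Set<Char>): String =
    text.filter { it in allowed }
```

With these helpers, matchesPattern("0123 123456", "#### ######") is true, while matchesPattern("ID 0123 123456", "#### ######") is false unless matchSubstring = true is passed. keepAllowedSymbols reproduces the "Mycanndtring" example from the list above.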
Handling the result
As the result, users will receive a TextDataScannerActivity.Result instance, which aggregates the activity resultCode, a license validation flag, and result: List<TextDataScannerStepResult>? (a list of step results).
Each TextDataScannerStepResult will contain:
- tag - the tag of the scanning step. The same as it was in the step configuration.
- text - the validated result of the scan.
- confidence - the confidence of the recognized text.
Full API references for these methods can be found on this page.
So, as a simple example, we can show the scanned text in a Toast notification:
if (result.resultOk) {
    Toast.makeText(context, result.result!!.first().text, Toast.LENGTH_LONG).show()
}
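Because the result list is nullable, a null-safe lookup may be preferable to the !! operator in production code. The sketch below shows one possible way to pick the text for a given step tag. StepResult is a stand-in type declared here only so the example compiles on its own; it mirrors the tag, text and confidence fields described above. The Double type of confidence, the textForStep helper and the minConfidence threshold are assumptions, not SDK API.

```kotlin
// Stand-in for the SDK's TextDataScannerStepResult so that the sketch is self-contained.
// The confidence type (Double) is an assumption.
data class StepResult(val tag: String, val text: String, val confidence: Double)

/**
 * Returns the text scanned for [stepTag], or null if the list is missing,
 * contains no entry with that tag, or the entry is below [minConfidence].
 */
fun textForStep(
    results: List<StepResult>?,
    stepTag: String,
    minConfidence: Double = 0.0
): String? =
    results
        ?.firstOrNull { it.tag == stepTag }
        ?.takeIf { it.confidence >= minConfidence }
        ?.text
```

A caller could then show the Toast only when textForStep(...) returns a non-null value, which avoids a crash if the activity finishes without step results.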
Classic component
Try our Data Scanner SDK App or check the following step-by-step integration instructions.
GenericTextRecognizer can be conveniently used in conjunction with ScanbotCameraXView (e.g. live detection). Let's have a look at an example with ScanbotCameraXView.
Add feature dependencies and initialize the SDK
First of all, you have to add the SDK package and feature dependencies as described here.
Initialize the SDK as described here. More information about the SDK license initialization can be found here.
Add ScanbotCameraXView to layout
<io.scanbot.sdk.ui.camera.ScanbotCameraXView
    android:id="@+id/camera_view"
    android:layout_width="match_parent"
    android:layout_height="match_parent" />
Get GenericTextRecognizer instance from ScanbotSDK and attach it to ScanbotCameraXView
val scanbotSdk = ScanbotSDK(this)
val textRecognizer = scanbotSdk.createGenericTextRecognizer()
val textRecognizerFrameHandler = GenericTextRecognizerFrameHandler.attach(cameraView, textRecognizer)
Set up the needed config for the Generic Text Recognizer instance
// will pass all the strings in the format "0123 123456"
textRecognizer.setValidator("#### ######")
Add a result handler for GenericTextRecognizerFrameHandler
textRecognizerFrameHandler.addResultHandler(object : GenericTextRecognizerFrameHandler.ResultHandler {
    override fun handle(result: FrameHandlerResult<GenericTextRecognitionResult, SdkLicenseError>): Boolean {
        if (result is FrameHandlerResult.Success && result.value.validationSuccessful) {
            // NOTE: 'handle' method runs in background thread - don't forget to switch to main before touching any Views
            runOnUiThread {
                proceedToResult(result.value.rawText)
            }
            return true
        }
        return false
    }
})
Improve the quality and performance of the recognition by setting custom cleaner and validation callbacks and changing options
// NOTE: genericTextScanner is the GenericTextRecognizer instance obtained from ScanbotSDK.
// CUSTOM VALIDATION FUNCTION in addition to a pattern:
genericTextScanner.setValidator("######", object : GenericTextRecognizer.GenericTextValidationCallback {
    override fun validate(text: String): Boolean {
        // Guard against an empty string before checking the first character
        return text.isNotEmpty() && text.first() in listOf('1', '2') // TODO: add additional validation for the recognized text
    }
})
// CUSTOM CLEANER FUNCTION.
// If the string you intend on scanning is not clearly separated from other parts of the text
// then enable this setting. This will only work with 'pattern' variable from the validator:
genericTextScanner.matchSubstringForPattern = true
// As an alternative it is possible to extract the required text from the raw scanned text manually
// using a Cleaner. The effective implementation of this function might significantly improve the speed
// of scanning
genericTextScanner.setCleaner(object : GenericTextRecognizer.CleanRecognitionResultCallback {
    override fun process(rawText: String): String {
        return extractValuableDataFromText(rawText)
    }
})
// Set needed supported languages (it is required to add needed blobs to assets)
genericTextScanner.supportedLanguages = setOf(Language.ENG, Language.DEU)
// Set which symbols are supported by recognizer
genericTextScanner.allowedSymbols = setOf('a', 'b', 'c')
// These parameters allow customizing the performance and quality of recognition. The default values mean that,
// to return a result from the recognizer, it is required that 2 of the 3 latest scanned frames contain
// the same recognized result
genericTextScanner.minimumNumberOfRequiredFramesWithEqualRecognitionResult = 2 // (default is 2)
genericTextScanner.maximumNumberOfAccumulatedFrames = 3 // (default is 3)
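Two ideas from the snippet above can be illustrated in plain Kotlin: a cleaner that extracts the relevant part of the raw text, and the "2 of the 3 latest frames" agreement rule. This is a rough sketch under stated assumptions, not the SDK's internal logic. extractSixDigitCode is one hypothetical way to implement a helper like extractValuableDataFromText for a six-digit value, and FrameAgreement is a hypothetical class that only models the rule as the comments describe it.

```kotlin
/**
 * Hypothetical cleaner: returns the first run of exactly six digits in [rawText],
 * or an empty string if there is none. Longer digit runs are ignored.
 */
fun extractSixDigitCode(rawText: String): String =
    Regex("(?<!\\d)\\d{6}(?!\\d)").find(rawText)?.value ?: ""

/**
 * Models the "N of the last M frames agree" rule with a sliding window.
 * The defaults match the values named in the comments above (2 of 3).
 */
class FrameAgreement(
    private val requiredEqual: Int = 2,
    private val windowSize: Int = 3
) {
    private val window = ArrayDeque<String>()

    /**
     * Adds the text recognized in one frame. Returns that text once it occurs
     * at least [requiredEqual] times among the last [windowSize] frames, else null.
     */
    fun offer(text: String): String? {
        window.addLast(text)
        if (window.size > windowSize) window.removeFirst()
        return if (window.count { it == text } >= requiredEqual) text else null
    }
}
```

In this model, raising the required count makes a result more stable at the cost of more frames. A cleaner that narrows the raw text early means fewer differing candidates reach the agreement check.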
Visualize the scanning process
The Scanbot SDK provides an additional custom view that helps to visualize the text scanning process - WordboxPreviewView.
It highlights the words in the camera preview which were recognized by GenericTextRecognizer.
To enable this functionality, you have to add WordboxPreviewView as a child view in ScanbotCameraXView:
<io.scanbot.sdk.ui.camera.ScanbotCameraXView
    android:id="@+id/cameraView"
    android:layout_width="match_parent"
    android:layout_height="match_parent">
    <io.scanbot.sdk.generictext.ui.WordboxPreviewView
        android:id="@+id/wordbox_preview_view"
        android:layout_width="match_parent"
        android:layout_height="match_parent" />
</io.scanbot.sdk.ui.camera.ScanbotCameraXView>
Then bind WordboxPreviewView with ScanbotCameraXView and GenericTextRecognizerFrameHandler:
cameraView.addFrameHandler(object : FrameHandler {
    override fun handleFrame(previewFrame: FrameHandler.Frame): Boolean {
        binding.wordboxPreviewView.frameWidth = previewFrame.width
        binding.wordboxPreviewView.frameHeight = previewFrame.height
        binding.wordboxPreviewView.frameOrientation = previewFrame.frameOrientation
        return false
    }
})
val genericTextRecognizerFrameHandler = GenericTextRecognizerFrameHandler.attach(cameraView, genericTextScanner)
genericTextRecognizerFrameHandler.addResultHandler { result ->
    // `wordboxPreviewView.updateCharacters(...)` triggers the update of the UI, so it should be called from the UI thread
    runOnUiThread {
        wordboxPreviewView.updateCharacters(
            when (result) {
                is FrameHandlerResult.Success -> {
                    result.value.wordBoxes
                }
                else -> listOf()
            }
        )
    }
    false
}