
Turning document scans into searchable PDF files with the Android OCR Module

You can use the SDK's built-in PDF generator together with the OCR Module to create searchable PDF files, using either an existing Document object or an image file.

Integration

Adding the feature as a dependency

The OCR feature is included in Scanbot SDK Package 2. Add the dependency io.scanbot:sdk-package-2 (or a higher package) to your build.gradle, together with the OCR assets dependency:

implementation("io.scanbot:sdk-package-2:$scanbotSdkVersion")
implementation("io.scanbot:sdk-common-ocr-assets:$scanbotSdkVersion") // <<-- please also add this dependency

Get the latest $scanbotSdkVersion from the changelog.
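
As an illustration of where $scanbotSdkVersion can come from, the fragment below is a minimal sketch of a module-level build.gradle.kts (Kotlin DSL). The value "x.y.z" is a placeholder rather than a real release number, and declaring the version as a local val is only one possible convention, assumed here for brevity.

```kotlin
// build.gradle.kts (module level), sketch only.
// "x.y.z" is a placeholder: replace it with the version listed in the changelog.
val scanbotSdkVersion = "x.y.z"

dependencies {
    // SDK package that contains the OCR feature
    implementation("io.scanbot:sdk-package-2:$scanbotSdkVersion")
    // OCR assets, required in addition to the package itself
    implementation("io.scanbot:sdk-common-ocr-assets:$scanbotSdkVersion")
}
```

Using the same variable for both artifacts keeps the package and the assets on the same version.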

Downloading and providing the OCR language files (only for EngineMode.TESSERACT)

You can find a list of all supported OCR languages and corresponding download links in the Tesseract documentation.

caution

Please download the proper version of the language data files:

Download the files and place them in the assets sub-folder assets/ocr_blobs/ of your app.

Example:

  • assets/ocr_blobs/osd.traineddata (required special data file)
  • assets/ocr_blobs/eng.traineddata (English language file)
  • assets/ocr_blobs/deu.traineddata (German language file)
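
A missing language file is easy to overlook until OCR fails at runtime, so it can help to check the folder contents up front. The following is a minimal sketch in plain Kotlin, not part of the Scanbot SDK: the function name missingOcrFiles is hypothetical, and it only compares file names. In an Android app, the list of present files could be obtained with the standard AssetManager call context.assets.list("ocr_blobs").

```kotlin
/**
 * Returns the expected ".traineddata" file names that are absent from [presentFiles].
 * "osd" is always checked, because osd.traineddata is listed above as a required file.
 * Hypothetical helper for illustration; it is not an SDK function.
 */
fun missingOcrFiles(presentFiles: Collection<String>, languages: List<String>): List<String> {
    val present = presentFiles.toSet()
    return (listOf("osd") + languages)
        .distinct()                       // avoid reporting a file twice
        .map { "$it.traineddata" }        // language code -> expected file name
        .filterNot { it in present }      // keep only the names that are missing
}

fun main() {
    // Names as they would appear inside assets/ocr_blobs/
    val present = listOf("osd.traineddata", "eng.traineddata")
    println(missingOcrFiles(present, listOf("eng", "deu"))) // prints [deu.traineddata]
}
```

Running such a check in a debug build or a unit test makes a forgotten file visible before the OCR feature is first used.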

Initialization

To initialize the Scanbot SDK, call the ScanbotSDKInitializer#initialize(context: Context) method.

In your Application class:

Initialize SDK
loading...

For Tesseract engine users

For EngineMode.TESSERACT, call ScanbotSDKInitializer#prepareOCRLanguagesBlobs(true) during initialization, before the OCR feature is used for the first time.

Then get an instance of the OcrEngine from ScanbotSDK.

In your Activity or Service class:

Create OCR Engine
loading...

For Tesseract engine users

For EngineMode.TESSERACT, to achieve better OCR results, you can enable image binarization in OcrSettings:

Enable Binarization in OCR Settings
loading...

Define the list of languages and set the engine mode to EngineMode.TESSERACT:

Engine Mode Tesseract
loading...

Example code for creating a PDF with an OCR layer from a Document object

If you're working with an image that's already part of a Document object:

Creating a PDF from a Document
loading...

Example code for creating a PDF with an OCR layer from images

If you're working with an image imported from the gallery as a bitmap:

Creating a PDF from images with OCR
loading...

You can omit the PdfConfiguration parameter to use the default PDF settings. In that case PdfConfiguration.default() is used, which has empty PdfAttributes, PageSize.CUSTOM as the page size, and PageDirection.AUTO as the page orientation.

For details on PdfConfiguration and on the OcrResult class, refer to the SDK's API documentation.
