Import Images from PDF | Android Document Scanner

With the Scanbot SDK Document Scanner, you can render pages from a PDF file as separate images and extract them.

First, initialize the SDK and obtain an instance of PdfPagesExtractor:

 val pdfExtractor = ScanbotSdk(context).createPdfPagesExtractor()

This class handles page extraction from a PDF file.

To import the pages of a PDF file as SDK pages (returned as a list of page IDs):

 val pdfExtractor = ScanbotSdk(context).createPdfPagesExtractor()
 val listOfPages = pdfExtractor.pagesFromPdf(file)
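Since the signature listed in this document declares that pagesFromPdf can throw an IOException, it may help to wrap the call. The following is a minimal sketch, not part of the Scanbot SDK: extractPageIdsOrNull is a hypothetical helper name, and the extraction call is passed in as a lambda so the snippet stays independent of SDK imports.

```kotlin
import java.io.File
import java.io.IOException

/**
 * Hypothetical helper (not part of the Scanbot SDK): runs [extract] on [pdfFile]
 * and returns null instead of throwing when the file is missing, unreadable,
 * or an IOException occurs during extraction.
 */
fun extractPageIdsOrNull(pdfFile: File, extract: (File) -> List<String>): List<String>? {
    // Fail fast if the path does not point to a readable regular file.
    if (!pdfFile.isFile || !pdfFile.canRead()) return null
    return try {
        extract(pdfFile)
    } catch (e: IOException) {
        // IOException is declared by the documented signature; report it and return null.
        System.err.println("PDF import failed: ${e.message}")
        null
    }
}
```

A call could then look like `val pageIds = extractPageIdsOrNull(file) { pdfExtractor.pagesFromPdf(it) }`. Any exception that is not an IOException still propagates to the caller.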

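Alternatively, you can render the pages to image files in an output directory and get their URIs. This call also requires an output directory and a file name prefix:

 val pdfExtractor = ScanbotSdk(context).createPdfPagesExtractor()
 val imageUris = pdfExtractor.imageUrlsFromPdf(file, outputDir, prefix)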
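According to the signature listed in this document, imageUrlsFromPdf takes an outputDir and a prefix in addition to the PDF file. The document does not say whether the SDK creates a missing output directory, so the sketch below prepares one up front. prepareOutputDir and the default folder name "pdf_images" are hypothetical and not part of the Scanbot SDK.

```kotlin
import java.io.File
import java.io.IOException

/**
 * Hypothetical helper (not part of the Scanbot SDK): returns the sub-directory
 * [name] of [baseDir], creating it if it does not exist yet.
 */
fun prepareOutputDir(baseDir: File, name: String = "pdf_images"): File {
    val dir = File(baseDir, name)
    // mkdirs() returns false when the directory already exists, so check isDirectory first.
    if (!dir.isDirectory && !dir.mkdirs()) {
        throw IOException("Could not create output directory: ${dir.absolutePath}")
    }
    return dir
}
```

On Android, the base directory could be `context.cacheDir`, for example `val outputDir = prepareOutputDir(context.cacheDir)`, followed by `pdfExtractor.imageUrlsFromPdf(file, outputDir, "page")`.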

Here are the full signatures of both extractor methods, with all of their parameters:

class PdfPagesExtractor {
    /**
     * Converts a PDF document to separate image files as pages with the given options.
     * If SDK encryption is enabled, all pages WILL be encrypted.
     * @param pdfFile input PDF document
     * @param scaling scale factor applied to the image size. 2 by default
     * @param bitmapConfig bitmap config to use. ARGB_8888 by default
     * @param cancelCallback callback that can cancel the operation during the extraction of pages
     * @param progressCallback callback that receives the number of pages already processed during the extraction of pages
     * @return list of imported page IDs
     * @throws PdfImportException if the PDF file is not available
     */
    @Throws(IOException::class, OperationCancelledException::class)
    fun pagesFromPdf(
        pdfFile: File,
        scaling: Float = DEFAULT_SCALING,
        bitmapConfig: Bitmap.Config = Bitmap.Config.ARGB_8888,
        cancelCallback: LongOperationCancelCallback? = null,
        progressCallback: ProgressCallback? = null
    ): List<String>

    /**
     * Converts a PDF document to separate image files with the given options.
     * If SDK encryption is enabled, all images WILL be encrypted.
     * @param pdfFile input PDF document
     * @param outputDir directory where the output files will be stored
     * @param prefix prefix for output image files, naming will be <prefix>_<page_number>.<compress_format>
     * @param compression bitmap compress format to use. JPEG by default
     * @param quality compression quality. 90 by default
     * @param scaling scale factor applied to the image size. 2 by default
     * @param bitmapConfig bitmap config to use. ARGB_8888 by default
     * @param cancelCallback callback that can cancel the operation during the extraction of pages
     * @param progressCallback callback that receives the number of pages already processed during the extraction of pages
     * @return list of URIs of the output image files (JPEG by default)
     * @throws PdfImportException if the PDF file is not available
     */
    @Throws(IOException::class, OperationCancelledException::class)
    fun imageUrlsFromPdf(
        pdfFile: File,
        outputDir: File,
        prefix: String,
        compression: Bitmap.CompressFormat = Bitmap.CompressFormat.JPEG,
        quality: Int = DEFAULT_COMPRESSION_QUALITY,
        scaling: Float = DEFAULT_SCALING,
        bitmapConfig: Bitmap.Config = Bitmap.Config.ARGB_8888,
        cancelCallback: LongOperationCancelCallback? = null,
        progressCallback: ProgressCallback? = null
    ): List<Uri>
}
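The scaling and bitmapConfig parameters both affect how much memory each rendered page needs. The sketch below estimates that cost under a stated assumption: that the output size equals the page size in PDF points multiplied by scaling, which the documentation above does not confirm. RenderEstimate and estimateRender are hypothetical names and not part of the Scanbot SDK.

```kotlin
import kotlin.math.roundToInt

/** Pixel size and approximate in-memory footprint of one rendered page. */
data class RenderEstimate(val widthPx: Int, val heightPx: Int, val bytes: Long)

/**
 * Hypothetical helper (not part of the Scanbot SDK).
 * Assumes the rendered size is the page size in PDF points times [scaling].
 * [bytesPerPixel] is 4 for Bitmap.Config.ARGB_8888 and 2 for Bitmap.Config.RGB_565.
 */
fun estimateRender(
    widthPt: Float,
    heightPt: Float,
    scaling: Float = 2f,
    bytesPerPixel: Int = 4
): RenderEstimate {
    val widthPx = (widthPt * scaling).roundToInt()
    val heightPx = (heightPt * scaling).roundToInt()
    // Use Long arithmetic so large pages or scaling factors do not overflow Int.
    return RenderEstimate(widthPx, heightPx, widthPx.toLong() * heightPx * bytesPerPixel)
}
```

For an A4 page (about 595 x 842 points) with the default values, `estimateRender(595f, 842f)` gives 1190 x 1684 pixels and roughly 8 MB per page in memory; with `bytesPerPixel = 2` the estimate halves to about 4 MB.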
