Document extraction

The purpose of document extraction is to read record data from documents such as invoices and sales orders. In this action, the software renders the document to make it machine-readable, the IRISXtract logic extracts the data, and the resulting index data is transferred to xSuite Bus.

The new PDF 2.0 standard is supported both for data extraction and for use with GdPicture (e.g., for rendering the web Viewer images).

The following settings are available for configuration of the data-extraction actions:

Element	Parameter	Description
Common	Provider	Provider of the data extraction solution. At the moment, only FPS is available as a provider.
	Input Document File Type	The format to be used for incoming documents is specified here. The options are PDF or TIFF.
	Page No From / To	Here, you can specify which page or pages (from page... / to page...) are to be used for data extraction. To keep processing time short, a maximum of five pages are read by default.
	Input Catalog	Input field catalog for data extraction. This catalog is set up in xSuite Bus and is used for mapping the associated field values. In the standard version, this is the Scan catalog.
	Output Catalog	Output field catalog for data extraction. This catalog is set up in xSuite Bus and is used for the field mapping of the data-extraction results. In the standard version, this is the InvoiceLegalRule catalog.
IRIS Extraction	Toolkit Major Version	Shows the IDR toolkit version.
	License Service Uri	Connection to the IRIS License Web Service, which manages the I.R.I.S. AG licenses. The license information can be seen in Customer Clients → Licensing.
	License Tenant	Name of the license to be used for the data extraction. It must also be set up on the IRIS License server. This allows different licenses to be assigned to different users.
	Project Scope	Name of the data-extraction project that is to be used. Notice: When using multiple IRISXtract actions, a different value must be specified in the Project Scope field for each action.
	Project Path	Folder containing the data-extraction projects. Unless the path was adjusted in the setup, the installation places the projects in `...\xSuiteData\xSuiteBusPrism\Services\DocExtraction\IRIS Xtract\Projects\`.
	Backup Path	File path where the data-extraction results are kept for later inspection. Data is stored there until it is deleted by the xSuite Bus System Clean-Up Service.
	Work Path	File path for the data-extraction results while they are being determined. The contents are deleted after the extracted data has been transferred.
	IDR Subprocess Termination	Resource-saving processing of data extraction. If the Terminate subprocesses after document processing checkbox is activated, the subprocesses required for document processing are reloaded and terminated each time a document is processed. This helps to conserve resources and ensure system stability. Notice: This checkbox is only relevant for IDR Toolkit version 5.1 or higher.
	Compile Master Data	Manual compilation of the master data with the IRIS data-extraction project. For more information, see Compiling Master Data.
	Open IDR Toolkit Demo	Here, you can load demo software made available by I.R.I.S. AG. You can use this software to test the quality of the master data and the OCR results. The file path of this software must be specified in the xSuite Bus settings under Settings → Files → IDR Toolkit Demo.
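
Some of the rules in the table above lend themselves to a quick consistency check, for example before rolling out several data-extraction actions. The following Python sketch is purely illustrative: the `ExtractionAction` class, its field names, and `validate_actions` are hypothetical and are not part of xSuite Bus or the IDR toolkit. It also assumes that the five-page default means pages 1 to 5.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionAction:
    """Hypothetical stand-in for one configured data-extraction action."""
    name: str
    project_scope: str
    file_type: str = "PDF"   # "PDF" or "TIFF", as listed in the table
    page_from: int = 1
    page_to: int = 5         # assumption: the five-page default read as pages 1 to 5


def validate_actions(actions):
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    for action in actions:
        if action.file_type not in ("PDF", "TIFF"):
            problems.append(
                f"{action.name}: unsupported file type {action.file_type!r}")
        if action.page_from < 1 or action.page_to < action.page_from:
            problems.append(
                f"{action.name}: invalid page range "
                f"{action.page_from}-{action.page_to}")
    # Each action needs its own Project Scope value.
    counts = Counter(action.project_scope for action in actions)
    for scope, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"Project Scope {scope!r} is used by {n} actions")
    return problems
```

Calling `validate_actions` with two actions that share the same Project Scope value returns one problem entry for that scope, which mirrors the notice in the Project Scope row.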

Compiling Master Data

The OCR results of the IRIS data-extraction project depend greatly on the quality of the master data. This master data is adopted into the project structure regularly, at intervals that can be planned.

Normally, compilation of the master data is planned as an xSuite Bus task and then takes place when the server’s load is minimal.

Compile Master Data can also be executed manually when new master data is added outside the planned schedule.

The xSuite Group recommends the following procedure for compilation, which has proven itself as a best practice:

Update the master data with current information from the target system, e.g., vendor master data (Master_Creditor.csv) or purchase order data (Master_Transaction.csv).
Stop the data-extraction services.
Click the Compile Master Data button.
➣ The master data is now compiled.
Start the data-extraction services.
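
The order of the steps above (stop, compile, start) can be sketched in Python as follows. This is only an illustration of the sequence: the three callables are hypothetical placeholders supplied by the caller, and nothing here calls a real xSuite Bus or IRIS interface. Restarting the services even when compilation fails is an assumption of this sketch, not a documented behavior.

```python
def compile_with_services_stopped(stop_services, compile_master_data, start_services):
    """Run the recommended order: stop, compile, start.

    All three arguments are hypothetical callables supplied by the caller.
    Returns the list of completed steps.
    """
    steps = []
    stop_services()
    steps.append("stopped")
    try:
        compile_master_data()
        steps.append("compiled")
    finally:
        # Assumption: restart even if compilation fails, so that data
        # extraction does not stay offline after an error.
        start_services()
        steps.append("started")
    return steps
```

The master data files should be updated before this sequence starts, as described in the first step of the procedure.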
