The Input Step
A processing operation begins with the input step, in which data is transferred from a third-party system to xSuite Interface. For this purpose, the input worker actively polls the external system at set intervals to check whether new input data is available. In the simplest case, the technical connection to the third-party system works through a shared file exchange directory (i.e., without using a programming interface for a specific system). The same applies to systems that are connected on the output side.
Data that is found is read one data object at a time. If the source system provides the data as a batch, it is read batch by batch. A batch is either accepted in full or rejected as a whole: if any document in the batch causes an error, the entire batch is rejected. If the batch is read successfully, administrative records are created in the database for the batch and its constituent documents. The subsequent processing steps work with these administrative records. If file attachments are present, they are copied to internal storage.
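The all-or-nothing handling of a batch can be illustrated with a minimal Python sketch. It is not the xSuite Interface implementation: the names `Batch`, `BatchRejected`, and `read_batch`, the dictionary standing in for the database, and the use of `None` to simulate a read error are all assumptions made for illustration. The batch-level error entry follows the input worker behavior described in this section.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Batch:
    batch_id: str
    documents: dict[str, bytes] = field(default_factory=dict)


class BatchRejected(Exception):
    """Raised when at least one document of a batch cannot be read."""


def read_batch(batch_id: str, items: dict[str, Optional[bytes]], database: dict) -> Batch:
    """Read all documents; create document records only if every read succeeds."""
    documents = {}
    for name, raw in items.items():
        if raw is None:  # hypothetical stand-in for any read error
            # Batch-level entry with error status, no document records.
            database[batch_id] = {"status": "error", "documents": []}
            raise BatchRejected(f"batch {batch_id!r}: cannot read {name!r}")
        documents[name] = raw
    # Administrative records for the batch and its constituent documents.
    database[batch_id] = {"status": "read", "documents": list(documents)}
    return Batch(batch_id, documents)
```

Because the document records are written only after the loop has finished, a single failing document leaves no partial batch behind.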
When using an Input Web Service instead of an input worker, bear these two considerations in mind:
The input worker generates a batch-level database entry even if reading the batch fails. This entry has an error status and serves only to restart the batch reading process. The administrator must always reset the batch status before batch reading can be restarted. The content should be corrected before the reset.
The Input Web Service, on the other hand, returns the error message directly to the calling process, and the calling process must resend the batch.
The input worker sets a flag directly in the input system to reflect the current processing status of the batch in xSuite Interface. The prerequisite for this is that the input system supports setting this kind of flag. If an exchange directory in the file system is used, for example, the input worker generates an additional, empty status file with a corresponding file extension. This status file also prevents the system from reading the same data multiple times: all data objects that are flagged as having been read are ignored. The data therefore does not have to be moved or deleted immediately after import. It can remain at the location from which it was transferred until it has been completely processed in xSuite Interface. If no final backup of the data is configured, the data can also remain there until after processing has been completed.
The Input Web Service, on the other hand, does not set an indicator of this kind. It only signals to the calling process whether the data was received successfully. The calling process does not receive any active feedback on the status of the downstream asynchronous processing flow in xSuite Interface. It can query this status only via the separate Status Web Service.
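The status-file mechanism described for an exchange directory can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's code: the extension `.read` and the function names are hypothetical, since the actual file extension is not specified here.

```python
from pathlib import Path

STATUS_SUFFIX = ".read"  # hypothetical extension; the real one is product-specific


def status_file_for(data_file: Path) -> Path:
    """Path of the marker file that sits next to a data file."""
    return data_file.with_name(data_file.name + STATUS_SUFFIX)


def find_unread(exchange_dir: Path) -> list[Path]:
    """Return data files that have no status file next to them."""
    return sorted(
        p for p in exchange_dir.iterdir()
        if p.is_file()
        and p.suffix != STATUS_SUFFIX          # skip the marker files themselves
        and not status_file_for(p).exists()    # skip data already flagged as read
    )


def mark_as_read(data_file: Path) -> None:
    # Empty marker file; the data file itself stays where it is.
    status_file_for(data_file).touch()
```

A polling loop would call `find_unread` at each interval and `mark_as_read` after each successful read, so the source files never need to be moved to avoid duplicate imports.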
Only the raw data of the batches and the associated documents is read. Because the content is not yet interpreted at this stage, the data can be transferred successfully even if its content is incorrect. Faulty data can then be edited on the basis of the records that have been read into the database, rather than having its receipt rejected outright.
This approach does not, however, work for all configurations. If any of the functionalities described below are used, access to the content of the data is required during the reading process. For example, if the incoming data is XML, normally only the raw XML data stream is adopted, and the content is not read until the downstream processing step. However, because the input step can split the data into separate documents and extract embedded file attachments, basic syntax errors in the XML structure already take effect in the input step.
Reading the data proceeds in two stages. In the reading process, the input system and the input format are independent of each other: input into the system proceeds independently of the identification of the format. For example, an XRechnung from the file system is not transmitted via a specialized input interface in which document input depends on the combination of the XRechnung data format and the transmission path (i.e., the file system).
Instead, the new xSuite Interface has various general interfaces to input systems (e.g., file system, email, or database). These interfaces can deliver any input format supported by the program. The interpretation of the data format does not take place until the document is further downstream. So the way in which an XRechnung arrives – whether read from a file system folder or transmitted as an attachment to an email – has no bearing on downstream processing.
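The separation of input system and input format can be pictured with a small Python sketch. All names here (`RawObject`, `from_file_system`, `from_email_attachment`, `FORMAT_HANDLERS`, `process`) are hypothetical and exist only to show the idea. The real interfaces are not described in this section.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RawObject:
    source: str     # which input system delivered the data, e.g. "file" or "email"
    payload: bytes  # uninterpreted content


def from_file_system(payload: bytes) -> RawObject:
    return RawObject("file", payload)


def from_email_attachment(payload: bytes) -> RawObject:
    return RawObject("email", payload)


def interpret_xml(payload: bytes) -> str:
    # Placeholder for the downstream interpretation of an XML-based format.
    return "xml:" + payload.decode("utf-8")


FORMAT_HANDLERS: dict[str, Callable[[bytes], str]] = {"xml": interpret_xml}


def process(raw: RawObject, input_format: str) -> str:
    # The handler is chosen by format only; raw.source plays no role here.
    return FORMAT_HANDLERS[input_format](raw.payload)
```

In this sketch, the same payload yields the same result whether it arrived through the file system or as an email attachment, which is the property the text describes.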
Reading the Raw Data
First of all, the input worker reads the raw data for each new object provided from the configured input system. If the data is not structured as a batch but in the form of individual documents, an artificial pseudo-batch is implicitly generated from it.
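As a rough illustration of the implicit pseudo-batch, the following Python sketch wraps a single document in a batch of one. The `Batch` type, the `is_pseudo` flag, and the generated identifier are assumptions for illustration only.

```python
import uuid
from dataclasses import dataclass
from typing import Union


@dataclass
class Batch:
    batch_id: str
    documents: list[bytes]
    is_pseudo: bool = False


def as_batch(item: Union[Batch, bytes]) -> Batch:
    """Return real batches unchanged; wrap a single document in a pseudo-batch."""
    if isinstance(item, Batch):
        return item
    # Hypothetical ID scheme: the real system assigns its own identifiers.
    return Batch(batch_id=f"pseudo-{uuid.uuid4().hex}", documents=[item], is_pseudo=True)
```

Downstream steps can then treat every input uniformly as a batch, regardless of how the source system delivered it.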
If the Input Web Service rather than an input worker serves as an input channel, there is no differentiation between input systems. The web service itself represents a specific input system.
Directly after reading, a duplicate check can be performed if desired. Batches that are considered already read according to certain criteria are rejected.
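The criteria for the duplicate check are not specified here. One possible approach, shown purely as an assumption, is to fingerprint the raw content of a batch with SHA-256 and reject any batch whose fingerprint has been seen before. The class and method names are hypothetical.

```python
import hashlib


class DuplicateChecker:
    """Rejects batches whose raw content has already been seen (assumed criterion)."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def fingerprint(self, documents: list[bytes]) -> str:
        digest = hashlib.sha256()
        for doc in documents:
            # Length prefix keeps [b"ab", b"c"] distinct from [b"a", b"bc"].
            digest.update(len(doc).to_bytes(8, "big"))
            digest.update(doc)
        return digest.hexdigest()

    def accept(self, documents: list[bytes]) -> bool:
        """Return True for new content, False for a duplicate."""
        key = self.fingerprint(documents)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A production check would persist the fingerprints rather than hold them in memory, and might compare other attributes, such as file names or message IDs, instead of content.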
Processing by Input Format
If necessary, the documents of the batch are processed according to the requirements of the input format. For certain formats (e.g., JSON and XML), this processing may include additional separation into individual documents. This is the case when the input system cannot perform the separation itself, because the format must first be interpreted. For example, a JSON file or an XML file that was initially read in by the input system as a single document may actually contain multiple child objects, each of which is processed downstream as a separate document.
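Format-specific separation can be sketched with the Python standard library. The element name `invoice`, the key `invoices`, and the function names are hypothetical. The real separation rules depend on the configured input format.

```python
import json
import xml.etree.ElementTree as ET


def split_xml(raw: bytes, child_tag: str) -> list[bytes]:
    """Split one XML document into one document per direct child element."""
    root = ET.fromstring(raw)  # raises ET.ParseError on malformed XML
    return [ET.tostring(child, encoding="utf-8") for child in root.findall(child_tag)]


def split_json(raw: bytes, key: str) -> list[bytes]:
    """Split one JSON document into one document per item of the array under `key`."""
    data = json.loads(raw)
    return [json.dumps(item).encode("utf-8") for item in data[key]]
```

Because `ET.fromstring` has to parse the whole structure before anything can be split, a basic syntax error surfaces at this point, even though the business content of each child is still not interpreted.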
In addition, file attachments that are either embedded or externally referenced can be extracted or read. How this proceeds depends on the format. The prerequisite is that the format offers basic support for the extraction of attachments. For example, a native email interface that serves as the input system already has attachment extraction built in. If, on the other hand, emails are provided as EML files via the file system, the attachments to these files need to be extracted separately via the corresponding input format. If an input system provides data in different formats, whether alternately or combined in one document, multiple input formats can also be configured to process the same documents.
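For the EML case, attachment extraction can be sketched with Python's standard `email` package. This is an illustration only, and the function name `extract_attachments` and the fallback name `unnamed` are assumptions.

```python
from email import message_from_bytes, policy


def extract_attachments(eml_bytes: bytes) -> list[tuple[str, bytes]]:
    """Return (filename, content) pairs for all attachments of an EML file."""
    # policy.default makes the parser return an EmailMessage,
    # which provides iter_attachments().
    msg = message_from_bytes(eml_bytes, policy=policy.default)
    attachments = []
    for part in msg.iter_attachments():
        filename = part.get_filename() or "unnamed"
        # decode=True reverses the transfer encoding (e.g., base64).
        attachments.append((filename, part.get_payload(decode=True)))
    return attachments
```

With a native email interface this step would not be needed, because the input system already delivers the attachments separately.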
Splitting Documents
As an optional final step, the documents in the batch can be restructured. In this process, the documents are divided to create several separate documents. This is done based on certain criteria derived from the file attachments.
For example, by default, an email is treated as a batch consisting of a single document with, in some cases, multiple attachments. If these attachments are to be processed as independent documents, they can be separated at this point.
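The following Python sketch shows one way such a split could work. The `Document` type, the `split_document` function, and the criterion used in the usage example (file extension `.pdf`) are hypothetical. The actual criteria are configured in the product.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    name: str
    attachments: list[str] = field(default_factory=list)


def split_document(doc: Document, is_independent: Callable[[str], bool]) -> list[Document]:
    """Move each attachment matching the criterion into a document of its own."""
    remaining = [a for a in doc.attachments if not is_independent(a)]
    split_off = [
        Document(name=a, attachments=[a])
        for a in doc.attachments
        if is_independent(a)
    ]
    # The original document keeps the attachments that were not split off.
    return [Document(doc.name, remaining)] + split_off
```

For an email with two PDF invoices and a logo image, calling `split_document(mail, lambda a: a.lower().endswith(".pdf"))` returns three documents within the same batch: the original email with the logo, plus one document per PDF.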
Later in the processing flow, once the administrative records for a batch and its documents have been created in the database, this kind of restructuring is no longer provided for.
Notice
The division of an entire batch into separate batches is not supported (the 1:1 relationship between input data and batch records in xSuite Interface would be lost).