The "Scheduler" configuration node
The scheduler is a Windows service that executes tasks in the background. You can define scheduler jobs and scheduler triggers. Each job represents a specific task that is executed in the background.
The triggers contain the specific parameters that are required to execute the jobs. The parameters relate to the time of execution and the directory that is monitored for an import, among other things. The triggers are specified in the form of a cron expression.
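For example, a trigger whose execution time is defined by the cron expression `*/15 * * * *` would start its job every 15 minutes; this value is purely illustrative, and the exact cron syntax that is supported depends on the scheduler configuration.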
Notice
You can view the status of the defined jobs in the task control.
Notice
The basic configuration of default jobs is automatically created in the configuration database when the scheduler service is started for the first time. Changes to the configuration take effect either immediately or the next time the job is executed in the scheduler.
MonitorImportJob
At regular intervals, the "MonitorImportJob" monitors an import directory located locally on the archive server. If new data is stored in this directory, the directory will be recursively searched for import.job files. These files contain the parameters for import. The files are checked, and the corresponding locks (import.lock) are created in the file system. The files are prepared and checked in the import database so that the "ImportWorkerJobs" can carry out the import asynchronously and in parallel.
If the "MonitorImportJob" finds new subfolders, these folders will be read in and processed. When a job is read in, the job creates an import.lock file in the folder. This file signals that the job is being processed and no longer needs to be imported. The job checks the import data and creates an import request with the status "Pending" in the import database. The job divides the data into smaller units ("chunks") for processing.
An import always consists of a job definition and chunks. The chunks contain a defined number of documents.
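The following Python sketch illustrates the chunking step described above; the function, the document list, and the import request structure are hypothetical stand-ins for the scheduler's internal implementation.

```python
# Illustration only: split the documents of one import job into chunks of
# ChunkSize and record an import request with the status "Pending".
def split_into_chunks(document_ids, chunk_size):
    return [document_ids[i:i + chunk_size]
            for i in range(0, len(document_ids), chunk_size)]

documents = [f"doc-{n}" for n in range(10)]          # hypothetical document IDs
chunks = split_into_chunks(documents, chunk_size=4)  # ChunkSize = 4
import_request = {"status": "Pending", "chunks": chunks}
print(len(chunks))  # -> 3 chunks (4 + 4 + 2 documents)
```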
Parameters
The following job parameters are available:
| Name | Type | Default value | Description |
|---|---|---|---|
| ChunkSize | Integer | | Number of documents per part of an import request (chunk) |
| MonitoredDirectory | String | | Monitoring folder for bulk imports |
ImportWorkerJob
At regular intervals, the "ImportWorkerJob" reads the import database for imports that have not yet been processed. If new data is available, a worker will retrieve a part (chunk) of an import job, lock this chunk and import all the documents it contains. The chunks are processed according to priority and creation time.
After processing, the status of the documents and the import chunk is updated in the import database. An entry is made in the monitoring log. Documents with errors are flagged.
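As a rough illustration of this selection and locking logic, the following Python sketch picks the next unprocessed chunk by priority and creation time; the `Chunk` structure and the `import_document` callback are assumptions, not the actual interface.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class Chunk:
    priority: int
    created: datetime
    documents: list
    locked_by: Optional[str] = None
    status: str = "Pending"
    failed: list = field(default_factory=list)

def process_next_chunk(chunks, worker_id, import_document):
    candidates = [c for c in chunks if c.locked_by is None and c.status == "Pending"]
    if not candidates:
        return None
    # chunks are processed according to priority and creation time
    chunk = min(candidates, key=lambda c: (c.priority, c.created))
    chunk.locked_by = worker_id                  # lock the chunk for this worker
    for doc in chunk.documents:
        try:
            import_document(doc)
        except Exception:
            chunk.failed.append(doc)             # documents with errors are flagged
    chunk.status = "Error" if chunk.failed else "Done"
    return chunk
```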
Notice
To process incorrect documents again, reset the chunk in the database manually.
Parameters
The following job parameters are available:
| Name | Type | Default value | Description |
|---|---|---|---|
| WorkerId | String | | Unique name of the "ImportWorkerJob" |
ImportMaintenanceJob
The "ImportMaintenanceJob" checks the completeness of the bulk imports at regular intervals. If no errors are found, the data that is no longer required will be deleted from the file system. The import request data will then be flagged as archived.
If any errors are found, an email with an error report is sent to the email address specified in the configuration under SystemSettings → DefaultSettings → Reporting.
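A condensed sketch of this maintenance pass is shown below; the plain dictionaries and callbacks stand in for the import database, the file system cleanup, and the error report, which are not part of the public interface.

```python
def run_maintenance(import_requests, send_error_report, delete_import_files):
    for request in import_requests:
        if request.get("error_count", 0) > 0:
            send_error_report(request)           # errors found: mail the error report
        else:
            delete_import_files(request)         # remove data that is no longer required
            request["status"] = "Archived"       # flag the import request as archived

# Example usage with no-op callbacks:
requests = [{"id": 1, "error_count": 0}, {"id": 2, "error_count": 3}]
run_maintenance(requests, send_error_report=print, delete_import_files=lambda r: None)
print(requests[0]["status"])  # -> Archived
```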
IndexerSchedulerJob
The "IndexerSchedulerJob" performs the full-text indexing. Full-text indexing takes place asynchronously after import if it has been defined this way. When importing into the "Indexjobber" database, the archive server creates entries that are processed by the job instances. There is a temporary entry in the "Indexjobber" database for each document to be indexed.
If indexing fails, a maximum of three additional indexing attempts will be made for the document.
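The retry behavior can be pictured as in the following sketch: in addition to the first attempt, at most three further attempts are made, after which the entry stays locked with its error counter until it is reset manually. The entry fields and the `index_document` callback are hypothetical.

```python
MAX_ADDITIONAL_ATTEMPTS = 3

def index_with_retries(entry, index_document):
    while True:
        try:
            index_document(entry["document_id"])
            return True                                   # indexing succeeded
        except Exception:
            entry["error_count"] = entry.get("error_count", 0) + 1
            if entry["error_count"] > MAX_ADDITIONAL_ATTEMPTS:
                entry["locked"] = True                    # stays locked until reset manually
                return False
```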
Notice
To start a new indexing attempt, manually remove the error counter and the document lock.
Parameters
The following job parameters are available:
| Name | Type | Default value | Description |
|---|---|---|---|
| BulkSize | Number | -- | The number of documents indexed asynchronously at the same time |
| Tenants | Text | Blank | Comma-separated list of tenants, or empty |
IndexMigrationJob
The "IndexMigrationJob" migrates an Elasticsearch-2.x index to an Elasticsearch 7 index. Technically, the Elasticsearch 2 index of each archive in the tenant is migrated iteratively to an Elasticsearch 7 index. The job is repeated until all documents have been migrated to the Elasticsearch 7 index.
The archive can be used without restrictions during the migration. A temporary index is created for each index. The temporary index is deleted as soon as the migration is completed. Archives that are being migrated and archives that have already been migrated will receive the configuration parameter Migrated with the value InIndexMigration.
Once the migration is 100% complete, the job will automatically replace the Elasticsearch 2 index with the Elasticsearch 7 index. From that point on, only the Elasticsearch 7 index will be used.
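The order of steps can be summarized as in the following sketch; the helper callbacks are placeholders for the actual Elasticsearch operations and only illustrate the iterative copying and the final switchover.

```python
def migrate_archive_index(archive, bulk_size, copy_batch, switch_to_new_index):
    """Iteratively copy an archive's Elasticsearch 2 index into a new
    Elasticsearch 7 index, then switch over once everything is migrated."""
    migrated = 0
    while True:
        copied = copy_batch(archive, bulk_size)   # migrate up to BulkSize documents
        if copied == 0:                           # all documents have been migrated
            break
        migrated += copied
    switch_to_new_index(archive)                  # ES7 index replaces the ES2 index
    return migrated

# Example usage with a fake batch source of 25 documents:
remaining = [25]
def fake_copy(archive, bulk_size):
    n = min(remaining[0], bulk_size); remaining[0] -= n; return n
print(migrate_archive_index("invoices", 10, fake_copy, lambda a: None))  # -> 25
```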
Caution
When migrating nested archives, make sure that each archive has an index.
Subordinate archives of a node that do not have an index will not be migrated.
Parameters
The following job parameters are available:
| Name | Type | Description |
|---|---|---|
| BulkSize | Number | The number of archive documents that are migrated at the same time |
| MaximumMigrationJobs | Number | The number of archive indexes that are migrated at the same time |
| Tenants | Text | Comma-separated list of the tenants to be migrated. If the parameter value is empty, the default tenant will be migrated. |
TransferJob
"TransferJob" searches the configured archives for documents that are older than a configured time period. If enough documents are available for a transfer (minimum container size), these documents are written to a container and transferred to the "EndArchived" status.
Prerequisite: A shard of the ContainerBox type is present in the archive that is being processed.
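The selection and container-building rule can be sketched as follows; the document records, their fields, and the day-based cut-off are illustrative assumptions, since the exact TimeSpan syntax is not reproduced here.

```python
from datetime import datetime, timedelta

def build_container(documents, min_size_mb, max_size_mb, older_than_days):
    cutoff = datetime.now() - timedelta(days=older_than_days)
    eligible = [d for d in documents if d["archived_at"] < cutoff]
    container, size_mb = [], 0.0
    for doc in eligible:
        if size_mb + doc["size_mb"] > max_size_mb:
            break                                # respect the maximum container size
        container.append(doc)
        size_mb += doc["size_mb"]
    if size_mb < min_size_mb:
        return None                              # not enough data for a transfer yet
    for doc in container:
        doc["status"] = "EndArchived"            # documents written to the container
    return container
```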
Parameters
The following job parameters are available:
| Name | Type | Default value | Description |
|---|---|---|---|
| Archives | Text | -- | Comma-separated list of archives that the "TransferJob" processes. If the field is empty, all archives will be checked. |
| MaxSizeMB | Number | | Maximum size of the containers that are created (in MB) |
| MinSizeMB | Number | | Minimum size of the containers that are created (in MB) |
| TimeSpan | Text | -- | Time period; only documents older than this period are transferred. |
| Tenant | Text | -- | Tenant. If the field is not available, the default tenant will be used. |
LogArchiverSchedulerJob
The "LogArchiverSchedulerJob" archives log entries that are older than a configured value as a JSON structure in an archive document. If this is configured, full-text indexing will also be performed.
Parameters
The following job parameters are available:
| Name | Type | Default value | Description |
|---|---|---|---|
| Archive | Text | | Archive to be written to |
| EntriesPerDoc | Number | | The number of log entries that are combined into one archive document |
| Timespan | Number | | The number of days after which archiving takes place |
| Tenant | Text | Blank | Tenant, or empty |
RetentionJob
The "RetentionJob" searches the configured archives for documents whose standard expiry date has expired. These documents are deleted. The standard expiry date is defined via the Retention archive property or the Retention document type property.
Notice
Documents that have a legal hold are not deleted.
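The deletion criteria can be expressed as in the following sketch: a document qualifies only if its retention date has expired and no legal hold is set. The field names are hypothetical.

```python
from datetime import datetime

def expired_documents(documents, now=None):
    now = now or datetime.now()
    return [
        doc for doc in documents
        if doc["retention_until"] < now           # standard expiry date has passed
        and not doc.get("legal_hold", False)      # documents with a legal hold are kept
    ]

docs = [
    {"id": "a", "retention_until": datetime(2020, 1, 1), "legal_hold": False},
    {"id": "b", "retention_until": datetime(2020, 1, 1), "legal_hold": True},
]
print([d["id"] for d in expired_documents(docs)])  # -> ['a']
```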
Parameters
The following job parameters are available:
| Name | Type | Default value | Description |
|---|---|---|---|
| Archives | Text | -- | Comma-separated list of archives that the "RetentionJob" processes |
| Tenant | Text | -- | Tenant. If the field is not available, the default tenant will be used. |
TempCleanup
The "TempCleanup" job cleans up the directory in which the temporary files are stored at regular intervals.
Notice
You can configure the directory under SystemSettings → TempFiles.
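A minimal sketch of the cleanup rule, using only the Python standard library; the directory path is a placeholder and ExpireHours corresponds to the parameter listed below.

```python
import os
import time

def cleanup_temp_directory(directory, expire_hours=24):
    cutoff = time.time() - expire_hours * 3600
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # delete files whose age exceeds ExpireHours
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
```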
Parameters
The following job parameters are available:
| Name | Type | Default value | Description |
|---|---|---|---|
| ExpireHours | Number | 24 | Age of the file in hours. If a file exceeds this age, the file will be deleted. |
ReplicaJob
The "ReplicaJob" transfers all documents that are to be replicated to the foreign servers (ForeignServers), executing and saving all replications to local archives.
The aim of replication is to ensure data security and that data is not lost in the case of downtime. In an archive with a replica configuration, a randomly generated value (change token) is written to the document with each write operation. If the document has been replicated correctly, the change token will also be available in the replication. The change token indicates that the master and the slave replications are the same.
If the Check replication property is activated in the archive configuration, a replication check is also carried out. A replication check can also be carried out for an archive. During the replication check, the master archive and the slave archive are compared with each other.
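The change-token mechanism can be pictured as in the following sketch: every write stores a new random token on the master and in the replication, and a replication check simply compares the two. The record layout is hypothetical.

```python
import uuid

def write_document(master, replicas, content):
    token = uuid.uuid4().hex                       # randomly generated change token
    master.update(content=content, change_token=token)
    for replica in replicas:
        replica.update(content=content, change_token=token)

def is_replicated_correctly(master, replica):
    # matching tokens indicate that master and slave hold the same version
    return master.get("change_token") == replica.get("change_token")

master, replica = {}, {}
write_document(master, [replica], content="v1")
print(is_replicated_correctly(master, replica))    # -> True
```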
Parameters
The following job parameters are available:
| Name | Type | Default value | Description |
|---|---|---|---|
| JobSize | Number | | Maximum number of documents that can be duplicated in one run. If no value is specified, the "ReplicaJob" will duplicate all documents. |
| BatchSize | Number | | Number of documents that are replicated locally in a batch action |
| BulkSize | Number | | Size of the data block (in MB) that is transferred to a ForeignServer |
| ForeignSize | Number | | Maximum number of documents that can be transferred to a ForeignServer in one batch |
ImportConverterJob
The "ImportConverterJob" converts import files into JSON format. The converted data can be archived after conversion using a standard file import job. The conversion process does not include any content checks of the data.