Archiving & Purging

21 Jan 2025

Retention policies for archiving, backing up files, and purging should also cover your Itential Platform and Itential Automation Gateway (IAG) assets. This article provides best practices and suggested methods for complying with such policies in on-premises Itential Platform and IAG installations.

⚠ Itential Platform is not a system of record. Itential strongly recommends that you specify a retention policy that suits your business needs.

Itential Platform

Itential Platform uses MongoDB to store application and instance data. Itential recommends backing up the MongoDB database regularly and storing the backup data in a separate location for safekeeping and recovery. Also retain the specific Itential Platform .bin file used for installation so that, should restoration be necessary, you can re-install to exact specifications.

Itential Automation Gateway

IAG maintains host files, scripts, playbooks, and custom modules on a Linux file system. Back up those file locations on a regular schedule using cron jobs.
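As a minimal sketch of such a scheduled backup, the shell function below bundles a set of directories into one timestamped archive. The function name `backup_iag_assets`, the archive naming, the script path, and the cron schedule are all assumptions for illustration, not Itential defaults; substitute the directories your IAG installation actually uses.

```shell
# Sketch: archive IAG asset directories into one timestamped .tar.gz file.
# backup_iag_assets and every path shown here are hypothetical.
backup_iag_assets() {
    local dest_dir="$1"
    local stamp archive
    shift                                   # remaining args: directories to archive
    [ "$#" -ge 1 ] || { echo "usage: backup_iag_assets DEST DIR..." >&2; return 1; }

    stamp="$(date +%Y%m%d-%H%M%S)"
    archive="${dest_dir}/iag-assets-${stamp}.tar.gz"

    mkdir -p "$dest_dir" || return 1
    tar -czf "$archive" "$@" || return 1
    echo "$archive"                         # print the archive path for logging
}

# Example crontab entry (added with `crontab -e`) that runs a wrapper script
# calling the function every day at 02:00; the script path is an assumption:
#   0 2 * * * /usr/local/bin/iag-backup.sh >> /var/log/iag-backup.log 2>&1
```

Printing the archive path lets a calling script copy the file to a separate storage location once the archive has been written.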

Data Storage Size

In the broader context of data preservation and ensuring accessible data backups:

Check the configuration and default directory size. The maximum storage size for backup data is often limited.
Avoid archiving long running workflows.
Make large job documents modular. In MongoDB, GridFS is used for storing files larger than 16 MB.

Log Rotation

Log data should be gathered from a rolling (rollover) log. As a recommended default, a rolling log should be distributed across 99 files over time (this number can be adjusted lower). To keep the logging system within specified space limits, rolled log files can be emptied (purged) whenever the log directory reaches a particular size and space is needed, or whenever the active log file generated by debug and trace has reached its file size threshold.
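The size-based purge described above can be sketched as follows. This is an illustration under stated assumptions rather than how Itential Platform manages its own logs: the function name `purge_rolled_logs`, the file pattern, and the byte limit are hypothetical, and the `-printf` option requires GNU find.

```shell
# Sketch: delete the oldest rolled log files until the files matching the
# pattern total no more than max_bytes. Requires GNU find (-printf).
purge_rolled_logs() {
    local log_dir="$1" pattern="$2" max_bytes="$3"
    local total

    # Sum the sizes (in bytes) of all rolled log files.
    total="$(find "$log_dir" -maxdepth 1 -type f -name "$pattern" -printf '%s\n' \
        | awk '{ sum += $1 } END { print sum + 0 }')"

    # List files as "mtime size path", oldest first, and remove them one at a
    # time until the running total fits within the limit.
    find "$log_dir" -maxdepth 1 -type f -name "$pattern" -printf '%T@ %s %p\n' \
        | sort -n \
        | while read -r mtime size path; do
              [ "$total" -le "$max_bytes" ] && break
              rm -f -- "$path"
              total=$(( total - size ))
          done
}

# Example (hypothetical path and pattern): keep rolled logs under 500 MB.
#   purge_rolled_logs /var/log/myapp 'app.log.*' 524288000
```

Choose a pattern that matches only rolled files (for example, names with a numeric suffix) so the active log file is never removed.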

Checklist

Use the following checklist to guide you through archiving and backup operations.

	Operation	Recommendations
☐	Take snapshot and dump MongoDB data.	- Can be the entire Mongo database, if no CI/CD process is in place to repair workflows, JSTs, or forms. - Can also be only jobs, tasks, Golden Config compliance collections, and Lifecycle Manager entities (2023.1). - All data files should be zipped and stored in a separate location.
☐	Set log rolling to 99 files.	- Rollover log threshold can be adjusted to a lower file number. - Configure a time-based policy to purge rollover logs.
☐	Archive IAG assets.	- A CI/CD process should be used to manage hosts, scripts, playbooks, etc. - All assets should be zipped and placed in a storage archive for snapshot purposes.
☐	Back up device inventory.	⚠ If using an internal inventory in IAG, back up the SQLite database that holds the device data.
☐	Set a frequency for retention and purging.	- 30 days is the minimum standard for purging stale data. - 60-90 days or longer is the minimum standard for retaining data, depending on storage, organizational policy, and business needs.
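A time-based purge such as the 30-day policy in the checklist can be sketched with `find`. The function name `purge_old_archives`, the `*.tar.gz` naming, and the example directory are assumptions for illustration; adjust the retention window to your own policy.

```shell
# Sketch: delete archive files older than a retention window.
# purge_old_archives and the *.tar.gz naming are hypothetical.
purge_old_archives() {
    local archive_dir="$1" retention_days="$2"
    # -mtime +N matches files whose age, rounded down to whole days,
    # is greater than N. Each deleted path is printed for the log.
    find "$archive_dir" -maxdepth 1 -type f -name '*.tar.gz' \
        -mtime +"$retention_days" -print -delete
}

# Example (hypothetical directory): purge archives older than 30 days.
#   purge_old_archives /backups/itential 30
```

Run the function with `-delete` removed first if you want to review which files would be purged.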

Collections

Itential recommends including the following MongoDB collections in a data backup.

Collection Name	Description
`job_history`	Number of jobs in the system over a period of time.
`job_output`	Output generated after running a workflow.
`jobs`	Job documents.
`wfe_job_metrics`	Workflow Engine metrics data for workflows.
`wfe_task_metrics`	Workflow Engine performance data on tasks within workflows.
`ucm_configs`	Configuration backups.
`ucm_compliance_reports`	Compliance reports.
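One way to dump exactly the collections listed above is to loop over their names and call `mongodump` once per collection. The sketch below assumes a hypothetical database name and output directory; the function name `dump_collections` and the `DRY_RUN` switch are illustrative additions, and connection options such as host and credentials are omitted.

```shell
# Collections recommended for backup in the table above.
BACKUP_COLLECTIONS="job_history job_output jobs wfe_job_metrics wfe_task_metrics ucm_configs ucm_compliance_reports"

# Sketch: run mongodump once per recommended collection.
# Set DRY_RUN=1 to print each command instead of executing it.
dump_collections() {
    local db_name="$1" out_dir="$2"
    local coll
    for coll in $BACKUP_COLLECTIONS; do
        ${DRY_RUN:+echo} mongodump --db="$db_name" --collection="$coll" \
            --out="$out_dir" || return 1
    done
}

# Example (hypothetical database name and path):
#   dump_collections itential /backups/itential/$(date +%Y%m%d)
```

Running it once with `DRY_RUN=1` is a cheap way to confirm the database and collection names before any data is written.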

Additional Collections

While additional collections from Lifecycle Manager (LCM) can be archived and cleaned (purged), the cadence can vary based on policy and business need. Some will find this type of data valid for several years, while others may only want to keep LCM data covering the last 12 months. Because LCM collections contain integral state data and resource models for network configuration, Itential strongly recommends that you specify which LCM collections to store, where they should be stored, and for how long.

Archiving Jobs (2023.2.0)

This section provides practices and recommendations as they relate to archiving jobs in the IP/2023.2 release.

Job Data Collections

In the 2023.2 release, all job variable data, as well as incoming and outgoing task data, has been moved out of the job and task collections.

Data smaller than 16 MB is stored in a collection called job_data.
Data larger than 16 MB is stored in a GridFS bucket in MongoDB, which uses two collections named job_data.chunks and job_data.files.

Job ID Required for Archiving

When archiving a job in the 2023.2 release:

The job_id is now required to retrieve all job_data documents for archiving.
If the job_id is not provided, the overwhelming majority of job-related data will remain in the database.

Job Data Collection

All data in the job_data collection can be found by querying on job_id. Below is an example of how a document in the job_data collection is currently structured.

{
    "_id" : ObjectId("657cc218a4f4b23e8083bff0"),
    "job_id" : "f04d7175b9d7452bb718c3c6",
    "data" : "example"
}
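Because job_id is stored as a plain string in the example document, one possible way to export a single job's job_data documents is `mongodump` with a `--query` filter. This is a sketch, not an Itential-provided procedure: the function name `dump_job_data`, the `DRY_RUN` switch, the database name, and the output path are assumptions, and data held in the GridFS bucket is not covered.

```shell
# Sketch: export the job_data documents that belong to one job.
# Set DRY_RUN=1 to print the command instead of executing it.
dump_job_data() {
    local db_name="$1" job_id="$2" out_dir="$3"
    local query
    # mongodump expects --query as Extended JSON with quoted field names.
    query="$(printf '{"job_id": "%s"}' "$job_id")"
    ${DRY_RUN:+echo} mongodump --db="$db_name" --collection=job_data \
        --query="$query" --out="$out_dir"
}

# Example (hypothetical database name and path):
#   dump_job_data itential f04d7175b9d7452bb718c3c6 /backups/jobs
```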

GridFS Bucket

For GridFS, the data is split between the job_data.chunks and job_data.files collections.

Data in job_data.files contains the metadata and all queryable information.

You will need to use the MongoDB driver and consult the official GridFS documentation to learn how to query the GridFS bucket and how to delete files.

Below is an example of the job_data.files document structure in the 2023.2 release. You can see that the job_id is located in metadata.job for the GridFS document.

{
    "_id" : ObjectId("657b6f14c928923a20266cc6"),
    "length" : 30000002,
    "chunkSize" : 261120,
    "uploadDate" : ISODate("2023-12-14T21:09:40.525Z"),
    "filename" : "3f0f8f8e-ded1-4c80-8c40-d5eb3657a018",
    "metadata" : {
        "job" : "24b0ea5ffbe646b181531585"
    }
}

Sample Commands

The following is a list of frequently used commands to support your backup strategy and preserve data.

MongoDB

To run mongodump:

mongodump --db=<old_db_name> --collection=<collection_name> --out=data/

To run mongorestore:

mongorestore --db=<new_db_name> --collection=<collection_name> data/<old_db_name>/<collection_name>.bson

Linux

To open the current user's crontab for editing:

crontab -e

To back up IAG inventory:

sqlite3 iag.sq3 ".backup 'inventory'"

To restore IAG inventory:

sqlite3 iag.sq3 ".restore 'inventory'"

Itential Pre-Builts

This pre-built allows Itential Platform users to archive the jobs and tasks collections for the 2023.1 and 2022.1 releases only: Archive Job Data.

A forthcoming pre-built will enable job archiving on 2023.2 and later versions; however, the target release has yet to be determined. Once it is available, this page will be updated to reflect the newer pre-built for archiving job data in subsequent versions of Itential Platform (2023.2+).
