Archiving & Purging
  • 15 May 2024

Article summary

Retention policies for archiving, backing up files, and purging should also cover your IAP and IAG assets. This article provides best practices and suggested methods for complying with such policies in on-premises IAP/IAG installations.

The Itential platform is not a system of record. Itential strongly recommends that you specify a retention policy that suits your business needs.

Itential Automation Platform (IAP)

IAP uses MongoDB to store application and instance data. Itential recommends performing regular backups of the MongoDB database and storing the backup data in a separate location for safekeeping and recovery. Also retain the specific IAP .bin file used for installation so that the platform can be reinstalled to exact specifications should restoration be necessary.

Itential Automation Gateway (IAG)

IAG maintains host files, scripts, playbooks, and custom modules on a Linux file system. Back up those file locations on a regular schedule using cron jobs.
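
As a minimal sketch of such a scheduled backup, the Python function below bundles a set of directories into one timestamped .tar.gz archive. The function name, the archive prefix, and the directory arguments are illustrative assumptions and not part of IAG; substitute the locations your installation actually uses.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def archive_directories(sources, dest_dir, prefix="iag-assets"):
    """Bundle the given directories into one timestamped .tar.gz archive.

    `sources` and `dest_dir` are placeholders; point them at the host,
    script, playbook, and module directories your installation uses.
    Returns the path of the archive that was written.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive_path = dest / f"{prefix}-{stamp}.tar.gz"
    with tarfile.open(archive_path, "w:gz") as tar:
        for source in map(Path, sources):
            if source.is_dir():
                # Store each tree under its directory name, not its full path.
                # Two sources with the same name would overlap in the archive.
                tar.add(source, arcname=source.name)
    return archive_path
```

A script that calls this function could then be scheduled from cron, for example with an entry such as `0 2 * * * /usr/bin/python3 /opt/backup/iag_backup.py` to run nightly at 02:00 (the script path is hypothetical).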

Data Storage Size

To keep backups manageable and accessible:

  • Check the configuration and default directory size. The maximum storage size for backup data is often limited.
  • Avoid archiving long-running workflows.
  • Make large job documents modular. MongoDB limits a single document to 16 MB, and GridFS is used for storing files larger than that.

Log Rotation

Gather log data from a rolling (rollover) log. As a recommended default, distribute the rolling log across 99 files over time (this number can be lowered). To keep the logging system within its space limits, purge rolled log files whenever the log directory reaches a set size and space is needed, or whenever the active log file generated by debug and trace reaches its file size threshold.
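
The size-based purge described above can be sketched as follows. This is an illustration of the rule only, not how IAP implements rotation; the function name, the `active.log` file name, and the byte limit are assumptions.

```python
from pathlib import Path


def purge_rolled_logs(log_dir, max_total_bytes, active_log="active.log"):
    """Delete the oldest rolled log files until the directory fits the limit.

    The active log file is never removed. Returns the names of the
    deleted files, oldest first.
    """
    directory = Path(log_dir)
    files = [p for p in directory.iterdir() if p.is_file()]
    total = sum(p.stat().st_size for p in files)
    # Rolled files only, ordered from oldest to newest modification time.
    rolled = sorted(
        (p for p in files if p.name != active_log),
        key=lambda p: p.stat().st_mtime,
    )
    deleted = []
    for path in rolled:
        if total <= max_total_bytes:
            break
        total -= path.stat().st_size
        path.unlink()
        deleted.append(path.name)
    return deleted
```

Deleting oldest-first keeps the most recent history available for troubleshooting while still bringing the directory back under its limit.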

Checklist

Use the following checklist as an aid to help guide you through the process of archiving and performing backup operations.

Operation and Recommendations

  • Take a snapshot and dump MongoDB data.
      - Can be the entire MongoDB database if no CI/CD process is in place to repair workflows, JSTs, or forms.
      - Can also be only jobs, tasks, Golden Config compliance collections, and Lifecycle Manager entities (2023.1).
      - All data files should be zipped and stored in a separate location.
  • Set log rolling to 99 files.
      - Rollover log threshold can be adjusted to a lower file number.
      - Configure a time-based policy to purge rollover logs.
  • Archive IAG assets.
      - A CI/CD process should be used to manage hosts, scripts, playbooks, etc.
      - All assets should be zipped and placed in a storage archive for snapshot purposes.
  • Back up device inventory.
      - If using an internal inventory in IAG, back up the SQLite database that holds the device data.
  • Set a frequency for retention and purging.
      - 30 days is the minimum standard for purging stale data.
      - 60-90 days or longer is the minimum standard for retaining data, depending on storage, organizational policy, and business needs.
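
A time-based purge of the kind listed in the checklist might look like the sketch below. The function name and the choice of file modification time as the age measure are assumptions; the number of days should come from your own retention policy.

```python
import time
from pathlib import Path


def purge_older_than(directory, max_age_days, now=None):
    """Delete files in `directory` last modified more than `max_age_days` ago.

    `now` is a Unix timestamp and defaults to the current time.
    Returns the names of the removed files in sorted order.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

For example, calling `purge_older_than("/var/backups/iap", 90)` (a hypothetical archive directory) would apply a 90-day retention window to stored archives.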

Collections

Itential recommends the following list of MongoDB collections be included in a data backup.

Collection Name          Description
job_history              Number of jobs in the system over a period of time.
job_output               Output generated after running a workflow.
jobs                     Job documents.
wfe_job_metrics          Workflow Engine metrics data for workflows.
wfe_task_metrics         Workflow Engine performance data on tasks within workflows.
ucm_configs              Configuration backups.
ucm_compliance_reports   Compliance reports.

Additional Collections

Additional collections from Lifecycle Manager (LCM) can also be archived and purged, but the cadence varies with policy and business need. Some organizations treat this data as valid for several years, while others keep only the last 12 months of LCM data. Because LCM collections contain integral state data and resource models for network configuration, Itential strongly recommends that you specify which LCM collections to store, where to store them, and for how long.

Archiving Jobs (2023.2.0)

This section provides practices and recommendations as they relate to archiving jobs in the 2023.2 IAP release.

Job Data Collections

All job variable data in IAP 2023.2, as well as incoming and outgoing task data, has been moved out of the job and task collections.

  • Data smaller than 16 MB is stored in a collection called job_data.
  • Data larger than 16 MB is stored in a GridFS bucket in MongoDB, which uses two collections named job_data.chunks and job_data.files.

Job ID Required for Archiving

When archiving a job in the 2023.2 release:

  • The job_id is now required to retrieve all job_data documents for archiving.
  • If the job_id is not provided, the overwhelming majority of job-related data will remain in the database.

Job Data Collection

All data in the job_data collection can be found by querying on job_id. Below is an example of how a document in the job_data collection is currently structured.

{
    "_id" : ObjectId("657cc218a4f4b23e8083bff0"),
    "job_id" : "f04d7175b9d7452bb718c3c6",
    "data" : "example"
}
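
One way to archive these documents is to export everything matching a job_id and then delete it, as in the sketch below. This is not an Itential-provided tool. The function name and the JSON Lines output format are assumptions, and `job_data` is expected to behave like a pymongo Collection (its `find` and `delete_many` methods); opening the connection is left to the caller.

```python
import json


def archive_job_data(job_data, job_id, out_path):
    """Write every job_data document for `job_id` to a JSON Lines file,
    then delete those documents. Returns (archived, deleted) counts.

    Assumes job_id is stored as a string, as in the example document.
    """
    query = {"job_id": job_id}
    archived = 0
    with open(out_path, "w", encoding="utf-8") as handle:
        for document in job_data.find(query):
            # default=str renders ObjectId and datetime values as strings;
            # use bson.json_util instead if exact BSON types must round-trip.
            handle.write(json.dumps(document, default=str) + "\n")
            archived += 1
    # Delete only after the export file has been written and closed.
    deleted = job_data.delete_many(query).deleted_count
    return archived, deleted
```

With pymongo, the collection could be obtained with something like `MongoClient("mongodb://localhost:27017")["itential"]["job_data"]`, where the connection string and database name are placeholders. Compare the two returned counts before discarding anything, since documents written between the export and the delete would be removed without being archived.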

GridFS Bucket

For GridFS, the data is split between the job_data.chunks and job_data.files collections.

Data in job_data.files contains the metadata and all queryable information.

To query the GridFS bucket or delete files from it, use a MongoDB driver and consult the official GridFS documentation.

Below is an example of the job_data.files document structure in the 2023.2 release. You can see that the job_id is located in metadata.job for the GridFS document.

{
    "_id" : ObjectId("657b6f14c928923a20266cc6"),
    "length" : 30000002,
    "chunkSize" : 261120,
    "uploadDate" : ISODate("2023-12-14T21:09:40.525Z"),
    "filename" : "3f0f8f8e-ded1-4c80-8c40-d5eb3657a018",
    "metadata" : {
        "job" : "24b0ea5ffbe646b181531585"
    }
}
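
As a rough sketch of that driver-based approach, the function below downloads every GridFS file whose metadata.job matches a job ID and then deletes it. The function name and output file naming are assumptions, and `bucket` is expected to behave like pymongo's `gridfs.GridFSBucket` (its `find`, `download_to_stream`, and `delete` methods).

```python
from pathlib import Path


def archive_gridfs_job_files(bucket, job_id, out_dir):
    """Download every GridFS file tagged with `job_id`, then delete it.

    Returns the paths of the files written to `out_dir`.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    # In job_data.files the job ID is stored under metadata.job.
    # Collect the matches first so nothing is deleted while iterating.
    matches = [(f._id, f.filename) for f in bucket.find({"metadata.job": job_id})]
    written = []
    for file_id, filename in matches:
        path = target / f"{job_id}-{filename}"
        with open(path, "wb") as handle:
            bucket.download_to_stream(file_id, handle)
        # Removes the files document and its associated chunks.
        bucket.delete(file_id)
        written.append(path)
    return written
```

With pymongo, the bucket could be opened with `gridfs.GridFSBucket(db, bucket_name="job_data")`, where `db` is a Database handle for your IAP database.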

Sample Commands

The following is a list of frequently used commands to support your backup strategy and preserve data.

MongoDB

To run mongodump:

mongodump --db=<old_db_name> --collection=<collection_name> --out=data/

To run mongorestore:

mongorestore --db=<new_db_name> --collection=<collection_name> data/<old_db_name>/<collection_name>.bson
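
To dump each of the collections recommended in the Collections section in one pass, the mongodump command above can be generated per collection. The sketch below is a hedged example: the function names are hypothetical, and it uses only the mongodump flags shown in this article.

```python
import subprocess

# Collections this article recommends including in a data backup.
BACKUP_COLLECTIONS = (
    "job_history",
    "job_output",
    "jobs",
    "wfe_job_metrics",
    "wfe_task_metrics",
    "ucm_configs",
    "ucm_compliance_reports",
)


def build_dump_commands(db_name, out_dir, collections=BACKUP_COLLECTIONS):
    """Return one mongodump argument list per collection."""
    return [
        ["mongodump", f"--db={db_name}", f"--collection={name}", f"--out={out_dir}"]
        for name in collections
    ]


def run_dump(db_name, out_dir):
    """Run each dump in turn; check=True stops at the first failure."""
    for command in build_dump_commands(db_name, out_dir):
        subprocess.run(command, check=True)
```

Building the argument lists separately from running them makes it easy to print and review the commands before anything touches the database.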

Linux

To edit the current user's crontab and add or change a cron job:

crontab -e

To back up IAG inventory:

sqlite3 iag.sq3 ".backup 'inventory'"

To restore IAG inventory:

sqlite3 iag.sq3 ".restore 'inventory'"
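
The same backup can also be taken from a script. The sketch below uses the online backup API in Python's standard sqlite3 module; the function name is hypothetical and the paths are whatever your IAG installation uses.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(source_path, backup_path):
    """Copy a SQLite database to `backup_path` using the online backup API."""
    # sqlite3.connect would silently create an empty database for a
    # missing source, so fail early instead.
    if not Path(source_path).is_file():
        raise FileNotFoundError(source_path)
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)
    finally:
        target.close()
        source.close()
```

For example, `backup_sqlite("iag.sq3", "inventory")` mirrors the .backup command shown above.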

Itential Pre-Builts

The Archive Job Data pre-built allows IAP users to archive the jobs and tasks collections for the 2022.1 and 2023.1 releases only.

A forthcoming pre-built will enable job archiving on 2023.2 and later versions, but its target release has not yet been determined. Once it is available, this page will be updated to reference the newer pre-built for archiving job data in IAP 2023.2 and later.

