Archive and purge data
Retention policies for archiving, backing up, and purging data should include your Itential Platform and Itential Automation Gateway (IAG) assets. This page covers best practices and suggested methods for on-premises Itential Platform and IAG installations.
Itential Platform is not a system of record. Itential strongly recommends specifying a retention policy that suits your business needs.
Itential Platform
Itential Platform uses MongoDB to store application and instance data. Itential recommends performing regular backups of the MongoDB database and storing backup data in a separate location for safekeeping and recovery. The specific Itential Platform .bin installation file should also be retained to support re-installation to exact specifications if restoration is necessary.
Itential Automation Gateway
IAG maintains host files, scripts, playbooks, and custom modules on a Linux file system. Back up those file locations periodically using cron jobs.
Data storage size
When planning data preservation and backups:
- Check your configuration and default directory size. The maximum storage size for backup data is often limited.
- Avoid archiving long-running workflows.
- Make large job documents modular. MongoDB limits a single BSON document to 16MB, and anything larger must be stored through GridFS.
Log rotation
Gather log data from a rolling log. As a recommended default, configure a rolling log distributed across 99 files over time (this number can be adjusted lower). Purge rolled log files when the log directory reaches a particular size or when the active log file reaches its file size threshold.
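The size-based purge described above can be sketched in Python. This is a minimal illustration, not an Itential Platform feature: the function name `purge_rolled_logs` and the active log file name `platform.log` are hypothetical, and "oldest" is assumed to mean oldest modification time.

```python
from pathlib import Path


def purge_rolled_logs(log_dir, max_total_bytes, active_name="platform.log"):
    """Delete the oldest rolled log files until the directory fits the budget.

    `active_name` is a hypothetical name for the log currently being
    written; that file is never deleted.
    """
    directory = Path(log_dir)
    files = [p for p in directory.iterdir() if p.is_file()]
    total = sum(p.stat().st_size for p in files)

    # Candidates for deletion, oldest first by modification time.
    rolled = sorted(
        (p for p in files if p.name != active_name),
        key=lambda p: p.stat().st_mtime,
    )

    removed = []
    for path in rolled:
        if total <= max_total_bytes:
            break
        total -= path.stat().st_size
        path.unlink()
        removed.append(path.name)
    return removed
```

A function like this could be called from a scheduled job, with `max_total_bytes` set to whatever limit your retention policy defines for the log directory.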
Archiving and backup checklist
Collections
Itential recommends including the following MongoDB collections in data backups.
Additional collections
Lifecycle Manager (LCM) collections can also be archived and purged, though the cadence depends on policy and business need. LCM data may be relevant for several years in some organizations, while others may only need to retain data from the past 12 months. Because LCM collections contain integral state data and resource models for network configuration, Itential strongly recommends defining which LCM collections to store, where to store them, and for how long.
Archive jobs (2023.2)
Job data collections
In the 2023.2 release, all job variable data — including incoming and outgoing task data — has been moved out of the job and task collections.
- Data less than 16MB is stored in the job_data collection.
- Data greater than 16MB is stored in the GridFS bucket, which uses the job_data.chunks and job_data.files collections.
Job ID required for archiving
When archiving a job in 2023.2:
- The job_id is required to retrieve all job_data documents for archiving.
- If job_id is not provided, the overwhelming majority of job-related data will remain in the database.
job_data collection
All data in the job_data collection can be queried by job_id:
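A minimal sketch of such a query in Python, assuming the identifier is stored in a top-level field literally named `job_id` and that `job_data_collection` behaves like a pymongo `Collection` (for example `db["job_data"]`). The function names are hypothetical, and whether the value is a string or an ObjectId depends on your data.

```python
def fetch_job_data(job_data_collection, job_id):
    """Return every job_data document that belongs to one job."""
    # Assumption: documents carry the identifier in a field named "job_id".
    return list(job_data_collection.find({"job_id": job_id}))


def purge_job_data(job_data_collection, job_id):
    """Delete the job's documents and return how many were removed.

    Intended to run only after the archived copy has been verified.
    """
    result = job_data_collection.delete_many({"job_id": job_id})
    return result.deleted_count
```

Keeping the fetch and the purge as separate steps leaves room to write the fetched documents to your archive location and confirm the copy before anything is deleted.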
GridFS bucket
For GridFS, data is split between the job_data.chunks and job_data.files collections. The job_data.files collection contains metadata and all queryable information. The job_id is located in metadata.job for GridFS documents.
Use the MongoDB driver and consult the GridFS documentation for guidance on querying and deleting files.
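As one possible approach with the Python driver, the sketch below looks up a job's GridFS files through `metadata.job`, reads each payload, and optionally deletes the file. It assumes `bucket` behaves like `gridfs.GridFSBucket`, created for instance as `gridfs.GridFSBucket(db, bucket_name="job_data")`. The function name is hypothetical, and reading whole payloads into memory is a simplification, since large files would normally be streamed in pieces.

```python
def archive_gridfs_job_files(bucket, job_id, delete_after=False):
    """Collect (file_id, payload) pairs for one job's GridFS files."""
    archived = []
    # metadata.job holds the job_id for GridFS documents.
    # Materialise the cursor first so deletes do not disturb iteration.
    for grid_out in list(bucket.find({"metadata.job": job_id})):
        file_id = grid_out._id
        payload = bucket.open_download_stream(file_id).read()
        archived.append((file_id, payload))
        if delete_after:
            # Removes the files document together with its chunks.
            bucket.delete(file_id)
    return archived
```

Deleting through the bucket rather than directly from job_data.files avoids leaving orphaned documents behind in job_data.chunks.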
Example job_data.files document:
Sample commands
MongoDB
Run mongodump:
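One way to script the dump is to build the `mongodump` argument list in Python and hand it to `subprocess`. This is a sketch under assumptions: the helper names are hypothetical, and the connection string, output directory, and database name are placeholders you would supply. The flags used (`--uri`, `--out`, `--db`, `--collection`, `--gzip`) are standard `mongodump` options.

```python
import subprocess


def build_mongodump_cmd(uri, out_dir, db=None, collection=None, gzip=True):
    """Build a mongodump argument list without running it."""
    if collection and not db:
        raise ValueError("--collection requires --db")
    cmd = ["mongodump", f"--uri={uri}", f"--out={out_dir}"]
    if db:
        # Pass db only when the URI itself does not name a database.
        cmd.append(f"--db={db}")
    if collection:
        cmd.append(f"--collection={collection}")
    if gzip:
        cmd.append("--gzip")
    return cmd


def run_mongodump(**kwargs):
    """Run mongodump and raise CalledProcessError on a non-zero exit."""
    subprocess.run(build_mongodump_cmd(**kwargs), check=True)
```

Separating the builder from the runner makes it easy to log or review the exact command before it touches the database.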
Run mongorestore:
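The restore can be scripted the same way. The sketch below assembles a `mongorestore` argument list. The helper names are hypothetical, and the connection string, dump directory, and namespace pattern are placeholders. `--nsInclude`, `--gzip`, and `--drop` are standard `mongorestore` options, and the dump directory is passed as the final positional argument.

```python
import subprocess


def build_mongorestore_cmd(uri, dump_dir, ns_include=None, gzip=True, drop=False):
    """Build a mongorestore argument list without running it."""
    cmd = ["mongorestore", f"--uri={uri}"]
    if ns_include:
        # Restrict the restore to matching namespaces, e.g. "mydb.job_data".
        cmd.append(f"--nsInclude={ns_include}")
    if gzip:
        # Must match how the dump was written.
        cmd.append("--gzip")
    if drop:
        # Drops each target collection before restoring it.
        cmd.append("--drop")
    cmd.append(dump_dir)
    return cmd


def run_mongorestore(**kwargs):
    """Run mongorestore and raise CalledProcessError on a non-zero exit."""
    subprocess.run(build_mongorestore_cmd(**kwargs), check=True)
```

`drop` defaults to `False` here because dropping collections is destructive; enable it only when the restore is meant to replace existing data.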
Linux
Edit a cron job:
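On most Linux systems, the current user's crontab is opened for editing with `crontab -e`. To illustrate what an entry looks like, the small Python helper below formats one crontab line from its five schedule fields and a command. The helper name and the script path are hypothetical.

```python
def cron_line(minute, hour, command, day_of_month="*", month="*", day_of_week="*"):
    """Format one crontab entry: five schedule fields, then the command."""
    fields = [str(minute), str(hour), str(day_of_month), str(month), str(day_of_week)]
    return " ".join(fields + [command])


# Hypothetical nightly backup at 02:30; the script path is a placeholder.
print(cron_line(30, 2, "/usr/local/bin/iag_backup.sh"))
```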
Back up IAG inventory:
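One possible way to capture file-based IAG assets is to bundle the relevant directories into a timestamped tarball. The sketch below uses only the Python standard library. The function name, the archive prefix, and any paths you pass in are assumptions, since the actual locations depend on your IAG installation.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def backup_paths(source_paths, dest_dir, prefix="iag-backup"):
    """Write the given files and directories into one .tar.gz archive."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = dest / f"{prefix}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        for src in map(Path, source_paths):
            # Store each entry under its base name so the archive does not
            # depend on the absolute source location. Sources that share a
            # base name would collide and need distinct arcnames.
            tar.add(str(src), arcname=src.name)
    return archive
```

Writing the archive to a separate volume or host keeps the backup available if the original file system is lost.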
Restore IAG inventory:
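If the backup was taken as a gzip-compressed tarball, a restore might look like the following standard-library sketch. The function name and both paths are hypothetical. Where the running Python version supports it, the `data` extraction filter is used to refuse archive members that would land outside the target directory.

```python
import tarfile
from pathlib import Path


def restore_archive(archive_path, target_dir):
    """Extract a .tar.gz backup into target_dir and return the member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive_path), "r:gz") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Present on recent Python releases; blocks unsafe members.
            tar.extractall(str(target), filter="data")
        else:
            tar.extractall(str(target))
    return names
```

Restoring into a scratch directory first, then comparing against the live files, lowers the risk of overwriting newer content by mistake.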
Itential pre-builts
The Archive Job Data pre-built allows Itential Platform users to archive the jobs and tasks collections for the 2023.1 and 2022.1 releases.
A forthcoming pre-built will enable job archiving on 2023.2 and later versions. Once available, this page will be updated with the newer pre-built.