Active-standby architecture
An active/standby architecture (ASA) is an Itential architecture in which all components are redundant and can gracefully tolerate at least one catastrophic failure while providing redundancy for the primary data center. It is the recommended architecture for production environments that must adhere to strict business continuity and uptime demands. It builds on the HA2 architecture, essentially combining two HA2 installations in geographically redundant locations with a larger MongoDB replica set that is also geographically redundant.
Architecture overview
The Itential Platform application performs many reads and writes against the database and is sensitive to high latencies. All active components must run in the same data center. The MongoDB replication process ensures that data written to the primary node in the active data center replicates to the geographically redundant MongoDB nodes in the secondary data center. All components must have authentication enabled.
The minimum ASA architecture is composed of 16 VMs:
- Four Itential Platform servers
- Five MongoDB servers
- Six Redis servers
- One IAG server
Highly available Itential Platform
Itential Platform instances communicate with one another through Redis and share data via MongoDB. Adding a new Itential Platform node and pointing it at the correct MongoDB and Redis instances is sufficient to achieve high availability. As nodes are added and configured, they begin performing work.
Itential Platforms must have the following configurations:
- MongoDB connection strings must contain a reference to all members of the replica set
- Redis configurations must specify the list of all known Redis Sentinels and their Sentinel username and password (connections to HA Redis occur through Sentinels, not directly to Redis)
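As an illustrative sketch of these two requirements (the hostnames, ports, and exact property key names below are assumptions; consult the property reference for your release), a configuration fragment might look like:

```json
{
  "mongo": {
    "url": "mongodb://dc1-mongo1:27017,dc1-mongo2:27017,dc2-mongo1:27017,dc2-mongo2:27017,dc3-mongo1:27017/itential?replicaSet=rs0"
  },
  "redis": {
    "name": "mymaster",
    "sentinels": [
      { "host": "dc1-sentinel1", "port": 26379 },
      { "host": "dc1-sentinel2", "port": 26379 },
      { "host": "dc1-sentinel3", "port": 26379 }
    ],
    "sentinelUsername": "sentineladmin",
    "sentinelPassword": "<sentinel-password>"
  }
}
```

Note that the MongoDB connection string lists every member of the replica set, and the Redis section points at the Sentinels rather than at the Redis nodes directly.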
Configure standby site servers
Applies to Itential Automation Platform 2023.1 and later.
In an ASA, the secondary data center (standby site) contains Itential Platform servers that you must configure to remain passive until a failover event occurs. This configuration prevents both data centers from processing workloads simultaneously.
Configure all Itential Platform servers in the standby site with the following settings.
Disable task worker and job processing
Disable task worker and job start on all Itential Platform servers in the standby site. This prevents standby servers from claiming workflow tasks or starting jobs, which ensures only the active site processes workloads.
Method 1: Pre-configure in properties.json (recommended)
Set the following properties in the properties.json file before you start the Itential Platform:
- processTasksOnStart: false - Prevents the Task Worker from processing tasks on startup
- processJobsOnStart: false - Prevents jobs from starting on startup
File location: /opt/pronghorn/current/properties.json
Standby site configuration:
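A minimal properties.json fragment for the standby site (other properties omitted; place these keys wherever your installation defines them) sets both flags to false:

```json
{
  "processTasksOnStart": false,
  "processJobsOnStart": false
}
```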
Active site configuration (for comparison):
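By contrast, the active site leaves both flags set to true so that its servers claim tasks and start jobs normally:

```json
{
  "processTasksOnStart": true,
  "processJobsOnStart": true
}
```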
Pre-configuring these settings in properties.json is the recommended approach for standby sites.
Method 2: Use UI toggles
If the Itential Platform is already running, you can disable task worker and job start through the UI.
Stop Operations Manager
Stop Operations Manager to prevent the standby site from executing scheduled triggers, API triggers, and event triggers. If Operations Manager runs on both sites, triggers fire twice and cause duplicate workflow executions.
Configure these settings on all Active/Standby deployments running version 2023.1 or later.
Highly available MongoDB
MongoDB clusters operate in a primary/secondary model where data written to the primary replicates to the secondary. If a primary MongoDB node fails, the replica set detects this failure and forces an election for a new primary. During this time the replica set may not accept reads and writes until the new primary is selected, usually after a few seconds. Once finished and a new primary is identified, the Itential Platform application resumes normal operation. Operators do not need to take action during this election.
To preserve an odd number of voting members and prevent a split-brain scenario during an election, this architecture requires the MongoDB cluster to be split across three data centers or regions: two in the primary region, two in the secondary region, and one in a tertiary region. If either main region is lost, three voting members remain and the replica set retains quorum. The replica set configuration must use member priorities to influence voting so that the primary MongoDB shifts to the secondary region in the case of a disaster.
Itential’s MongoDB cluster must meet the following requirements:
- All replica set members must be defined in the Itential Platform config
- Authentication between the replica members must be done with either a shared key or X.509 certificate
- The database must have an admin user able to perform any operation
- The database must have an “itential” user that is granted the least amount of privileges required by the Itential Platform application (Itential Platform must be configured to use this user account)
- The replica set configuration must leverage the priority settings to influence voting as follows:
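As a sketch of such a priority scheme (hostnames and priority values are illustrative, not Itential-prescribed), a five-member replica set spanning the three regions could be initiated from mongosh so that the highest-priority remaining members sit in the secondary region if the primary region is lost:

```javascript
// Illustrative rs.initiate() from mongosh; hostnames and priorities are examples only.
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "dc1-mongo1:27017", priority: 10 }, // primary region
    { _id: 1, host: "dc1-mongo2:27017", priority: 9 },
    { _id: 2, host: "dc2-mongo1:27017", priority: 5 },  // secondary region
    { _id: 3, host: "dc2-mongo2:27017", priority: 4 },
    { _id: 4, host: "dc3-mongo1:27017", priority: 1 }   // tertiary tie-breaker
  ]
});
```

Higher-priority members are preferred in elections, so losing the primary region causes the election to settle on a secondary-region member.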
Highly available Redis
Redis clusters operate in a primary/secondary model where data written to the primary replicates to the secondary. If a primary Redis node fails, the replica set detects this failure via Redis Sentinels and forces an election for a new primary. During this time the replica set may not accept reads and writes until the new primary is selected, usually after a few seconds. Once finished and a new primary is identified, the Itential Platform application resumes normal operation. Operators do not need to take action during this election.
Itential’s Redis cluster must meet the following requirements:
- All Redis nodes must be defined in the Itential Platform profile configuration
- Authentication between the replica members is done with users defined in the Redis config file
- Redis must have an admin user able to perform any operation
- Redis must have an “itential” user that is granted the least amount of privileges required by the application (Itential Platform must be configured to use this user account)
- Redis must have a replication user that is granted the least amount of privileges required by the replication process
- Redis Sentinel must be included to monitor the Redis cluster and must be colocated with Redis
- Redis Sentinel must have an admin user able to perform any Sentinel task
- Redis nodes must maintain a low latency connection between nodes to avoid replication failures
For more information, see Redis Replication documentation.
Required user accounts
The validated designs are opinionated installations of Itential and its dependencies. The following user accounts are required by the dependencies.
MongoDB
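As an illustrative mongosh sketch of the two required accounts (role choices and password placeholders are assumptions; grant the itential user only the privileges your deployment actually needs):

```javascript
// Run from mongosh against the admin database; passwords are placeholders.
use admin
db.createUser({
  user: "admin",
  pwd: "<admin-password>",
  roles: [ { role: "root", db: "admin" } ]  // may perform any operation
});
db.createUser({
  user: "itential",
  pwd: "<itential-password>",
  // Least privilege: read/write only on the database the platform uses.
  roles: [ { role: "readWrite", db: "itential" } ]
});
```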
Redis
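As an illustrative redis.conf ACL sketch of the three required accounts (command categories and key patterns are assumptions, not Itential-prescribed values):

```conf
# Illustrative ACL entries; tighten the itential user's command
# categories and key patterns to your deployment's actual needs.
user admin on ><admin-password> ~* &* +@all
user itential on ><itential-password> ~* +@all -@admin -@dangerous
user replication on ><repl-password> +psync +replconf +ping
```

The replication user is limited to the commands the replication handshake requires (PSYNC, REPLCONF, PING).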
Network requirements
In an environment where components are installed on more than one host, the following network traffic flows need to be allowed. All ports and networking specs are TCP protocol unless otherwise noted. Not all ports will need to be open for every supported architecture. Secure ports are only required when explicitly configured.
Hardware requirements
Processor
Processor specification requirements:
- Second generation or better Intel Xeon Platinum 8000 series processors
- Third generation or better AMD EPYC 7000 series processors
Memory
Memory specification requirement:
- DDR5 DRAM 3200 MHz or higher
Storage
Storage performance requirements in IOPS (16 kiB):
- 20000+ IOPS
- Non-spinning media (SSD, NVMe)
Network
Network speed requirement:
- 10 Gbps or higher
In some instances, you can add dedicated interfaces to route specific traffic to specific external systems. This routing is configured at the OS level (custom interfaces and routes) and must be managed by the system administrator. An example is separating NSO traffic from Redis/MongoDB-bound traffic.
Hypervisor/host OS settings
These settings are strongly recommended for high load applications of Itential Platform:
- CPU affinity settings or similar functionality to prevent CPU starvation
- Full memory reservation
- One physical CPU per VM is preferred
- Huge pages for memory support enabled (except MongoDB)
- Memory compression disabled
- Minimal CPU allocation settings for scheduler according to CPU clock
Example: Assuming an Itential Platform VM on a server capable of 2.5GHz nominal speed:
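As a worked illustration (the vCPU count here is hypothetical), a VM allocated 4 vCPUs on 2.5 GHz cores would receive a minimum scheduler reservation of:

```text
4 vCPUs × 2500 MHz = 10000 MHz reserved for the VM
```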
Follow hypervisor recommendations when performing CPU reservations. In most cases the total of all CPU reservations for all VMs on a host cannot be more than 90% of the host capacity as 10% is reserved by the host itself.
MongoDB discourages the use of Transparent Huge Pages.
Server specifications
For production environments, all Itential Platform components should be installed on their own individual servers to properly support High Availability (HA). Disk references to pronghorn (seen in older deployments) should be changed to itential.