Active-standby architecture

An active/standby architecture (ASA) is an Itential architecture in which every component is redundant, so the deployment can tolerate at least one catastrophic failure, including the loss of the primary data center. It is the recommended architecture for production environments that must meet strict business continuity and uptime requirements. It builds on the HA2 architecture: two HA2 installations run in geographically separate locations and share a larger MongoDB replica set that is also geographically redundant.

Architecture overview

The Itential Platform application performs many reads and writes against the database and is sensitive to high latencies. All active components must run in the same data center. The MongoDB replication process ensures that data written to the primary node in the active data center replicates to the geographically redundant MongoDB nodes in the secondary data center. All components must have authentication enabled.

The minimum ASA architecture is composed of 16 VMs:

Four Itential Platform servers
Five MongoDB servers
Six Redis servers
One IAG server

Figure: Active-standby architecture diagram showing four Platform servers, five MongoDB servers, six Redis servers, and one IAG server across geographically redundant data centers.

Highly available Itential Platform

Itential Platform instances communicate with one another through Redis and share data through MongoDB. To achieve high availability, add an Itential Platform node and point it at the same MongoDB replica set and Redis deployment as the existing nodes. Each node is able to perform work as soon as it is added and configured.

Each Itential Platform server must be configured as follows:

MongoDB connection strings must contain a reference to all members of the replica set
Redis configurations must specify the list of all known Redis Sentinels and their Sentinel username and password (connections to HA Redis occur through Sentinels, not directly to Redis)
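
As a rough illustration of the first requirement, the following Python sketch builds a MongoDB connection string that lists every replica set member and reports any member missing from an existing string. The host names `mongo1` through `mongo5`, the replica set name `rs0`, and both helper functions are assumptions made for the example; they are not part of the Itential Platform.

```python
def build_mongo_url(members, replica_set, port=27017):
    """Return a MongoDB URL that lists every replica set member."""
    if not members:
        raise ValueError("at least one replica set member is required")
    hosts = ",".join(f"{host}:{port}" for host in members)
    return f"mongodb://{hosts}/?replicaSet={replica_set}"


def missing_members(url, members, port=27017):
    """Return the members that do not appear in the URL's host list."""
    host_part = url.split("://", 1)[1].split("/", 1)[0]
    host_part = host_part.rsplit("@", 1)[-1]  # drop "user:password@" if present
    listed = set(host_part.split(","))
    return [m for m in members if f"{m}:{port}" not in listed]


# Hypothetical host names for the five MongoDB servers in this architecture.
MEMBERS = ["mongo1", "mongo2", "mongo3", "mongo4", "mongo5"]

url = build_mongo_url(MEMBERS, "rs0")
print(url)

# A three-host URL would leave two members out of the connection string.
short_url = "mongodb://mongo1:27017,mongo2:27017,mongo3:27017/?replicaSet=rs0"
print(missing_members(short_url, MEMBERS))
```

A check like this could run before a profile is deployed, so that a connection string written for a smaller replica set is caught early.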

Configure standby site servers

Applies to Itential Automation Platform 2023.1 and later.

In an ASA, the secondary data center (standby site) contains Itential Platform servers that you must configure to remain passive until a failover event occurs. This configuration prevents both data centers from processing workloads simultaneously.

Configure all Itential Platform servers in the standby site with the following settings.

Disable task worker and job processing

Disable task worker and job start on all Itential Platform servers in the standby site. This prevents standby servers from claiming workflow tasks or starting jobs, which ensures only the active site processes workloads.

Method 1: Pre-configure in properties.json (recommended)

Set the following properties in the properties.json file before you start the Itential Platform:

processTasksOnStart: false - Prevents the Task Worker from processing tasks on startup
processJobsOnStart: false - Prevents jobs from starting on startup

File location: /opt/pronghorn/current/properties.json

Standby site configuration:

{
  "processTasksOnStart": false,
  "processJobsOnStart": false,
  "pathProps": {
    "sdk_dir": "/opt/pronghorn-applications",
    "encrypted": true
  },
  "id": "StandbyProfile",
  "mongoProps": {
    "credentials": {
      "passwd": "itentialPassword",
      "user": "itentialUser"
    },
    "db": "pronghorn",
    "url": "mongodb://mongo1:27017,mongo2:27017,mongo3:27017/?replicaSet=rs0"
  }
}

Active site configuration (for comparison):

{
  "processTasksOnStart": true,
  "processJobsOnStart": true,
  "pathProps": {
    "sdk_dir": "/opt/pronghorn-applications",
    "encrypted": true
  },
  "id": "ActiveProfile",
  "mongoProps": {
    "credentials": {
      "passwd": "itentialPassword",
      "user": "itentialUser"
    },
    "db": "pronghorn",
    "url": "mongodb://mongo1:27017,mongo2:27017,mongo3:27017/?replicaSet=rs0"
  }
}

Pre-configuring these settings in properties.json is the recommended approach for standby sites.
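
One way to catch a misconfigured standby server before it starts is a small pre-flight check of properties.json. The Python sketch below is only an illustration: the function names are hypothetical, and it assumes that a missing flag should be treated as a problem because the default value is not stated here.

```python
import json

# The two properties named in this section.
STANDBY_FLAGS = ("processTasksOnStart", "processJobsOnStart")


def standby_violations(props):
    """Return the flags that are not explicitly false in parsed properties."""
    return [flag for flag in STANDBY_FLAGS if props.get(flag) is not False]


def check_file(path):
    """Load a properties.json file and return its standby violations."""
    with open(path, encoding="utf-8") as handle:
        return standby_violations(json.load(handle))


# Inline sample instead of a real file: one flag is still enabled.
sample = json.loads('{"processTasksOnStart": false, "processJobsOnStart": true}')
print(standby_violations(sample))
```

On a standby server, `check_file("/opt/pronghorn/current/properties.json")` should return an empty list before the Itential Platform is started.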

Method 2: Use UI toggles

If the Itential Platform is already running, you can disable task worker and job start through the UI.

1. Navigate to Admin Essentials from the Itential Platform home page.
2. Under Operation Execution, toggle off Accept New Jobs and Execute Job Tasks.

Figure: Job and task toggle

Figure: Job and task toggle confirmation

Stop Operations Manager

Stop Operations Manager to prevent the standby site from executing scheduled triggers, API triggers, and event triggers. If Operations Manager runs on both sites, triggers fire twice and cause duplicate workflow executions.

1. Navigate to Admin Essentials > Applications.
2. Locate Operations Manager in the applications list.
3. Click Stop.
4. Verify that the application status shows stopped, which is indicated by the Play icon.

Figure: Operations Manager stopped state

Configure these settings on all Active/Standby deployments running version 2023.1 or later.

Highly available MongoDB

MongoDB clusters operate in a primary/secondary model where data written to the primary replicates to the secondary. If a primary MongoDB node fails, the replica set detects this failure and forces an election for a new primary. During this time the replica set may not accept reads and writes until the new primary is selected, usually after a few seconds. Once finished and a new primary is identified, the Itential Platform application resumes normal operation. Operators do not need to take action during this election.

To keep an odd number of voting members and prevent a split-brain scenario during an election, this architecture requires the MongoDB replica set to span three data centers or regions: two members in the primary region, two in the secondary region, and one in a tertiary region. If either the primary or the secondary region is lost, three of the five voting members remain, which is still a majority. The replica set configuration must use member priorities to influence elections so that the primary MongoDB role shifts to the secondary region in a disaster.

Itential's MongoDB cluster must meet the following requirements:

All replica set members must be defined in the Itential Platform config
Authentication between the replica members must be done with either a shared key or X.509 certificate
The database must have an admin user able to perform any operation
The database must have an “itential” user that is granted the least amount of privileges required by the Itential Platform application (Itential Platform must be configured to use this user account)
The replica set configuration must leverage the priority settings to influence voting as follows:

MongoDB Node	Priority Setting
Primary Region Database 1	10
Primary Region Database 2	10
Secondary Region Database 1	5
Secondary Region Database 2	5
Tertiary Region Database 3	1
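
To make the effect of these priorities concrete, the Python sketch below models the five members from the table and checks, for the loss of each region, whether a voting majority survives and which surviving members have the highest priority. It is a simplified model: the member names are invented, and real MongoDB elections also take replication state into account, so priority influences the outcome rather than dictating it.

```python
# Priorities follow the table above; member names are illustrative only.
MEMBERS = [
    {"name": "primary-db1", "region": "primary", "priority": 10},
    {"name": "primary-db2", "region": "primary", "priority": 10},
    {"name": "secondary-db1", "region": "secondary", "priority": 5},
    {"name": "secondary-db2", "region": "secondary", "priority": 5},
    {"name": "tertiary-db", "region": "tertiary", "priority": 1},
]


def survivors(members, lost_region):
    """Members still reachable after one region is lost."""
    return [m for m in members if m["region"] != lost_region]


def has_majority(members, alive):
    """True when the surviving members are a strict majority of all voters."""
    return len(alive) > len(members) // 2


def preferred_primaries(alive):
    """Surviving members that share the highest priority."""
    top = max(m["priority"] for m in alive)
    return [m["name"] for m in alive if m["priority"] == top]


for region in ("primary", "secondary", "tertiary"):
    alive = survivors(MEMBERS, region)
    print(region, len(alive), has_majority(MEMBERS, alive), preferred_primaries(alive))
```

In this model, losing the primary region leaves three voters, and the two secondary-region members become the preferred candidates, which matches the intent of the priority settings.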


Highly available Redis

Highly available Redis operates in a primary/secondary model where data written to the primary replicates to the secondary. If the primary Redis node fails, the Redis Sentinels detect the failure and elect a new primary from the remaining nodes. During this time Redis may not accept reads and writes until the new primary is selected, usually after a few seconds. Once a new primary is identified, the Itential Platform application resumes normal operation. Operators do not need to take action during this election.

Itential's Redis cluster must meet the following requirements:

All Redis nodes must be defined in the Itential Platform profile configuration
Authentication between the replica members is done with users defined in the Redis config file
Redis must have an admin user able to perform any operation
Redis must have an “itential” user that is granted the least amount of privileges required by the application (Itential Platform must be configured to use this user account)
Redis must have a replication user that is granted the least amount of privileges required by the replication process
Redis Sentinel must be included to monitor the Redis cluster and must be colocated with Redis
Redis Sentinel must have an admin user able to perform any Sentinel task
Redis nodes must maintain a low latency connection between nodes to avoid replication failures

For more information, see the Redis replication documentation.

Required user accounts

The validated designs are opinionated installations of Itential and its dependencies. The following user accounts are required by the dependencies.

MongoDB

Account	Description
`admin`	Has full root access to the `mongo` database. Can read and write to any logical database. Can be used to issue admin commands like forcing an election and configuring replica sets. This is NOT used by the Itential application but is created for admin purposes.
`itential`	Has read and write access to the `"itential"` database only. This is the account used by the Itential Platform application.
`localaaa`	Has read and write access to the `"LocalAAA"` database. This is used by the Local AAA adapter for local, non-LDAP logins.

Redis

Account	Description
`admin`	Has full root access to the Redis database, all channels, all keys, all commands. This is NOT used by the Itential application but is created for admin purposes.
`itential`	Has full access to the Redis database, all channels, all keys, EXCEPT the following commands: `asking`, `cluster`, `readonly`, `readwrite`, `bgrewriteaof`, `bgsave`, `failover`, `flushall`, `flushdb`, `psync`, `replconf`, `replicaof`, `save`, `shutdown`, `sync`. This is the account used by the Itential Platform application.
`repluser`	Has access to the minimum set of commands to perform replication: `psync`, `replconf`, `ping`.
`admin` (Sentinel)	Full root access to Redis Sentinel. This is NOT used by the Itential application but is created for admin purposes of Redis Sentinel.
`sentineluser`	Has access to the minimum set of commands to perform sentinel monitoring: `multi`, `slaveof`, `ping`, `exec`, `subscribe`, `config
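
The `itential` and `repluser` rows above map naturally onto Redis ACL rules. The Python sketch below generates candidate `user` lines for a Redis configuration file from those two command lists. It is a sketch under assumptions: the helper names and the `changeme` password placeholder are invented, and the `&*` channel pattern requires a Redis version that supports Pub/Sub channel ACLs (6.2 or later).

```python
# Commands withheld from the application account, per the table above.
ITENTIAL_DENIED = [
    "asking", "cluster", "readonly", "readwrite", "bgrewriteaof", "bgsave",
    "failover", "flushall", "flushdb", "psync", "replconf", "replicaof",
    "save", "shutdown", "sync",
]

# Commands granted to the replication account, per the table above.
REPL_ALLOWED = ["psync", "replconf", "ping"]


def acl_deny_rule(user, denied, password):
    """All keys, all channels, all commands, minus the denied commands."""
    denies = " ".join(f"-{cmd}" for cmd in denied)
    return f"user {user} on >{password} ~* &* +@all {denies}"


def acl_allow_rule(user, allowed, password):
    """Only the listed commands are permitted."""
    allows = " ".join(f"+{cmd}" for cmd in allowed)
    return f"user {user} on >{password} {allows}"


# "changeme" is a placeholder; never commit real passwords.
print(acl_deny_rule("itential", ITENTIAL_DENIED, "changeme"))
print(acl_allow_rule("repluser", REPL_ALLOWED, "changeme"))
```

Generating the rules from a single list keeps the denied commands consistent across all six Redis servers.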

Network requirements

In an environment where components are installed on more than one host, the following network traffic flows must be allowed. All ports use TCP unless otherwise noted. Not every port needs to be open for every supported architecture, and secure ports are required only when they are explicitly configured.

Source	Destination	Port	Description
Desktop Devices	Itential Platform	3000	Web browser connections to Itential Platform over HTTP
Desktop Devices	Itential Platform	3443	Web browser connections to Itential Platform over HTTPS
Desktop Devices	IAG	8083	Web browser connections to IAG over HTTP
Desktop Devices	IAG	8443	Web browser connections to IAG over HTTPS
Desktop Devices	HashiCorp Vault	8200	Web browser connections to HashiCorp Vault
Itential Platform	MongoDB	27017	Itential Platform connects to MongoDB
Itential Platform	Redis	6379	Itential Platform connects to Redis
Itential Platform	Redis	26379	Itential Platform connects to Redis Sentinel (HA installations only)
Itential Platform	IAG	8083	Itential Platform connects to IAG over HTTP
Itential Platform	IAG	8443	Itential Platform connects to IAG over HTTPS
Itential Platform	HashiCorp Vault	8200	Itential Platform connects to HashiCorp Vault
Itential Platform	LDAP	389	Itential Platform connects to LDAP (when LDAP adapter is used for authentication)
Itential Platform	LDAP	636	Itential Platform connects to LDAP with TLS (when LDAP adapter is used for authentication)
Itential Platform	RADIUS	1812	Itential Platform connects to RADIUS (when RADIUS adapter is used for authentication; uses UDP)
MongoDB	MongoDB	27017	Each MongoDB talks to other MongoDBs for replication (HA installations only)
Redis	Redis	6379	Each Redis talks to other Redis sources for replication (HA installations only)
Redis	Redis	26379	Each Redis uses Redis Sentinel to monitor the Redis processes (HA installations only)
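
Before installation, it can help to confirm from each source host that the TCP flows in the table are actually reachable. The Python sketch below is one minimal way to do that with the standard library. The function names are hypothetical, and the check covers TCP only, so the UDP flow to RADIUS on port 1812 needs a different test.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(flows, timeout=2.0):
    """Check (label, host, port) tuples and return {label: reachable}."""
    return {label: port_open(host, port, timeout) for label, host, port in flows}
```

Run from an Itential Platform server, a call such as `report([("MongoDB", "mongo1", 27017), ("Redis Sentinel", "redis1", 26379)])` returns a dictionary of reachable flows; the host names here are placeholders for your own servers.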

Hardware requirements

Processor

Processor specification requirements:

Second generation or better Intel Xeon Platinum 8000 series processors
Third generation or better AMD EPYC 7000 series processors

Memory

Memory specification requirement:

DDR5 DRAM 3200 MHz or higher

Storage

Storage performance requirements in IOPS (16 KiB):

20000+ IOPS
Non-spinning media (SSD, NVMe)
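
For a sense of scale, the IOPS figure can be converted into sustained throughput. The short Python calculation below assumes every operation transfers a full 16 KiB block; the function name is illustrative.

```python
def throughput_mib_per_s(iops, block_kib):
    """Throughput in MiB/s when every I/O moves one block of block_kib KiB."""
    return iops * block_kib / 1024


# 20000 IOPS at 16 KiB per operation
print(throughput_mib_per_s(20000, 16))  # 312.5
```

So the minimum requirement corresponds to roughly 312.5 MiB/s of sustained 16 KiB I/O.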

Network

Network speed requirement:

10 Gbps or higher

In some instances, it may be worth adding dedicated network interfaces that carry specific traffic to specific external systems. This routing is configured at the OS level (custom interfaces and routes) and must be managed by the system administrator. An example is separating NSO traffic from traffic destined for Redis and MongoDB.

Hypervisor/host OS settings

These settings are strongly recommended for high load applications of Itential Platform:

CPU affinity settings or similar functionality to prevent CPU starvation
Full memory reservation
One physical CPU per VM is preferred
Huge pages for memory support enabled (except MongoDB)
Memory compression disabled
Minimal CPU allocation settings for scheduler according to CPU clock

Example: Assuming a 16 vCPU Itential Platform VM on a server with a nominal clock speed of 2.5 GHz:

CPU clock reservation = 16 vCPU × 2.5 GHz = 40 GHz

Follow hypervisor recommendations when performing CPU reservations. In most cases the total of all CPU reservations for all VMs on a host cannot be more than 90% of the host capacity as 10% is reserved by the host itself.
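
The reservation arithmetic and the 90% guideline can be checked with a few lines of Python. This is a sketch only: the function names, the two-VM scenario, and the 32-core and 48-core host sizes are assumptions chosen for illustration, and your hypervisor's own guidance takes precedence.

```python
def cpu_reservation_ghz(vcpus, clock_ghz):
    """Clock reservation for one VM: vCPU count times nominal clock speed."""
    return vcpus * clock_ghz


def fits_on_host(reservations_ghz, host_cores, clock_ghz, usable_fraction=0.90):
    """True when total reservations stay within the usable host capacity."""
    capacity = host_cores * clock_ghz * usable_fraction
    return sum(reservations_ghz) <= capacity


platform_vm = cpu_reservation_ghz(16, 2.5)
print(platform_vm)  # 40.0

# Two such VMs need 80 GHz; a 32-core 2.5 GHz host offers 72 GHz at 90%.
print(fits_on_host([platform_vm, platform_vm], host_cores=32, clock_ghz=2.5))  # False
print(fits_on_host([platform_vm, platform_vm], host_cores=48, clock_ghz=2.5))  # True
```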

MongoDB discourages the utilization of Transparent Huge Pages.

Server specifications

For production environments, all Itential Platform components should be installed on their own individual servers to properly support High Availability (HA). Disk references to pronghorn (seen in older deployments) should be changed to itential.

Itential Platform server

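Spec	Requirement	Production ENV
CPU	64-bit x86 CPU cores	16
OS	RHEL	8/9
OS	Rocky	8/9
RAM	DDR5 DRAM 3200 MHz	64 GB
Disk (Solid State Media, SSD, NVMe)	Total	250 GB
Disk (Solid State Media, SSD, NVMe)	`/var/log/itential`	100 GB
Disk (Solid State Media, SSD, NVMe)	`/opt/itential`	100 GB
Disk (Solid State Media, SSD, NVMe)	`/`	50 GB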

MongoDB server

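Spec	Requirement	Production ENV
CPU	64-bit x86 CPU cores	16
OS	RHEL	8/9
OS	Rocky	8/9
RAM	DDR5 DRAM 3200 MHz	128 GB
Disk (Solid State Media, SSD, NVMe)	Total	1000 GB
Disk (Solid State Media, SSD, NVMe)	`/var/log/mongodb`	100 GB
Disk (Solid State Media, SSD, NVMe)	`/var/lib/mongo`	850 GB
Disk (Solid State Media, SSD, NVMe)	`/`	50 GB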

Redis server

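Spec	Requirement	Production ENV
CPU	64-bit x86 CPU cores	8
OS	RHEL	8/9
OS	Rocky	8/9
RAM	DDR5 DRAM 3200 MHz	32 GB
Disk (Solid State Media, SSD, NVMe)	Total	100 GB
Disk (Solid State Media, SSD, NVMe)	`/var/log/redis`	10 GB
Disk (Solid State Media, SSD, NVMe)	`/var/lib/redis`	50 GB
Disk (Solid State Media, SSD, NVMe)	`/`	40 GB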

IAG server

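Spec	Requirement	Production ENV
CPU	64-bit x86 CPU cores	16
OS	RHEL	8/9
OS	Rocky	8/9
RAM	DDR5 DRAM 3200 MHz	32 GB
Disk (Solid State Media, SSD, NVMe)	Total	80 GB
Disk (Solid State Media, SSD, NVMe)	`/var/log/automation-gateway`	10 GB
Disk (Solid State Media, SSD, NVMe)	`/var/lib/automation-gateway`	50 GB
Disk (Solid State Media, SSD, NVMe)	`/opt/automation-gateway`	10 GB
Disk (Solid State Media, SSD, NVMe)	`/`	10 GB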