- 15 Nov 2023
Manual Failover
- Updated on 15 Nov 2023
Manual Failover Procedure in a Disaster Recovery Scenario
Itential Automation Platform (IAP) is designed so that multiple IAP servers can run in a clustered, High Availability (HA) fashion when deployed within the same datacenter. When further assurance is needed, a second, disaster recovery system can be deployed in a second datacenter, where it stands ready to assume processing duties.
This document provides the steps required to perform a manual failover from the Primary Active site to the Standby site when an IAP cluster is deployed in the Large: High Availability (HA) & Disaster Recovery (Active/Standby) architecture (also known as HA-6). At the end of this procedure, the Standby system becomes the active, processing system.
For component information along with architectural sizing for High Availability in IAP, refer to the High Availability Architecture guide.
Active and Standby Servers
It is assumed that the IAP server in Standby mode was previously configured and started with the correct setting for the processTasksOnStart property in the properties.json file. As a recommendation, make sure your Active and Standby servers are configured differently with respect to default startup task processing behavior by verifying the value of the processTasksOnStart property in each running server's properties.json file.
Server | Property | Expected Value |
---|---|---|
Standby | processTasksOnStart | false |
Active | processTasksOnStart | true (Default value) |
Example properties.json
The following is an example properties.json file with processTasksOnStart configured for Standby mode. The properties.json file is normally located in the /opt/pronghorn/current directory.
{
  "processTasksOnStart": false,
  "pathProps": {
    "description": "File Path Variables",
    "sdk_dir": "/opt/pronghorn-applications",
    "encrypted": true
  },
  "id": "Profile2",
  "mongoProps": {
    "credentials": {
      "passwd": "itentialPassword",
      "user": "itentialUser"
    },
    "db": "pronghorn",
    "url": "mongodb://127.0.0.1:27017"
  }
}
Conversely, processTasksOnStart is set to true in the properties.json file on the server that runs in Active mode.
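The check described above can be scripted. The following is a minimal sketch, assuming each server's properties.json can be read locally; the function names and the "active"/"standby" role labels are hypothetical and not part of IAP. It treats a missing processTasksOnStart as true, following the default listed in the table.

```python
import json
from pathlib import Path

# Expected values from the table above. The role labels are chosen for this
# sketch only; they are not IAP identifiers.
EXPECTED = {"active": True, "standby": False}


def check_process_tasks_on_start(properties: dict, role: str) -> bool:
    """Return True when processTasksOnStart matches the expected value for the role."""
    # The table lists true as the default, so an absent key is read as true.
    actual = properties.get("processTasksOnStart", True)
    return actual is EXPECTED[role]


def check_file(path: str, role: str) -> bool:
    """Load a properties.json file and run the check for the given role."""
    properties = json.loads(Path(path).read_text())
    return check_process_tasks_on_start(properties, role)
```

For example, `check_file("/opt/pronghorn/current/properties.json", "standby")` could be run on the Standby server before a failover is ever needed, so that a misconfigured server is caught early.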
How to Enable/Disable Task Execution
To manually fail over from the IAP Active node to the Standby node, disable TaskWorker on the Active node from the IAP UI, and then enable it on the Standby node. Note that in a disaster recovery (DR) situation, disabling TaskWorker on the Active node as the first step may not be possible.
Enabling or disabling Task Execution (on the corresponding nodes) can be performed from the IAP UI in either of two ways.
- Navigate to Admin Essentials from the IAP home page.
- Click the Pause Task Execution button under "Running". A message banner appears indicating that the Automation (Workflow) Engine was suspended.
Figure 1: Admin Essentials
- Alternatively, click the Current Operations link to open the console for Active Jobs and Running Tasks.
- Use the Suspend Workflow Engine toggle switch (upper-right): slide it to the right to disable the Workflow Engine on the Active node, and use the same switch to enable the Workflow Engine (normally on the Standby node).
Figure 2: Current Operations
- Verify that the following message banner and success notification appear when the Workflow Engine is disabled (normally on the IAP Active node in a manual failover scenario).
Figure 3: Task Execution Suspended
- Verify that the message banner disappears from the node when TaskWorker is enabled (normally on the IAP Standby node in a manual failover scenario). The success notification indicates that the Automation (Workflow) Engine was enabled (activated).
Figure 4: Automation (Workflow) Engine Activated
- Isolate the Active node from the network until it is ready to return to its original Active role. Isolation keeps a restarted Active node from reaching the rest of the network while TaskWorker is enabled on the Standby node.
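The order of operations in the steps above can be summarized in a small model. This is only an illustrative sketch: the `Node` class, its fields, and `manual_failover` are hypothetical names that model the procedure in memory; they do not call IAP, and the real actions are performed in the IAP UI as described.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """In-memory stand-in for an IAP node (hypothetical, for illustration only)."""
    name: str
    task_execution_enabled: bool
    reachable: bool = True
    isolated: bool = False


def manual_failover(active: Node, standby: Node) -> List[str]:
    """Apply the failover order to the model and return a log of the steps taken."""
    log = []
    if active.reachable:
        # Normal case: suspend task execution on the Active node first.
        active.task_execution_enabled = False
        log.append(f"suspended task execution on {active.name}")
    else:
        # DR case: the Active node cannot be reached, so this step is skipped.
        log.append(f"skipped suspend, {active.name} is unreachable")
    # Enable task execution on the Standby node.
    standby.task_execution_enabled = True
    log.append(f"enabled task execution on {standby.name}")
    # Keep the old Active node off the network until it resumes its role.
    active.isolated = True
    log.append(f"isolated {active.name} from the network")
    return log
```

The model makes one design point explicit: isolating the old Active node is done in both the normal and the DR case, because in the DR case its task execution state is unknown.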
Task Execution Service Config
In some scenarios, you may require a setting that will disable TaskWorker under the services_configs.
The startup property for TaskWorker should be activate: true.
Below is an example of the parameters that may need to be added to the TaskWorker configuration under the Applications collection in Admin Essentials. You will need two different profiles, one for Active and one for Standby (DR). The Active IAP profile uses TaskWorker-ACTIVE, while the Standby IAP profile uses TaskWorker-STBY.
TaskWorker-Active
{
  "loggerProps": {
    "description": "Logging",
    "log_max_files": 100,
    "log_max_file_size": 1048576,
    "log_level": "warn",
    "log_directory": "/var/log/pronghorn",
    "log_filename": "TaskWorker.log",
    "console_level": "warn"
  },
  "isEncrypted": true,
  "model": "@itential/app-task_worker",
  "name": "TaskWorker-ACTIVE",
  "type": "Application",
  "properties": {
    "activate": true
  },
  "rabbitmq": {
    "protocol": "amqp",
    "port": 5672,
    "username": "guest",
    "password": "guest",
    "locale": "en_US",
    "frameMax": 0,
    "heartbeat": 0,
    "vhost": "/",
    "certPath": "",
    "keyPath": "",
    "passphrase": "guest",
    "caPath": "",
    "hosts": [
      "localhost"
    ]
  }
}
TaskWorker-Standby (DR)
{
  "loggerProps": {
    "description": "Logging",
    "log_max_files": 100,
    "log_max_file_size": 1048576,
    "log_level": "warn",
    "log_directory": "/var/log/pronghorn",
    "log_filename": "TaskWorker.log",
    "console_level": "warn"
  },
  "isEncrypted": true,
  "model": "@itential/app-task_worker",
  "name": "TaskWorker-STBY",
  "type": "Application",
  "properties": {
    "activate": false
  },
  "rabbitmq": {
    "protocol": "amqp",
    "port": 5672,
    "username": "guest",
    "password": "guest",
    "locale": "en_US",
    "frameMax": 0,
    "heartbeat": 0,
    "vhost": "/",
    "certPath": "",
    "keyPath": "",
    "passphrase": "guest",
    "caPath": "",
    "hosts": [
      "localhost"
    ]
  }
}
Figure 5: Workflow Engine Service Config
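In the two example profiles, the only values that differ are name and properties.activate. A quick way to confirm that a pair of profiles has not drifted apart elsewhere is to diff them. The sketch below assumes both configurations have been exported and loaded as Python dictionaries; `differing_paths` and `profiles_consistent` are hypothetical helper names, not part of IAP.

```python
_MISSING = object()  # marks a key that is present in only one of the two dicts

# Paths that are expected to differ between the Active and Standby profiles.
ALLOWED_DIFFERENCES = {"name", "properties.activate"}


def differing_paths(a: dict, b: dict, prefix: str = "") -> list:
    """Return the dotted key paths, in sorted key order, at which two nested dicts differ."""
    paths = []
    for key in sorted(set(a) | set(b)):
        left = a.get(key, _MISSING)
        right = b.get(key, _MISSING)
        if isinstance(left, dict) and isinstance(right, dict):
            # Recurse into nested objects such as "properties" or "rabbitmq".
            paths.extend(differing_paths(left, right, f"{prefix}{key}."))
        elif left != right:
            paths.append(f"{prefix}{key}")
    return paths


def profiles_consistent(active: dict, standby: dict) -> bool:
    """True when the profiles differ only where expected and activate is set correctly."""
    only_expected = set(differing_paths(active, standby)) <= ALLOWED_DIFFERENCES
    active_on = active.get("properties", {}).get("activate") is True
    standby_off = standby.get("properties", {}).get("activate") is False
    return only_expected and active_on and standby_off
```

Any path reported outside name and properties.activate, for example a different rabbitmq.port, points to a setting that should be reviewed before relying on the Standby profile.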