This white paper explores the technologies that provide the foundation and capability for High Availability (HA) in the Itential Automation Platform (IAP) and discusses multiple deployment scenarios. Business critical applications have always required continuous availability. As more organizations launch services online for consumption by global users, availability and scalability across distributed geographic regions become increasingly important considerations in systems design. There are three principal reasons for the geographic distribution of databases across multiple data:
Continuous Availability - Whether the database is deployed on-premise or in a public cloud, the business needs assurance that the service will survive a regional disaster that causes a complete data center outage. Examples include fires, floods or hurricanes. Gartner estimates downtime costs a business an average of $300,000 per hour, with losses much higher for global, internet-based operations .
Customer Experience - Global users need consistent, low latency experiences, wherever they are located. Amazon famously concluded that each 100 ms in added latency resulted in a 1% loss of sales.
Regulatory Compliance - National governments and industry regulatory agencies are placing controls on where customer data is physically located, as well as expectations of service restoration times in case of catastrophic system outages.
Under normal operating conditions, an IAP deployment will perform according to the performance and functional goals of the system. However, from time to time certain inevitable failures or unintended actions can affect a system in adverse ways. Storage devices, network connectivity, power supplies, and other hardware components will fail. These risks can be mitigated with redundant hardware components. Similarly, an IAP deployment provides configurable redundancy throughout both its software components and its data storage.
Before describing IAP HA capability, a brief explanation of the platform architecture is required. IAP is developed using the MEAN stack. MEAN is a set of Open Source components that collectively provide an end-to-end framework for building dynamic web applications, starting from the top (code running in the browser) to the bottom (database). The stack consists of:
In addition to the above, IAP uses Redis to manage client session tokens.
The following sections describe how each element of IAP architecture can be deployed to meet customer requirements for High Availability (HA) and Disaster Recovery (DR), from server availability, client sessions management and database replication, and resilience.
High Availability - The Basics
The recommended minimum number of servers required to implement HA for IAP is five. This being two application servers and three MongoDB servers.
Application Server Load Balancing
High Availability (HA) refers to a site being up for as long as possible. This means there is enough infrastructure in the right locations to ensure there is no single point of failure that could take the site down.
Both Failover and Load Balancing are means to achieve High Availability:
Load Balancing spreads the load of the application across multiple application servers or multiple web servers to help smooth out the peaks if there is a lot of traffic all at once. Load Balancing is one piece of the puzzle when implementing high availability.
Failover protects applications from downtime by having redundant equipment that can take over if one part were to fail.
There are numerous techniques that can be used to load balance client access requests across servers. When the load is low then one of the simple load balancing methods will suffice. But in times of high load then the more complex methods are used to ensure an even distribution of requests under network and service stress.
Below is a list of example techniques that load balancers use.
This method is used when deploying an ‘active/standby’ architecture (see below for more detail). Each server within a group is assigned a number that corresponds to the rank within the group. Using the ranking of each answer, the load-balancer tries each server in the order that has been assigned, selecting the first available live answer to serve a user request. List members are given precedence and tried in order, and a member is not used unless all previous members fail to provide a suitable result.
A set of servers are configured to handle load in a rotating sequential manner. The algorithm assumes that each server is able to process the same number of requests but not able to account for active connections.
Weighted Round Robin
Servers are rated based on the relative number of requests each is able to process. Those having higher capacities are sent more requests.
Requests are sent to the server having the fewest number of active connections, assuming all connections generate an equal amount of server load.
Weighted Least Connections
Servers are rated based on their processing capabilities. Load is distributed according to both the relative capacity of the servers and the number of active connections on each one.
Source IP Hash
Combines the source and destination IP address in a request to generate a hash key, which is then designated to a specific server. This lets a dropped connection be returned to the same server originally handling it.
More complex and sophisticated algorithms are available. Descriptions of which are outside the scope of this white paper.
This white paper assumes a load-balancer (either HAProxy or NGINX) will be used.
Note: The actual configuration of load-balancers is outside the scope of this document. It is important however to point out that IAP supports multiple load balancing patterns.
IAP Health Checks
To handle failover, IAP provides a health check API that can be used by load-balancers to ensure traffic is not directed to a server that is currently unavailable. By using the [hostname]/status API, a load balancer routes requests only to the healthy instances. When the load balancer determines that an instance is unhealthy, it stops routing requests to that instance. The load balancer resumes routing requests to the instance when it has been restored to a healthy state.
Also known as hot/hot, an active-active cluster is typically comprised of at least two nodes, both actively running the same kind of service simultaneously. The main purpose of an active-active cluster is to achieve load balancing. Load balancing distributes workloads across all nodes in order to prevent any single node from becoming overloaded. Because there are more nodes available to serve, there will also be a marked improvement in throughput and response times.
Assigning clients to nodes in a cluster is not an arbitrary process. Rather, it is based on whatever load balancing algorithm is set on the load balancer. For example, in a “Round Robin” algorithm, the first client to connect is sent to the 1st server, the second client to the 2nd server, the 3rd client back to the 1st server, the 4th client back to the 2nd server, and so on.
Also known as hot/warm or Active/Passive, like the active-active configuration, active-standby also consists of at least two nodes. However, as the name “active-standby” implies, not all nodes are going to be active. In the case of two nodes, for example, if the first node is already active, the second node must be passive or on standby.
The standby (Failover) server serves as a backup that is ready to take over as soon as the active (Primary) server is disconnected or unable to serve.
RabbitMQ is open-source software that allows services, systems, and applications to communicate and exchange information with each other. In other words, it is a message broker. RabbitMQ is used for communication between IAP and its applications. RabbitMQ clustering and message queue mirroring can be used for high availability. The IPC (Inter-Process Communication) functions are handled solely by RabbitMQ.
Each instance of IAP requires a Redis installation, which is installed on the same server as IAP. The Itential Automation Platform uses Redis for shared authentication token storage and expiration.
Shared Authentication Session Tokens
Once a connection request to IAP has been successfully authenticated, either via an API call or a user connection from a browser, the session is allocated a token, which is then stored in Redis. Any subsequent call made within IAP will remain authenticated while the session token remains valid. The following sections discuss the configuration options available in an Active/Standby and Active/Active setup.
When configured in Active/Standby mode, each instance of Redis is configured standalone, and tokens are not shared. When directed via a load-balancer that has been configured with ‘sticky’ connections, all subsequent requests will be sent to the same IAP server, and the session tokens are valid.
Using the figure below as an example, assume that the IAP server in Data Center 1 fails, the Load-balancer will redirect traffic to the alternate IAP server. As the session tokens stored in the Redis instance in DC1 are not replicated to DC2, the session token submitted with the request is invalid and the request is rejected. As a result, any session will need to be re-established via new login request.
When running in Active/Active mode, the session tokens must be replicated to all Redis databases. With this configuration, using the example below, when the server in Data Center 1 fails, the load balancer will direct traffic to the server in Data Center 2. As the session tokens have been replicated to the Redis database in DC2, the session is deemed active.
To achieve an HA Topology, Redis sentinels are used to provide master-slave replication for the shared authentication token storage function.
Given the enhanced HA capabilities of IAP, active/active is the recommended configuration option for Redis whenever possible.
As stated above, the database used by IAP is MongoDB. Out of the box, MongoDB provides extensive HA and data replication functionality and application drivers.
Using the node.js specific driver allows IAP to leverage all of the HA and replication capabilities that MongoDB provides, with the main functionality being the ability to monitor the state of the MongoDB deployment and ensure that the application is always connected to the Primary server. The sections below describe Replica Sets and member elections in more detail. Additional detail on MongoDB drivers is available on the following site.
MongoDB maintains multiple copies of data, called replica sets, using native replication. Users should deploy replica sets to help prevent database downtime. Replica sets are self-healing as failover and recovery is fully automated, so it is not necessary to manually intervene to restore a system in the event of a failure.
Replica sets also provide operational flexibility by providing a way to perform system maintenance (i.e. upgrading hardware and software) while preserving service continuity. This is an important capability as these operations can account for as much as one third of all downtime in traditional systems.
A replica set consists of multiple database replicas. At any given time, one-member acts as the primary replica set member and the other members act as secondary replica set members. If the primary member suffers an outage (e.g., a power failure, hardware fault, network partition), one of the secondary members is automatically elected to primary, typically within several seconds, and the client connections automatically failover to that new primary. Read operations can continue to be serviced by secondary replicas while the election of a new primary is in progress.
The number of replicas in a MongoDB replica set is configurable, with a larger number of replica members providing increased data durability and protection against database downtime (e.g., in cases of multiple machine failures, rack failures, data center failures, or network partitions). Up to 50 members can be configured per replica set, providing operational flexibility and wide data distribution across multiple geographic sites.
You can learn more about the members of a replica set from the MongoDB documentation.
Replica Set Elections
In the event of a primary replica set member failing, the election process is controlled by sophisticated algorithms based on an extended implementation of the Raft consensus protocol. Not only does this allow fast failover to maximize service availability, the algorithms ensure only the most suitable secondary members are evaluated for election to primary and reduce the risk of unnecessary failover (also known as “false positives”). Before a secondary replica set member is promoted, the election algorithms evaluate a range of parameters including:
Analysis of election identifiers, timestamps and journal persistence to identify those replicas set members that have applied the most recent updates from the primary member.
Heartbeat and connectivity status with the majority of other replica set members.
User-defined priorities assigned to replica set members. For example, administrators can configure all replicas located in a secondary data center to be candidates for election only if the primary data center fails.
Once the election process has determined the new primary, the secondary members automatically start replicating from it. If the original primary comes back online, it will recognize its change in state and automatically assume the role of a secondary.
A tutorial is available providing best practices and guidance on deploying MongoDB replica sets across data centers.
MongoDB Data Center Awareness
MongoDB provides a rich set of features to help users deploy highly available and scalable systems. In designing for high availability, administrators must evaluate read and write operations in the context of different failure scenarios. The performance and availability SLAs (Service Level Agreements) of a system play a significant role in determining:
- The number of replicas (copies) of the data.
- The physical location of replica sets, both within and across multiple data centers.
Administrators can configure the behavior of the MongoDB replica sets to enable data center awareness. Configuration can be based on a number of different dimensions, including awareness of geographical regions in multi-data center deployments, or racks, networks, and power circuits in a single data center.
With MongoDB, administrators can:
- Ensure write operations propagate to specific members of a replica set, deployed locally and in remote data centers. This reduces the risk of data loss in the event of a complete data center outage. Alternatively, replication can be configured to ensure data is only replicated to nodes within a specific region to ensure data never leaves a country’s borders.
- Ensure that specific members of a replica set respond to queries - for example, based on their location. This reduces the effect of geographic latency.
- Place specific data partitions on specific shards, each of which could be deployed in different data centers. Again, this can be used to reduce geographic latency and maintain data sovereignty.
Configuring Write Operations Across Data Centers
MongoDB allows users to specify write availability in the system, using an option called write concern. Each operation can specify the appropriate write concern, ranging from unacknowledged to an acknowledgement that writes have been committed to:
A single replica (i.e. the Primary replica set member).
A majority of replicas.
It is also possible to configure the write concern so that writes are only acknowledged once specific policies have been fulfilled, such as writing to at least two replica set members in one data center and at least one replica in a second data center.
From this white paper, the use of robust load-balancing techniques, clustered Redis deployments and geographically distributed MongoDB replica sets allows your organization to design an HA infrastructure that will allow you to respond to events quickly, minimize the number of disruptions, and nearly eliminate the resulting periods of downtime.
For more information, please visit itential.com or contact us at email@example.com.