Syneto’s Solution for High Availability
Achieving High Availability (HA) requires several redundancy techniques and automatic failover actions. The goal is a system with as little downtime as possible, using hardware redundancy such as disks organized in RAID, multi-CPU systems with CPU hot-swapping, multi-head storage solutions and, finally, software-controlled clustering solutions in which multiple independent storage devices work as a whole.
Each type of redundancy protects against a different kind of failure at a particular level in the functionality chain. For example, disk RAID protects against individual disk failure, while clustering as an HA solution protects against whole-system failure.
Syneto’s special HA features allow you to configure two different machines (also called “nodes”) to work as one. Each node manages a number of disk pools that can be accessed via one or more assigned IPs. We will refer to them as HA pools. The cluster behaves as if it were a single storage device, but behind the scenes there are two physical machines serving the requests. Users never know which physical machine serves their requests.
Syneto OS supports active-active clustering. This means each machine actively serves requests and can immediately take over the other node’s pools if that node fails. There is no hierarchy between the machines; they are equal.
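The active-active behavior described above can be illustrated with a minimal sketch. This is only a conceptual model, not Syneto's implementation: the Node class, the fail_over function and the pool names are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model of one cluster node and the HA pools it serves."""
    name: str
    online: bool = True
    pools: list[str] = field(default_factory=list)


def fail_over(failed: Node, survivor: Node) -> None:
    """Mark one node as failed and move all of its pools to the other node."""
    failed.online = False
    survivor.pools.extend(failed.pools)
    failed.pools.clear()


# Both nodes serve their own pools; neither is a passive standby.
node1 = Node("node1", pools=["pool-a"])
node2 = Node("node2", pools=["pool-b"])

# node1 fails, so node2 takes over pool-a in addition to pool-b.
fail_over(failed=node1, survivor=node2)
print(node2.pools)  # ['pool-b', 'pool-a']
```

Because the nodes are equal, the same function applies in the opposite direction if node2 is the one that fails.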
The two cluster nodes – Node 1 and Node 2 – will be connected to an external JBOD via SAS or SATA connections. Each Storage Node will have its own connection. On the JBOD one or more HA disk pools can be created.
Communication between the two Storage Nodes takes place over a dedicated network interface called the Heartbeat Interface. In our diagram, the heartbeat interface is e1000g1.
Administration and configuration access to the two nodes will be ensured over the administration network. System administrators should use this network in order to access both nodes. In the example diagram, we used the interface e1000g2 for this purpose and we will refer to these interfaces as Administration Interfaces from now on.
Each HA Pool will have one or more floating IP addresses assigned if it is shared on an Ethernet network. Fibre channel can also be used. Each floating IP will be accessible from the client network and will be dynamically assigned to one of the nodes. The storage client won’t know which node serves the requests.
HA pool configurations will be synchronized between the nodes using the Heartbeat Interface. The synchronized configurations include the pool’s name to be served via the cluster, the cluster IP, the sharing services (iSCSI, NFS, SMB, Fibre channel) and the backup and replication services configured for the HA pool.
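As a rough illustration of what such a synchronized record might contain, the sketch below models the items listed above as a data structure. The HAPoolConfig class and its field names are assumptions made for this example; the actual format used by Syneto OS is not described in this document.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class HAPoolConfig:
    """Hypothetical record of the settings synchronized for one HA pool."""
    pool_name: str
    cluster_ips: list[str]
    sharing_services: list[str]  # e.g. iSCSI, NFS, SMB, Fibre channel
    backup_and_replication: dict = field(default_factory=dict)


# Configuration as held by the node where the pool was attached.
local = HAPoolConfig("pool-a", ["10.0.0.50"], ["NFS", "SMB"])

# After synchronization, the other node should hold an identical copy.
remote = HAPoolConfig(**asdict(local))
print(local == remote)  # True
```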
Prepare the System
Decide what physical ports to use for the Heartbeat Interface. Connect these ports with a cross-over cable and set an IP address on each node so that they can communicate (from Network > Interfaces). The interfaces must be the same on both nodes (e.g. e1000g1). See Network Configuration if needed.
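Before creating the cluster, it can help to confirm that the two heartbeat addresses are able to reach each other over the cross-over cable, which in practice means they must be distinct hosts in the same subnet. The following is a small sketch using Python's standard ipaddress module; the addresses are example values, not defaults of the product.

```python
import ipaddress


def same_subnet(addr_a: str, addr_b: str) -> bool:
    """Return True if two addresses in CIDR notation are different hosts
    on the same network."""
    a = ipaddress.ip_interface(addr_a)
    b = ipaddress.ip_interface(addr_b)
    return a.network == b.network and a.ip != b.ip


# Example heartbeat addresses for e1000g1 on each node (assumed values).
print(same_subnet("192.168.100.1/24", "192.168.100.2/24"))  # True
print(same_subnet("192.168.100.1/24", "192.168.101.2/24"))  # False
```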
Create the Cluster
After the required preparations are done, the first step is to create a cluster. We will need to configure high availability on both nodes. Go to the HA menu on the first node and click on the Create Cluster button. You will be presented with the following form:
The field you have to specify is the Heartbeat interface, which is the network interface that will be used for heartbeat between both nodes. It is selected only at cluster creation and must be the same on all nodes.
After you click Create cluster, an animation will provide feedback on the progress while the first half of the cluster is created and all necessary settings are applied to the system. This process may take some time.
After your cluster has been successfully created, a screen will show you its state.
The left column represents the first node along with its name. The green rectangle next to its name confirms that the node is active and healthy. On the current node, there will be a small label saying This node so that you know which of the nodes you are logged into via the web interface. At this point, there is only a single active node.
The right column represents the remote node – the one that will join the cluster.
The two nodes are connected by the Heartbeat interface.
Join Another Node to the Cluster
Once you have configured the cluster, joining the second node is very simple. Go to the HA menu item on the second node and choose Join Existing Cluster.
Fill in the IP address of the other node’s Heartbeat interface, enter the other node’s admin password and click Join cluster. An animation showing the progress will start. This process may take some time.
After the process is finished, the interface will show both nodes as part of the High Availability cluster.
The current node, marked with the label This node, is the one that joined the cluster.
Attaching a Pool to a Cluster
The newly created cluster is now active but doesn’t protect any pools yet. Users will need access to data so the next step is to start adding pools to the cluster. These pools will be associated with IP addresses so that the users will know how to access them. Multiple pools can be added to each node so that the administrator can balance the workload between the nodes.
You can attach an HA pool by clicking on the plus button, located at the top right corner of the node.
The first required field is the disk pool you wish to assign to the cluster.
Attaching the HA pool can be done either over Fibre channel or via floating IP (iSCSI). Fibre channel does not require additional information. When you select floating IP, you will be asked to choose one or multiple interfaces and to assign an IP address and netmask to each chosen interface. These will be the interfaces through which your clients will be able to access the HA pool.
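When planning floating IPs, each address and netmask pair can be checked ahead of time. The sketch below uses Python's standard ipaddress module to combine an address with a dotted netmask and reject invalid input; the interface names and addresses are hypothetical examples, and the web interface may apply its own validation rules.

```python
import ipaddress


def floating_ip(address: str, netmask: str) -> ipaddress.IPv4Interface:
    """Combine an IPv4 address and a dotted netmask.

    Raises ValueError if either part is not valid.
    """
    return ipaddress.IPv4Interface(f"{address}/{netmask}")


# Hypothetical client-facing interfaces and the floating IP planned for each.
assignments = {
    "e1000g3": floating_ip("10.0.0.50", "255.255.255.0"),
    "e1000g4": floating_ip("10.0.1.50", "255.255.255.0"),
}

for name, iface in assignments.items():
    # Show the address with its prefix length and the network it belongs to.
    print(name, iface.with_prefixlen, iface.network)
```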
Once the HA disk pool is attached, you can see all its related information: the name, the IP(s) and interface(s) assigned. For more information on the failed/successful operations, check out the Activity log.
Move a Pool
If you decide that an HA pool should be attached to the other node, you can manually migrate it by clicking the Move pool button.
Remove a Pool From Cluster Control
You can remove the HA pool from cluster control by clicking the Remove this pool from cluster control button. The cluster will then stop managing the HA pool.
Exporting the Pool
Clicking ‘Export pool from cluster control’ will remove the pool from the HA configuration. The pool will no longer be accessible to the clients. It can be imported back from the Disk pools page.
Inactive Cluster Pools
When a pool becomes inactive, it will disappear from the node and will be shown in the Inactive cluster pools box.
Note: when performing certain operations (e.g. moving a pool), the cluster makes the HA pool inactive for a brief period of time. After the operation is completed, the pool will be marked as active and will disappear from the Inactive cluster pools box. Avoid performing any operations on pools that are in transition.
If an HA pool is inactive, you have two options: reattach it to one of the nodes or remove it.
To reattach an inactive pool to one of the nodes, click on the Reattach button, select a node from the suggested ones and click Reattach pool. Before reattaching the pool, make sure that the problem that made it inactive has been resolved. Otherwise, the pool will try to start, fail and return to the inactive state.
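The reattach behavior can be summarized in a short sketch. It is a simplified model only: the PoolState enum, the reattach function and the fault_cleared flag are hypothetical names introduced here, and the real cluster detects the underlying fault itself.

```python
from enum import Enum


class PoolState(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"


def reattach(pool: dict, node: str, fault_cleared: bool) -> PoolState:
    """Try to start an inactive pool on the chosen node."""
    if pool["state"] is not PoolState.INACTIVE:
        raise ValueError("only inactive pools can be reattached")
    if fault_cleared:
        pool["state"] = PoolState.ACTIVE
        pool["node"] = node
    # If the fault is still present, the start fails and the pool
    # remains in the inactive state.
    return pool["state"]


pool = {"name": "pool-a", "state": PoolState.INACTIVE, "node": None}

print(reattach(pool, "node2", fault_cleared=False))  # PoolState.INACTIVE
print(reattach(pool, "node2", fault_cleared=True))   # PoolState.ACTIVE
```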
To remove an HA pool from the cluster click on the Remove button located in the top right corner of each inactive pool.
Leaving the Cluster
On each node, the Take this Node Offline button is at the top-right corner. Clicking that button will remove the current node from the cluster. If there are any HA pools running on this node, they will be migrated to the other node.
If the current node is the last one in the cluster, leaving the cluster means the cluster will be destroyed and will have to be recreated. Since there are no other available nodes to move the HA pools to, all their related services, including sharing services, will be stopped. The pools configured for the cluster will be exported and no clients will be able to access them.
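The two outcomes of taking a node offline can be sketched as follows. This is a conceptual model under simplifying assumptions: the cluster is represented as a plain dictionary of online nodes and their pools, and take_offline is a hypothetical helper, not a Syneto command.

```python
def take_offline(cluster: dict[str, list[str]], node: str) -> list[str]:
    """Remove a node from the set of online nodes.

    Returns the pools that had to be exported because no other node
    was left to take them over.
    """
    pools = cluster.pop(node)
    if cluster:
        # Another node is still online: migrate the pools to it.
        target = next(iter(cluster))
        cluster[target].extend(pools)
        return []
    # This was the last node: the cluster is gone and the pools are exported.
    return pools


cluster = {"node1": ["pool-a"], "node2": ["pool-b"]}

print(take_offline(cluster, "node1"))  # []
print(cluster)                         # {'node2': ['pool-b', 'pool-a']}
print(take_offline(cluster, "node2"))  # ['pool-b', 'pool-a']
print(cluster)                         # {}
```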
Removing Node From Cluster
After you take a node offline, it will still be visible on the HA state page of the other node, which is still online. The offline node will be marked as Offline and a Remove button will be available on it. Click this button to permanently remove the node from the cluster.
A node can go offline in several ways. The most common is when you manually take the node offline from the interface. Other circumstances include a power outage, a node shutdown or a hardware fault that makes it incapable of functioning properly.
There are several reasons for an HA node to become unresponsive: loss of power, interrupted communication with the client network, interrupted communication with the pool and other hardware problems. In these cases, the other node will take over all of its tasks.
Recovery scenarios, failover situations, and split-head conditions are explained in the Recovery Scenarios document.
Putting Cluster in Maintenance
To put your cluster in maintenance, click on the Put cluster in maintenance button at the top right of the page. When the state of the cluster changes, you will be presented with the following warning.
The cluster will maintain its structure and configuration.