This pluggable extension introduces the 'simple-failover' group of appliances. A failover group contains two or more appliances, which share certain common attributes (for instance, a so-called "failover IP address") and replicated storage. The plugin integrates with the Nexenta AutoCDP service to support failover of NexentaStor volumes that are block-mirrored over an IP network.
Please note that this plugin must be installed on all appliances that are members of a 'simple-failover' group.
Definitions
Network interface configuration
Each appliance in the group must have a spare (logical or physical) network interface, also termed the "failover networking interface". Starting with version 2.0, the plugin supports up to two failover interfaces. At failover time, one of the selected failover networking interfaces performs an important function - it gets configured with the per-group static settings:
- failover IP
- failover mask
The idea (and the requirement for the simple-failover configuration to work) is that clients can access any of the appliances in the failover group via the configured failover IP and failover mask. The picture below illustrates the failover concept:

For this configuration to work, each of the designated (failover) network interfaces on the corresponding appliances - members of the group - must be programmable with the same static network interface parameters.
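As a rough illustration of the requirement above, the following Python sketch models the per-group settings and checks that they form a valid static configuration. The class and field names (FailoverGroup, interfaces), the group name, the appliance names, the interface name e1000g1 and the addresses are all hypothetical; this is not how the plugin stores or validates its configuration.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class FailoverGroup:
    """Hypothetical model of a simple-failover group (illustration only)."""
    name: str
    failover_ip: str
    failover_mask: str
    # appliance name -> designated failover interface on that appliance
    interfaces: dict = field(default_factory=dict)

    def static_config(self):
        # Combine the per-group IP and mask; raises ValueError if either is invalid.
        return ipaddress.ip_interface(f"{self.failover_ip}/{self.failover_mask}")

    def validate(self):
        if len(self.interfaces) < 2:
            raise ValueError("a failover group needs two or more appliances")
        if not all(self.interfaces.values()):
            raise ValueError("every appliance needs a failover interface")
        return self.static_config()


# Assumed example values, not taken from any real deployment.
group = FailoverGroup(
    name="fg1",
    failover_ip="192.168.1.100",
    failover_mask="255.255.255.0",
    interfaces={"nodeA": "e1000g1", "nodeB": "e1000g1"},
)
print(group.validate())  # 192.168.1.100/24
```

The point of the model is that the IP and mask belong to the group, while each appliance contributes only an interface that can accept them.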
From creating the group to failing over
This is how it works. At group creation time you will be prompted to specify the failover network interface on each appliance in the group. In addition, you will be asked to specify the per-group settings {failover IP, failover mask}.
Later, at actual failover time, this static network interface configuration effectively "travels" to the failover appliance - that is, to the appliance on which the failover is manually performed, via:
nmc:/$ setup group simple-failover [group_name] failover
In other words, during the actual failover:
failed appliance => failover appliance
The previously selected failover interface of the "failover appliance" gets configured with the per-group attributes (failover IP and failover mask). Assuming - as illustrated in the picture above - that CIFS/NFS/iSCSI/etc. clients were using the failover IP and failover mask to access the failed appliance and the data volumes behind it, the failover will remain transparent to these same clients and applications, in terms of being able to access the same data volumes.
Upon "failing over", the appliance that is used to carry out the operation takes over the networking and storage functions previously performed by the failed appliance, and the entire simple-failover group of appliances transitions to a failover state. Failover is a compound operation that is either done as a whole, or not done. Once (and if) the failed appliance is brought back to service, the failover operation can be (again, atomically) undone.
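The way the failover address "travels" between appliances, and back again on undo, can be pictured with a small state model. This is only a sketch: GroupState and its fields are hypothetical names, the address and node names are made up, and the assumption that undoing a failover returns the address to the original appliance is an interpretation of the text above, not a description of the plugin's internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroupState:
    """Hypothetical view of which appliance holds the failover address."""
    failover_address: str          # per-group failover IP/mask
    owner: str                     # appliance currently serving the address
    failed: Optional[str] = None   # set while the group is in failover state

    @property
    def in_failover(self):
        return self.failed is not None

    def failover(self, local):
        # Run on the surviving appliance: the address "travels" to it.
        if self.in_failover:
            raise RuntimeError("group is already in failover state")
        if local == self.owner:
            raise RuntimeError("failover must be run on a different appliance")
        self.failed, self.owner = self.owner, local

    def undo_failover(self):
        # Assumed behavior: the address returns to the original owner.
        if not self.in_failover:
            raise RuntimeError("group is not in failover state")
        self.owner, self.failed = self.failed, None


state = GroupState("192.168.1.100/24", owner="nodeA")
state.failover("nodeB")      # nodeA failed; nodeB takes over
print(state.owner, state.in_failover)
state.undo_failover()        # nodeA is back in service
print(state.owner, state.in_failover)
```

Clients keep using the same address throughout; only the appliance behind it changes.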
Starting with version 3.0.4, the failover MAC address is assigned automatically. When one of the nodes in a cluster goes down, the second one comes up with the same IP address, MAC address and subnet mask.
Advanced users can still specify the MAC address manually. During the process of creating a simple-failover group, run:
nmc:/$ setup group simple-failover [group_name] failover -m
What happens after failover?
Upon failover the specified group of appliances transitions to a failover state, with the local (failover) appliance taking over networking and storage functions previously performed by the failed appliance. Failover operation is a compound transaction that can be undone (via 'undo-failover' operation), and that includes the following steps:
- making sure that the failed appliance in the given failover group is not reachable. Note that failover will fail if the failed appliance is still up;
- validating the configuration of the failed appliance. Note that appliances in the failover group periodically back up their respective configurations, with the backup destinations being the other members of the group;
- creating a rollback checkpoint. This checkpoint is later used to "undo" the effects of applying the failed appliance's configuration. As such, the corresponding rollback is part of the 'undo-failover' operation;
- configuring the specified (failover) networking interface. The interface parameters, the appliances that are members of the failover group, and other persistent parameters are specified at failover group creation time.
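The all-or-nothing nature of the steps listed above can be sketched as a generic "do in order, undo in reverse on failure" pattern. The plugin's actual implementation is not described here, so everything in this Python example is an assumption: run_failover, the step callables and the simulated "interface busy" error are hypothetical and exist only to show the control flow.

```python
def run_failover(steps):
    """Run (name, do, undo) steps in order; undo completed ones on failure."""
    done = []
    for name, do, undo in steps:
        try:
            do()
        except Exception as exc:
            # Roll back whatever already succeeded, most recent first.
            for _, prev_undo in reversed(done):
                prev_undo()
            return f"failover aborted at '{name}': {exc}"
        done.append((name, undo))
    return "failover complete"


log = []


def noop():
    pass


def record(msg):
    # Build a step that only records that it ran.
    return lambda: log.append(msg)


def fail(msg):
    # Build a step that simulates a failure.
    def _raise():
        raise RuntimeError(msg)
    return _raise


# Step names follow the list above; the callables are placeholders.
steps = [
    ("check failed appliance is unreachable", record("checked"), noop),
    ("validate failed appliance's configuration", record("validated"), noop),
    ("create rollback checkpoint", record("checkpoint created"),
     record("checkpoint rolled back")),
    ("configure failover interface", fail("interface busy"), noop),
]
result = run_failover(steps)
print(result)
print(log)
```

In this run the last step fails, so the earlier checkpoint is rolled back and the group is left as it was before the attempt.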
A successful failover operation results in:
- Re-importing (activating) all volumes that were block-mirrored from the failed appliance => to the local (failover) appliance. For related information, please see the "Continuous Data Protection" section of the NexentaStor User Guide, or search the F.A.Q. pages for "auto-cdp", or review the corresponding pluggable extension web page.
- Re-importing (activating) all iSCSI based volumes accessible from both the failed appliance and the failover appliance. (Note: It is important to ensure that at any given time only one appliance member of the group accesses truly iSCSI-shared volumes (as opposed to 'auto-cdp' replicated volumes). This can be done manually. However, as of now the corresponding functionality is NOT part of the simple-failover plugin.)
- Restoring the NexentaStor auto-services (auto-sync, auto-tier, auto-snap, auto-scrub) defined on the shared and replicated volumes (see above).
- Finally, restoring all NFS and CIFS shares defined on all shared and replicated volumes.
How to activate the previously failed appliance?
Caution: For NexentaStor versions 2.0 or older, please make sure to physically disconnect the failover interface prior to powering up the previously failed appliance. This requirement/limitation is removed in NexentaStor v2.1.
To activate the previously failed appliance, use the undo-failover verb:
nmc:/$ setup group simple-failover [group_name] undo-failover
Both the 'failover' and 'undo-failover' operations are executed on the same local (the failed-over) appliance. For instance, if you have nodeA and nodeB, and nodeA failed, you then drive both 'failover' and 'undo-failover' operations on nodeB.
Note again that simple-failover (being simple and manually operated) does not guard against concurrent access to shared data.
For more information, please see the NexentaStor User Guide, or the man page (-h) for the following command:
nmc:/$ switch group
Limitations
Shared storage is NOT supported. The simple-failover extension module does not employ SCSI PGR or any alternative mechanism to ensure that no two appliances in the cluster access the shared storage (if available) simultaneously. For shared storage support and automated failover, please check out another NexentaStor extension module: HA Cluster.
Getting started
To install this plugin, run:
nmc:/$ setup plugin install simple-failover
This pluggable module must be installed on each appliance that is a member of a simple-failover group of appliances.
You can use the management console 'show' command to view already installed plugins, and plugins that can be installed:
nmc:/$ show plugin -h
Or, you can view, install and uninstall the NexentaStor extension modules using appliance's web GUI, as shown below:
To get started, and for the man page, run:
nmc:/$ create group simple-failover -h
References
- User Guide, Section "High Availability"
- Knowledge Base article: Failover with one public IP
- RSF-1 cluster