How can I test HA cluster functionality?
Applicable Products
- QuTS hero h5.3.0 or later
- High Availability Manager
Scenario
I have set up a QNAP high-availability (HA) cluster and want to confirm it’s working correctly. What are the recommended ways to test its functionality?
Procedure
To confirm your HA cluster is working correctly, perform the following tests.
- Always perform a full data backup before initiating any HA-related tests to prevent data loss.
- If your HA cluster is operating in a production environment, avoid simulating power or network failures directly. Perform these tests in a non-production environment if possible.
- During testing, closely monitor event logs and status changes in High Availability Manager.
Test 1: Manual switchover
This test verifies that the switchover mechanism is working properly.
- Go to High Availability Manager > Cluster.
- Confirm the statuses of the current active and passive nodes.
- Click Manage, and then select Switch Over.
- After the switchover is complete, verify the following:
  - The cluster status is "Good".
  - Services continue without interruption.
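One way to check that services continue without interruption is to probe a service port on the cluster from a client machine while the switchover runs. The following Python sketch is illustrative only and is not part of High Availability Manager. It assumes the cluster exposes a TCP service at an address and port you supply; the address `192.0.2.10` and port `445` in the comments are placeholders, and the function names `probe`, `monitor`, and `outage_windows` are hypothetical.

```python
import socket
import time


def probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def monitor(host, port, duration=120.0, interval=1.0):
    """Probe host:port every `interval` seconds for about `duration` seconds."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.time(), probe(host, port)))
        time.sleep(interval)
    return samples


def outage_windows(samples):
    """Collapse (timestamp, ok) samples into (start, end) outage windows.

    An outage starts at the first failed sample and ends at the next
    successful one. An outage still open when the samples end is
    reported with end=None.
    """
    windows = []
    start = None
    for timestamp, ok in samples:
        if not ok and start is None:
            start = timestamp
        elif ok and start is not None:
            windows.append((start, timestamp))
            start = None
    if start is not None:
        windows.append((start, None))
    return windows


# Example usage (placeholder address and port, adjust to your environment):
# samples = monitor("192.0.2.10", 445, duration=180.0)
# for start, end in outage_windows(samples):
#     print("service unreachable from", start, "to", end)
```

Start the monitor shortly before you click Switch Over. An empty result from `outage_windows` suggests the probed port stayed reachable at the chosen sampling interval. It does not replace checking the cluster status and event logs in High Availability Manager.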
Test 2: Automatic failover
This test verifies that the failover mechanism is working properly.
- Simulate active node failure by performing one of the following:
  - Unplug the network cable from the active node.
  - Force the active node to shut down by pressing the physical power button for 10 seconds.
- Log in to the passive node.
- Go to High Availability Manager > Cluster.
- Verify the following:
  - The passive node has become the new active node.
  - The original active node is offline.
Test 3: Automatic rejoin after reconnecting offline node
This test verifies that an offline node automatically rejoins the cluster after it is reconnected, and that the cluster then resumes HA functionality.
- After performing Test 2, do one of the following, depending on how you simulated the failure:
  - Reconnect the network cable to the offline node.
  - Power on the offline node.
- On the other node, go to High Availability Manager > Cluster.
- Verify the following:
  - The previously offline node automatically rejoins the cluster as the passive node.
  - After synchronization is complete, the cluster status is "Good".
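If you want to record how long the reconnected node takes to come back on the network, a generic polling helper can be run from a client machine. The sketch below is an assumption-laden illustration, not a QNAP tool: `wait_until` and `tcp_reachable` are hypothetical names, and the address and port in the comments are placeholders. It only detects network reachability. The rejoin and the "Good" status must still be confirmed in High Availability Manager.

```python
import socket
import time


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until(check, timeout=600.0, interval=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or `timeout` elapses.

    Returns the elapsed seconds on success, or None on timeout.
    `sleep` and `clock` are injectable so the helper can be tested
    without real waiting.
    """
    start = clock()
    while True:
        if check():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


# Example usage (placeholder node address and port):
# elapsed = wait_until(lambda: tcp_reachable("192.0.2.11", 443))
# print("node reachable after", elapsed, "seconds")
```

The injectable `sleep` and `clock` parameters are a design choice that keeps the helper testable. In normal use the defaults are sufficient.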