Split Brain and Amnesia
Two types of problems can arise from cluster partitions: split brain and amnesia. Split brain occurs when the cluster interconnect between hosts is lost and the cluster becomes partitioned into subclusters, each of which believes that it is the only partition. A subcluster that is not aware of the others can cause conflicts over shared resources, such as duplicate network addresses, and can corrupt data.
The nodes of a RAC cluster communicate with each other over a high-speed private interconnect, so let us first look at what the private interconnect is.
The private interconnect is the physical construct that allows inter-node communication. It can be as simple as a crossover cable carrying UDP, or it can be a proprietary interconnect with its own specialized communications protocol. A cluster of more than two nodes usually needs a switch. A fast interconnect matters for RAC performance because the instances rely on inter-process communication between them to implement Cache Fusion.
So when does a split-brain situation occur?
As we know, the nodes in the cluster are connected to each other through the private interconnect, while end users connect to the cluster through the public network. Split brain occurs when the nodes are physically up and the database instance on each server is running, but the private interconnect fails between two or more nodes. The instances in the RAC cluster can then no longer ping or connect to each other. Because of this loss of communication, each instance assumes that the instance it cannot reach is down, and both carry on working independently. Each node is running fine on its own and can still accept user connections.
But if this situation persists, the same block may be read and written by these independent instances, and data integrity problems can follow because block changes made in one instance are not seen by the other.
In simpler terms, in a split-brain situation, there are in a sense two (or more) separate clusters working on the same shared storage. This has the potential for data corruption.
This situation is also known as the multi-master problem.
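The lost block change described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Instance` class, the dictionary standing in for shared storage, and the block name `block_42` are all hypothetical and have nothing to do with how Oracle actually manages its buffer cache.

```python
# Toy model (hypothetical): shared storage is a dict of block_id -> value,
# and every instance keeps a private cache that its peers cannot see.
class Instance:
    def __init__(self, name, storage):
        self.name = name
        self.storage = storage  # shared disk, visible to every instance
        self.cache = {}         # private cache, invisible to other instances

    def read(self, block_id):
        # With the interconnect down, nobody tells us our cached copy is stale.
        if block_id not in self.cache:
            self.cache[block_id] = self.storage[block_id]
        return self.cache[block_id]

    def add(self, block_id, amount):
        # Modify the block in the local cache only.
        self.cache[block_id] = self.read(block_id) + amount

    def flush(self, block_id):
        # Write the cached block back to shared storage.
        self.storage[block_id] = self.cache[block_id]


def simulate_split_brain():
    storage = {"block_42": 100}
    inst1 = Instance("inst1", storage)
    inst2 = Instance("inst2", storage)

    # Both instances cache the block, then change it independently.
    inst1.read("block_42")
    inst2.read("block_42")
    inst1.add("block_42", 10)
    inst2.add("block_42", 5)

    # Both write back; the second write silently overwrites the first.
    inst1.flush("block_42")
    inst2.flush("block_42")
    return storage["block_42"]
```

With working coordination between the instances, the block would end at 115. In this model it ends at 105, because the change made by `inst1` is overwritten.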
How does the clusterware resolve a split-brain situation?
Oracle has implemented an efficient check for the split-brain syndrome.
When a node fails, the failed node is prevented from accessing all the shared disk devices and groups. This method is called I/O fencing, disk fencing or failure fencing.
I/O fencing: In cluster stacks built on Veritas components, fencing is provided by the kernel-based fencing module (vxfen), which behaves identically for node failures and communication failures. When the fencing module on a node is informed of a change in cluster membership by the GAB module, it immediately begins the fencing operation. The node tries to eject the keys of the departed nodes from the coordinator disks using the pre-empt and abort command. Once the node has ejected the departed nodes from the coordinator disks, it also ejects them from the data disks. In a split-brain scenario, both sides of the split race for control of the coordinator disks. The side that wins the majority of the coordinator disks wins the race and fences the loser. The loser then panics and restarts the system.
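The outcome of the coordinator disk race can be sketched as a simple majority count. This is a minimal illustration, not the vxfen implementation: the function name `fencing_race_winner` and its input format (one entry per coordinator disk, naming the side that ejected the other side's key first) are assumptions made for the example.

```python
from collections import Counter


def fencing_race_winner(disk_winners):
    """Return the side that won a majority of coordinator disks, or None.

    disk_winners is a hypothetical input: one entry per coordinator disk,
    naming the sub-cluster that won the race on that disk.
    """
    if not disk_winners:
        return None
    side, disks_won = Counter(disk_winners).most_common(1)[0]
    # A strict majority is required; a tie produces no winner.
    if 2 * disks_won > len(disk_winners):
        return side
    return None
```

For example, `fencing_race_winner(["A", "B", "A"])` returns `"A"`, and side B would then be fenced. With an even number of disks a tie is possible, which is why an odd number of coordinator disks is the natural choice in this model.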
Now, who decides which node survives and which node is fenced?
The answer is the voting disk.
In a split-brain situation, the voting disk is used to determine which node(s) survive and which node(s) are evicted.
Voting disk: The voting disk is a file that manages information about node membership. The Oracle Cluster Synchronization Services daemon (ocssd) on each node uses the voting disk to mark its own attendance and to record the nodes it can communicate with.
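To make the idea concrete, here is a rough sketch of how sub-clusters could be derived from what each node records. The data layout (a dictionary mapping each node number to the set of nodes it reports it can reach) and the function name `subclusters_from_votes` are hypothetical; the real on-disk format of the voting disk is not described here.

```python
def subclusters_from_votes(votes):
    """Group nodes into sub-clusters based on mutual reachability.

    votes is a hypothetical structure: node number -> set of node numbers
    that this node reports it can communicate with.
    """
    groups = []
    seen = set()
    for node in sorted(votes):
        if node in seen:
            continue
        group = set()
        stack = [node]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            # Follow a link only if both nodes report seeing each other.
            for peer in votes[current]:
                if peer in votes and current in votes[peer] and peer not in group:
                    stack.append(peer)
        seen |= group
        groups.append(group)
    return groups
```

For instance, if nodes 1 and 2 can see each other but node 3 sees only itself, `subclusters_from_votes({1: {1, 2}, 2: {1, 2}, 3: {3}})` yields two groups: nodes 1 and 2 together, and node 3 alone.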
The following algorithm determines which nodes survive and which nodes are evicted from the cluster.
1. If the sub-clusters are of different sizes, the clusterware identifies the largest sub-cluster and aborts all the nodes that do not belong to it.
2. If all the sub-clusters are of the same size, the sub-cluster containing the lowest-numbered node survives, so in a 2-node cluster the node with the lowest node number survives.
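The two rules above can be written down in a few lines. This is a sketch of the rules as stated in this article, not Oracle Clusterware code; the function names and the representation of a sub-cluster as a set of node numbers are assumptions for illustration.

```python
def choose_surviving_subcluster(subclusters):
    """Pick the survivor from a list of sets of node numbers.

    Rule 1: the largest sub-cluster wins.
    Rule 2: on a tie in size, the sub-cluster with the lowest node number wins.
    """
    # Sort key: bigger size first, then smaller minimum node number.
    return max(subclusters, key=lambda nodes: (len(nodes), -min(nodes)))


def evicted_nodes(subclusters):
    """Return the sorted node numbers that do not belong to the survivor."""
    survivors = choose_surviving_subcluster(subclusters)
    return sorted(set().union(*subclusters) - survivors)
```

In a 3-node cluster split into `{1}` and `{2, 3}`, the larger side survives and `evicted_nodes([{1}, {2, 3}])` returns `[1]`. In a 2-node cluster split into `{1}` and `{2}`, node 1 survives and node 2 is evicted.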
Amnesia occurs when the cluster restarts after a shutdown with cluster data that is older than it was at the time of the shutdown. This can happen if multiple versions of the framework data are stored on disk and a new incarnation of the cluster is started when the latest version is not available. An example is a two-node cluster with nodes A and B. If node A goes down, the configuration data in the Cluster Configuration Repository (CCR) is updated on node B only, and not on node A. If node B goes down at a later time and node A is then rebooted, node A will be running with the old contents of the CCR. This state is called amnesia and might lead to running a cluster with stale configuration information.
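The two-node example can be traced step by step. The sketch below is purely illustrative: it assumes each node keeps its own copy of the configuration tagged with a version number, which is a simplification invented for this example.

```python
def simulate_amnesia():
    # Hypothetical model: configuration version stored on each node's own disk.
    ccr_version = {"A": 1, "B": 1}
    up = {"A", "B"}

    # Node A goes down; a configuration change reaches only the nodes still up.
    up.discard("A")
    for node in up:
        ccr_version[node] += 1
    latest = max(ccr_version.values())

    # Node B goes down later, then node A is rebooted on its own.
    up.discard("B")
    up.add("A")
    running = max(ccr_version[node] for node in up)

    # The cluster is now running on version `running`, but `latest` exists.
    return running, latest
```

The function returns `(1, 2)`: the cluster comes up with configuration version 1 even though version 2 was the last one written, which is exactly the amnesia condition.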
|Partition Type|Quorum Solution|
|Split brain|Enables only the partition (subcluster) with a majority of votes to run as the cluster (only one partition can exist with such a majority). After a node loses the race for quorum, that node panics.|
|Amnesia|Guarantees that when a cluster is booted, it has at least one node that was a member of the most recent cluster membership (and thus has the latest configuration data).|
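Both quorum rules in the table reduce to small checks. The following is a minimal sketch, assuming one vote per voter and a known record of the most recent cluster membership; the function names `has_quorum` and `safe_to_boot` are made up for this example.

```python
def has_quorum(partition_votes, total_votes):
    """Split-brain rule: a partition may run only with a strict majority."""
    return 2 * partition_votes > total_votes


def safe_to_boot(booting_nodes, last_membership):
    """Amnesia rule: at least one booting node must have been a member of
    the most recent cluster membership, so the latest data is present."""
    return bool(set(booting_nodes) & set(last_membership))
```

With three votes in total, a partition holding two of them has quorum and a partition holding one does not, so at most one partition can ever run. In the amnesia example, `safe_to_boot({"A"}, {"B"})` is `False` because node B was the only member of the last membership.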
Stay tuned for our article on the reasons for node eviction in Oracle RAC.