When planning a high-availability (HA) solution for an Oracle database, the first product that comes to mind is Oracle Real Application Clusters (RAC). For each cluster network, Oracle stores only the network interface name and the subnet ID in the OCR, not the netmask.
It may be necessary to change or update the interface names, or the subnet associated with an interface, if a network change affects the servers or if the information entered during installation was incorrect. It may also be that, for some reason, the Oracle Interface Configuration Assistant (oifcfg) did not succeed during the installation.
The oifcfg utility works with a subnet ID, which looks like an IP address whose last part is "0". That is how most DBAs understand it unless they have networking knowledge. In fact, the subnet ID has to be calculated from the IP address and the subnet mask. This article explains how to work out the values to supply when changing the IP address.
In RAC, the oifcfg utility gives the following output:
root@pkssp020# oifcfg iflist -p -n
bge1 184.108.40.206 UNKNOWN 255.255.255.0
nxge0 192.168.0.0 PRIVATE 255.255.240.0
nxge2 220.127.116.11 UNKNOWN 255.255.254.0
nxge4 18.104.22.168 UNKNOWN 255.255.254.0
nxge6 22.214.171.124 UNKNOWN 255.255.254.0
sppp0 10.0.0.0 PRIVATE 255.0.0.0
root@pkssp020# oifcfg getif -global
nxge0 192.168.0.0 global cluster_interconnect
nxge2 126.96.36.199 global public
nxge6 188.8.131.52 global public
At the Unix level, the ifconfig output for the interfaces mentioned above is as follows:
# ifconfig nxge0
nxge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 7
inet 192.168.12.13 netmask fffff000 broadcast 192.168.15.255
ether 0:14:4f:b9:2a:28
# ifconfig nxge2
nxge2: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 184.108.40.206 netmask fffffe00 broadcast 220.127.116.11
ether 0:14:4f:b9:2a:2a
# ifconfig nxge6
nxge6: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 5
inet 18.104.22.168 netmask fffffe00 broadcast 22.214.171.124
ether 0:14:4f:b9:2c:a
The value shown by oifcfg differs from the IP address shown by the OS-level command. Mapped one to one, the values are as follows:
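|Interface||OS IP Address||Subnet Mask||Oracle IP (oifcfg)|
|nxge0||192.168.12.13||255.255.240.0 (fffff000)||192.168.0.0|
|nxge2||184.108.40.206||255.255.254.0 (fffffe00)||126.96.36.199|
|nxge6||18.104.22.168||255.255.254.0 (fffffe00)||188.8.131.52|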
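Oracle derives its value by applying a bitwise AND operation to the IP address and the subnet mask. The mapping is one to one: each bit of the IP address is ANDed with the corresponding bit of the mask.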
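The AND calculation can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Oracle's own code; the helper names `to_dotted` and `subnet_id` are hypothetical, and the hex handling assumes the Solaris-style netmask that ifconfig prints (for example fffff000).

```python
import ipaddress


def to_dotted(netmask: str) -> str:
    """Return a dotted-decimal mask; accepts dotted or hex (fffff000) input."""
    if "." in netmask:
        return netmask
    # Hex form as printed by ifconfig in the output above (assumed format)
    return str(ipaddress.IPv4Address(int(netmask, 16)))


def subnet_id(ip: str, netmask: str) -> str:
    """Bitwise AND of the IP address and the subnet mask."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(to_dotted(netmask)))
    return str(ipaddress.IPv4Address(ip_int & mask_int))


# nxge0 values taken from the ifconfig output above
print(to_dotted("fffff000"))
print(subnet_id("192.168.12.13", "fffff000"))
```

For nxge0 this prints 255.255.240.0 and 192.168.0.0, which matches the value that oifcfg reports for that interface.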
In a RAC environment, if either value (IP address or subnet mask) is inconsistent, you will see the following messages in the alert log when starting the database:
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 nxge0 192.168.0.0 configured from OCR for use as a cluster interconnect
WARNING 192.168.0.0 could not be translated to a network address error 1
Interface type 1 nxge2 126.96.36.199 configured from OCR for use as a public interface
Interface type 1 nxge6 188.8.131.52 configured from OCR for use as a public interface
WARNING: No cluster interconnect has been specified. Depending on the communication driver configured Oracle cluster traffic may be directed to the public interface of this machine. Oracle recommends that RAC clustered databases be configured with a private interconnect for enhanced security and performance.
Picked latch-free SCN scheme 3
The log clearly says that no usable cluster interconnect was found, so Oracle may direct cluster traffic to the public interface. This is a serious concern for database performance.
So what exactly is the issue?
The following is the oifcfg output from both nodes:
root@pkssp021# oifcfg getif -global
nxge0 192.168.0.0 global cluster_interconnect
nxge2 184.108.40.206 global public
nxge6 220.127.116.11 global public
root@pkssp020# oifcfg getif -global
nxge0 192.168.0.0 global cluster_interconnect
nxge2 18.104.22.168 global public
nxge6 22.214.171.124 global public
For the cluster interconnect interface, the ifconfig output (OS level) is as follows:
root@pkssp020# ifconfig nxge0
nxge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 7
inet 192.168.12.13 netmask fffff000 broadcast 192.168.15.255
ether 0:14:4f:b9:2a:28
root@pkssp021# ifconfig nxge0
nxge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 192.168.12.14 netmask ffffff00 broadcast 192.168.12.255
ether 0:14:4f:b9:97:c8
Notice that Oracle reports the same cluster interconnect value on both nodes. At the OS level, however, the netmask differs between the two nodes, although it must be the same on every node. The same can be verified with the oifcfg command using the following options:
root@pkssp020# oifcfg iflist -p -n | grep nxge0
nxge0 192.168.0.0 PRIVATE 255.255.240.0
root@pkssp021# oifcfg iflist -p -n | grep nxge0
nxge0 192.168.12.0 PRIVATE 255.255.255.0
Because of this mismatch, Oracle does not accept the configured private subnet for the interconnect and uses the public interface instead.
As mentioned, Oracle calculates the subnet ID from the IP address and the subnet mask. In our case the subnet mask differs between the two nodes, so the calculated value should differ as well:
pkssp020 => ( 192.168.12.13 & 255.255.240.0 ) => 192.168.0.0
pkssp021 => ( 192.168.12.14 & 255.255.255.0 ) => 192.168.12.0
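The same per-node check can be scripted. The following is a minimal sketch that assumes the address and netmask of each node have already been collected by hand from ifconfig; the `nodes` dictionary and the `subnet_ids` helper are hypothetical names used only for illustration.

```python
import ipaddress

# (IP address, netmask) of nxge0 on each node, copied from the ifconfig output
nodes = {
    "pkssp020": ("192.168.12.13", "255.255.240.0"),
    "pkssp021": ("192.168.12.14", "255.255.255.0"),
}


def subnet_ids(node_map):
    """Return {node: subnet ID}, where the subnet ID is address AND netmask."""
    result = {}
    for node, (ip, mask) in node_map.items():
        # strict=False lets ip_network accept a host address and mask it for us
        net = ipaddress.ip_network(f"{ip}/{mask}", strict=False)
        result[node] = str(net.network_address)
    return result


ids = subnet_ids(nodes)
for node, sid in ids.items():
    print(node, "=>", sid)

# All nodes should resolve to one subnet ID; more than one means a mismatch
consistent = len(set(ids.values())) == 1
print("consistent" if consistent else "MISMATCH: subnet IDs differ across nodes")
```

With the values above, the two nodes resolve to 192.168.0.0 and 192.168.12.0, so the script reports a mismatch.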
So the oifcfg output should differ between the two nodes, yet the getif output above is the same on both. This is because Oracle stores this value in the OCR disks. The node on which the OCR disks are first formatted (when root.sh is executed during installation) writes the value, and the other node uses that same value. In this case root.sh must have been executed first on pkssp020, which is why we get 192.168.0.0.
How Oracle does the one-to-one mapping:
Let's assume the IP address is 192.168.12.13 and the subnet mask is 255.255.240.0.
For the mapping, split the IP address and the subnet mask on the "." delimiter, convert each number to binary, and perform an AND operation on the corresponding bits.
|IP Address||11000000 – 192||10101000 – 168||00001100 – 12||00001101 – 13|
|Subnet Mask||11111111 – 255||11111111 – 255||11110000 – 240||00000000 – 0|
|AND Output||11000000 – 192||10101000 – 168||00000000 – 0||00000000 – 0|
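The table can be reproduced octet by octet with a short Python sketch. It only illustrates the binary arithmetic described above; the function name `and_table` is hypothetical.

```python
def and_table(ip: str, mask: str):
    """Return one (ip_bits, mask_bits, and_bits, and_value) tuple per octet."""
    rows = []
    for ip_octet, mask_octet in zip(ip.split("."), mask.split(".")):
        a, m = int(ip_octet), int(mask_octet)
        # 08b formats the number as 8 binary digits with leading zeros
        rows.append((f"{a:08b}", f"{m:08b}", f"{a & m:08b}", a & m))
    return rows


rows = and_table("192.168.12.13", "255.255.240.0")
for ip_bits, mask_bits, and_bits, value in rows:
    print(ip_bits, "AND", mask_bits, "=", and_bits, "=", value)

# Join the ANDed octets back into dotted notation
result = ".".join(str(row[3]) for row in rows)
print(result)
```

The final line prints 192.168.0.0, the subnet ID that oifcfg shows for this address and mask.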