Why doesn't the server reboot immediately when the private network is cut in a RAC high-availability test?

Source: 这里教程网 | Published: 2026-03-03 22:20:00

Operating system log:

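From the moment both private-network cables were unplugged until the server went down and stopped logging took roughly 35 s (from 12:03:01, when the second link dropped, to 12:03:36, the last log entry).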

Jul  5 12:02:59 hisdb1 kernel: i40e 0000:b1:00.1 ens5f1: NIC Link is Down

Jul  5 12:03:01 hisdb1 kernel: i40e 0000:4b:00.1 ens2f1: NIC Link is Down

Jul  5 12:03:32 hisdb1 su: (to grid) root on pts/0

Jul  5 12:03:32 hisdb1 dbus[1732]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)

Jul  5 12:03:32 hisdb1 dbus[1732]: [system] Successfully activated service 'org.freedesktop.problems'

Jul  5 12:03:35 hisdb1 abrt-hook-ccpp: Process 6312 (ocssd.bin) of user 1001 killed by SIGABRT - dumping core

Jul  5 12:03:36 hisdb1 abrt-server: Executable '/oracle/grid/crs_1/bin/ocssd.bin' doesn't belong to any package and ProcessUnpackaged is set to 'no'

Jul  5 12:03:36 hisdb1 abrt-server: 'post-create' on '/var/spool/abrt/ccpp-2025-07-05-12:03:35-6312' exited with 1

Jul  5 12:03:36 hisdb1 abrt-server: Deleting problem directory '/var/spool/abrt/ccpp-2025-07-05-12:03:35-6312'
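The interval can be checked directly from the syslog timestamps. The following is a minimal sketch, not part of the original test: it copies four of the entries above into a string, assumes the year is 2025 (syslog omits it; the abrt directory name suggests 2025), and measures the time from the second "NIC Link is Down" entry to the last entry. The names `SYSLOG` and `parse_ts` are illustrative only.

```python
from datetime import datetime

# Excerpt of the OS log shown above. syslog omits the year, so 2025 is
# assumed here (it matches the abrt directory name in the log).
SYSLOG = """\
Jul  5 12:02:59 hisdb1 kernel: i40e 0000:b1:00.1 ens5f1: NIC Link is Down
Jul  5 12:03:01 hisdb1 kernel: i40e 0000:4b:00.1 ens2f1: NIC Link is Down
Jul  5 12:03:35 hisdb1 abrt-hook-ccpp: Process 6312 (ocssd.bin) of user 1001 killed by SIGABRT - dumping core
Jul  5 12:03:36 hisdb1 abrt-server: Deleting problem directory '/var/spool/abrt/ccpp-2025-07-05-12:03:35-6312'
"""


def parse_ts(line: str, year: int = 2025) -> datetime:
    """Parse the 'Mon dd HH:MM:SS' prefix of a syslog line."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


lines = SYSLOG.splitlines()
link_down = [parse_ts(line) for line in lines if "NIC Link is Down" in line]
last_entry = parse_ts(lines[-1])

# Both private links are down once the later of the two events has occurred.
both_down = max(link_down)
elapsed = (last_entry - both_down).total_seconds()

print(f"both links down at {both_down.time()}, last log entry at {last_entry.time()}")
print(f"elapsed: {elapsed:.0f} s")
```

syslog timestamps have one-second resolution, so the result is only accurate to about a second.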

Cluster alert log:

2025-07-05 12:03:01.217 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][ens2f1(:.*)?:169.254.13.48][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0][ens2f1(:.*)?:172.16.0.0].

2025-07-05 12:03:02.219 [ORAROOTAGENT(4485)]CRS-5050: HAIP failover due to network interface ens5f1 not functioning <<< The failure of interface ens5f1 triggered HAIP failover; HAIP automatically moves traffic to another available interface if a redundant one is configured.

2025-07-05 12:03:05.321 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:05.322 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:07.323 [OCTSSD(8556)]CRS-7503: The Oracle Grid Infrastructure process 'octssd' observed communication issues between node 'hisdb1' and node 'hisdb2', interface list of local node 'hisdb1' is '172.16.0.1:44403;', interface list of remote node 'hisdb2' is '172.16.0.2:43201;172.16.1.2:57012;'. <<< The octssd process detected communication problems between nodes hisdb1 and hisdb2; both private-network links are failing.

2025-07-05 12:03:09.425 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:09.426 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

... (repeated entries omitted) ...

2025-07-05 12:03:16.172 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:16.172 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:16.397 [OCSSD(6312)]CRS-1612: Network communication with node hisdb2 (2) has been missing for 50% of the timeout interval.  If this persists, removal of this node from cluster will occur in 14.850 seconds <<< Network communication has been missing for 50% of the timeout interval; if this continues, the node will be removed from the cluster in 14.850 seconds.

2025-07-05 12:03:18.288 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:18.289 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

... (repeated entries omitted) ...

2025-07-05 12:03:23.962 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:23.963 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:24.398 [OCSSD(6312)]CRS-1611: Network communication with node hisdb2 (2) has been missing for 75% of the timeout interval.  If this persists, removal of this node from cluster will occur in 6.850 seconds <<< Network communication has been missing for 75% of the timeout interval; if this continues, the node will be removed from the cluster in 6.850 seconds.

2025-07-05 12:03:24.862 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:24.863 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:27.173 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:27.173 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:28.399 [OCSSD(6312)]CRS-1610: Network communication with node hisdb2 (2) has been missing for 90% of the timeout interval.  If this persists, removal of this node from cluster will occur in 2.850 seconds <<< Network communication has been missing for 90% of the timeout interval; if this continues, the node will be removed from the cluster in 2.850 seconds.

2025-07-05 12:03:29.290 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:29.291 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:30.470 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:30.470 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:30.629 [CRSD(9215)]CRS-2771: Maximum restart attempts reached for resource 'ora.asmnet2.asmnetwork'; will not restart. <<< CRSD has reached the maximum number of restart attempts for resource ora.asmnet2.asmnetwork and will no longer restart it automatically.

2025-07-05 12:03:30.630 [CRSD(9215)]CRS-2769: Unable to failover resource 'ora.asmnet2.asmnetwork'.

2025-07-05 12:03:31.466 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:31.467 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:31.665 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:31.666 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0][bond0:1(:.*)?:10.20.121.0][bond0:2(:.*)?:10.20.121.0].

2025-07-05 12:03:31.752 [OCSSD(6312)]CRS-1609: This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00008:) in /oracle/gridbase/diag/crs/hisdb1/crs/trace/ocssd.trc. <<< This RAC node cannot communicate with the other cluster nodes, so it shuts itself down to preserve cluster integrity; this is a protective measure that prevents split-brain.

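2025-07-05 12:03:31.753 [OCSSD(6312)]CRS-1656: The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /oracle/gridbase/diag/crs/hisdb1/crs/trace/ocssd.trc <<< In fact, the node shutdown was already initiated about 30 s after the private network went down; stopping the service processes takes additional time, so the actual shutdown takes longer than 30 s.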

2025-07-05 12:03:31.781 [ORAAGENT(9337)]CRS-5818: Aborted command 'check' for resource 'ora.ASMNET2LSNR_ASM.lsnr'. Details at (:CRSAGF00113:) {0:3:347} in /oracle/gridbase/diag/crs/hisdb1/crs/trace/crsd_oraagent_grid.trc.

2025-07-05 12:03:31.776 [OCSSD(6312)]CRS-1652: Starting clean up of CRSD resources.

2025-07-05 12:03:32.430 [CRSD(9215)]CRS-2771: Maximum restart attempts reached for resource 'ora.asmnet1.asmnetwork'; will not restart.

2025-07-05 12:03:32.430 [CRSD(9215)]CRS-2769: Unable to failover resource 'ora.asmnet1.asmnetwork'.

2025-07-05 12:03:32.567 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0].

2025-07-05 12:03:32.569 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0].

2025-07-05 12:03:32.673 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0].

2025-07-05 12:03:32.673 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0].

2025-07-05T12:03:32.800846+08:00

Errors in file /oracle/gridbase/diag/crs/hisdb1/crs/trace/crsd.trc  (incident=1):

CRS-6015 [] [] [] [] [] [] [] [] [] [] [] []

Incident details in: /oracle/gridbase/diag/crs/hisdb1/crs/incident/incdir_1/crsd_i1.trc

2025-07-05 12:03:32.783 [CRSD(9215)]CRS-6015: Oracle Clusterware has experienced an internal error. Details at (:CLSGEN00100:) {0:3:347} in /oracle/gridbase/diag/crs/hisdb1/crs/trace/crsd.trc.

2025-07-05 12:03:32.825 [CRSD(9215)]CRS-8505: Oracle Clusterware CRSD process with operating system process ID 9215 encountered internal error CRS-06015

2025-07-05 12:03:33.284 [ORAAGENT(9337)]CRS-5822: Agent '/oracle/grid/crs_1/bin/oraagent_grid' disconnected from server. Details at (:CRSAGF00117:) {0:1:19} in /oracle/gridbase/diag/crs/p8mesdb1/crs/trace/crsd_oraagent_grid.trc.

2025-07-05 12:03:33.284 [ORAROOTAGENT(9348)]CRS-5822: Agent '/oracle/grid/crs_1/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:3:349} in /oracle/gridbase/diag/crs/hisdb1/crs/trace/crsd_orarootagent_root.trc.

2025-07-05 12:03:33.285 [OCSSD(6312)]CRS-1653: The clean up of the CRSD resources failed.

2025-07-05 12:03:33.284 [ORAAGENT(11556)]CRS-5822: Agent '/oracle/grid/crs_1/bin/oraagent_oracle' disconnected from server. Details at (:CRSAGF00117:) {0:8:427} in /oracle/gridbase/diag/crs/p8mesdb1/crs/trace/crsd_oraagent_oracle.trc.

2025-07-05 12:03:33.284 [SCRIPTAGENT(56264)]CRS-5822: Agent '/oracle/grid/crs_1/bin/scriptagent_grid' disconnected from server. Details at (:CRSAGF00117:) {0:9:9} in /oracle/gridbase/diag/crs/hisdb1/crs/trace/crsd_scriptagent_grid.trc.

2025-07-05 12:03:33.329 [CRSD(9518)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 9518

2025-07-05 12:03:34.506 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0].

2025-07-05 12:03:34.507 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0].

2025-07-05 12:03:34.974 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0].

2025-07-05 12:03:34.975 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0].

2025-07-05T12:03:35.306464+08:00

Errors in file /oracle/gridbase/diag/crs/hisdb1/crs/trace/ocssd.trc  (incident=9):

CRS-8503 [] [] [] [] [] [] [] [] [] [] [] []

Incident details in: /oracle/gridbase/diag/crs/hisdb1/crs/incident/incdir_9/ocssd_i9.trc

2025-07-05 12:03:35.303 [OCSSD(6312)]CRS-8503: Oracle Clusterware process OCSSD with operating system process ID 6312 experienced fatal signal or exception code 6.

2025-07-05 12:03:35.687 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens2f1(:.*)?:172.16.0.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0].

2025-07-05 12:03:35.687 [GIPCD(4977)]CRS-42216: No interfaces are configured on the local node for interface definition ens5f1(:.*)?:172.16.1.0: available interface definitions are [ens6f0(:.*)?:10.20.110.0][bond0(:.*)?:10.20.121.0].

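According to the log, the cluster stack only finished stopping at around 12:03:35, roughly 35 s after the private network was cut.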
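To make the sequence easier to read, the key alert-log events can be expressed as offsets from the moment the second private link dropped. This is only a sketch: the timestamps are copied from the log above, while the choice of `T0` (the OS log's 12:03:01 "NIC Link is Down" entry for ens2f1, which has one-second resolution) is an assumption, as are the names `EVENTS` and `offsets`.

```python
from datetime import datetime

# Key events copied from the alert log above (timestamp, short label).
EVENTS = [
    ("2025-07-05 12:03:01.217", "first CRS-42216 from GIPCD"),
    ("2025-07-05 12:03:16.397", "CRS-1612: 50% of timeout interval"),
    ("2025-07-05 12:03:24.398", "CRS-1611: 75% of timeout interval"),
    ("2025-07-05 12:03:28.399", "CRS-1610: 90% of timeout interval"),
    ("2025-07-05 12:03:31.752", "CRS-1609: node going down"),
    ("2025-07-05 12:03:35.303", "CRS-8503: OCSSD fatal signal 6"),
]
FMT = "%Y-%m-%d %H:%M:%S.%f"

# Assumption: t = 0 is the OS log entry "ens2f1: NIC Link is Down" at 12:03:01.
T0 = datetime(2025, 7, 5, 12, 3, 1)


def offsets(events, t0):
    """Return (seconds since t0, label) for each event."""
    return [
        ((datetime.strptime(ts, FMT) - t0).total_seconds(), label)
        for ts, label in events
    ]


timeline = offsets(EVENTS, T0)
for seconds, label in timeline:
    print(f"+{seconds:6.3f} s  {label}")
```

Under that assumption, CRS-1609 appears a little over 30 s after the link loss and the OCSSD abort a few seconds later, which matches the pattern of "decision at about 30 s, actual stop somewhat later".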

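The time from the private-network interruption to the start of the node shutdown is determined by the CSS misscount value, which can be queried as follows: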

[grid@hisdb1 ~]$ crsctl get css misscount

CRS-4678: Successful get misscount 30 for Cluster Synchronization Services.
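The misscount value can be cross-checked against the CRS-1612/1611/1610 warnings in the alert log. In the sketch below, the `crsctl` output line and the warning timestamps are copied from above; the helper name `parse_misscount` and the interpretation "projected eviction time minus misscount equals the last successful network heartbeat" are assumptions made for illustration, not something the log states explicitly.

```python
import re
from datetime import datetime, timedelta

# Output line of `crsctl get css misscount`, copied from above.
CRSCTL_OUTPUT = "CRS-4678: Successful get misscount 30 for Cluster Synchronization Services."

# (timestamp, seconds until removal) from the CRS-1612/1611/1610 entries.
WARNINGS = [
    ("2025-07-05 12:03:16.397", 14.850),
    ("2025-07-05 12:03:24.398", 6.850),
    ("2025-07-05 12:03:28.399", 2.850),
]
FMT = "%Y-%m-%d %H:%M:%S.%f"


def parse_misscount(output: str) -> int:
    """Extract the misscount value (in seconds) from the crsctl output line."""
    match = re.search(r"misscount (\d+)", output)
    if match is None:
        raise ValueError("misscount not found in crsctl output")
    return int(match.group(1))


misscount = parse_misscount(CRSCTL_OUTPUT)

# Each warning projects an eviction time: its own timestamp plus the time left.
deadlines = [
    datetime.strptime(ts, FMT) + timedelta(seconds=left) for ts, left in WARNINGS
]

# Assumption: the countdown started at the last successful heartbeat,
# so deadline - misscount estimates when communication was lost.
last_heartbeats = [d - timedelta(seconds=misscount) for d in deadlines]

for deadline, heartbeat in zip(deadlines, last_heartbeats):
    print(f"projected removal {deadline.time()}  <- last heartbeat ~{heartbeat.time()}")
```

All three warnings project nearly the same removal time, and subtracting the 30 s misscount lands within a fraction of a second of the 12:03:01 link-down entry in the OS log, which is consistent with misscount governing when the shutdown is initiated.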
