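Applies To
Oracle Database 19.3 + SUSE Linux Enterprise Server 12 SP4 + EMC storage
Problem Summary
On an Oracle 19c RAC cluster, the operating system on node 2 was upgraded from SUSE Linux 12 SP4 to SP5. After the operating system was rebooted, the node could no longer recognize its storage, the cssd process could not start in real-time mode, and CRS could not start normally.
Cause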
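1. The operating system security-hardening product prevented the CSSD process of the 19c cluster from starting in real-time mode, because the hardening product had changed the operating system's CPU Accounting setting. See "Oracle 12c RAC: CSSD process cannot start in real-time mode".
2. After the upgrade to SP5, the EMC PowerPath package on node 2 was still the SP4 build (EMCpower.LINUX-7.0.0.00.00-064.suse12sp4.x86_64). It did not match the upgraded operating system, so the EMC storage devices were not recognized.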
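Solution
1. Stop the services of the operating system security-hardening software.
2. Install the EMC storage management software that matches SUSE 12 SP5: DellEMCPower.LINUX-7.1.0.00.00-075.SLES12SP5.x86_64.rpm
Analysis
1. Alert log from the CRS startup on node 2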
2025-03-27 20:17:50.802 [OCSSD(14861)]CRS-1714: Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:)in /u01/app/grid/diag/crs/host01/crs/trace/ocssd.trc
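The alert log from the CRS startup on node 2 shows that no voting files could be discovered. Discovery was retried after 15 seconds and still found none.
2. ocssd log on node 2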
2025-03-27 20:17:50.801 : CSSD:3324163840: clssnmReadDiscoveryProfile: voting file discovery string(/dev/emcpower)
2025-03-27 20:17:50.801 : CSSD:3324163840: clssnmvDDiscThread: using discovery string /dev/oracleasm/disks for initial discovery
2025-03-27 20:17:50.801 : SKGFD:3324163840: Discovery with str:/dev/emcpower:
2025-03-27 20:17:50.801 : SKGFD:3324163840: UFS discovery with :/dev/emcpower:
2025-03-27 20:17:50.801 : SKGFD:3324163840: Execute glob on the string /dev/emcpower
2025-03-27 20:17:50.801 : SKGFD:3324163840: OSS discovery with :/dev/emcpower
2025-03-27 20:17:50.802 : SKGFD:3324163840: Discovery skipping bad :ASM::
2025-03-27 20:17:50.802 : CSSD:3324163840: clssnmvDiskVerify: Successful discovery of 0 disks
2025-03-27 20:17:50.802 : CSSD:3324163840: clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2025-03-27 20:17:50.802 : CSSD:3324163840: clssnmvFindInitialConfigs: No voting files found
2025-03-27 20:17:50.802 : CSSD:3324163840: (:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds
2025-03-27 20:17:51.687 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2025-03-27 20:17:52.687 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2025-03-27 20:17:53.687 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2025-03-27 20:17:54.687 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2025-03-27 20:17:55.688 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2025-03-27 20:17:56.688 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2025-03-27 20:17:57.688 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
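The cssd log shows that no storage could be found under /dev/emcpower, that clsssc_CLSFAInit_CB could not initialize, and that the cssd process could not start normally.
3. Cluster status check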
$crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      2        ONLINE  OFFLINE
ora.cluster_interconnect.haip
      2        ONLINE  OFFLINE
ora.crf
      2        ONLINE  ONLINE       fadb02
ora.crsd
      2        ONLINE  OFFLINE
ora.cssd
      2        ONLINE  OFFLINE                               STARTING
ora.cssdmonitor
      2        ONLINE  ONLINE       fadb02
ora.ctssd
      2        ONLINE  OFFLINE
ora.diskmon
      2        OFFLINE OFFLINE
ora.evmd
      2        ONLINE  OFFLINE
ora.gipcd
      2        ONLINE  ONLINE       fadb02
ora.gpnpd
      2        ONLINE  ONLINE       fadb02
ora.mdnsd
      2        ONLINE  ONLINE       fadb02
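The cssd resource is stuck in the STARTING state: because the voting files cannot be found, the cssd process cannot finish starting.
4. Storage check
Node 2: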
$ls -l /dev/emcpower*
No output is returned; node 2 does not see the storage devices.
Node 1:
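The ocssd trace reports "Execute glob on the string /dev/emcpower" followed by "Successful discovery of 0 disks", which is the same result the `ls` check gives. A minimal sketch of that check as a reusable shell function is shown here. The function name `count_glob_matches` is hypothetical, and it uses a plain existence test rather than a block-device test, so treat it as an illustration and not as what CSSD does internally.

```shell
# Hypothetical helper: count the paths that a discovery glob expands to.
# Prints 0 when nothing matches.
count_glob_matches() {
    pattern=$1
    count=0
    # $pattern is left unquoted on purpose so the shell expands the glob.
    for path in $pattern; do
        # An unmatched glob stays as the literal pattern; -e filters it out.
        if [ -e "$path" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

For example, `count_glob_matches '/dev/emcpower*'` can be run on each node and the two counts compared.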
$ls -l /dev/emcpower*
/dev/emcpowera
/dev/emcpowerb
/dev/emcpowerc
/dev/emcpowere
The storage devices on node 1 are present as expected.
5. Check the storage on node 2
$powermt display dev=all
initialization error
$rpm -qa | grep EMCpower
EMCpower.LINUX-7.0.0.00.00-064.suse12sp4.x86_64
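After the operating system on node 2 was upgraded to SUSE 12 SP5, the EMCpower package was still the SP4 build. The preliminary conclusion was that EMCpower was incompatible with the operating system, which is why the EMC storage could not be recognized.
6. Install the matching EMC software
DellEMCPower.LINUX-7.1.0.00.00-075.SLES12SP5.x86_64.rpm was downloaded and installed. Node 2 then recognized the storage, and CRS started normally after being restarted.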
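The fix can be sketched as a short command sequence. The article only states that the package was installed and CRS was restarted; the exact order, the use of `rpm -Uvh` for the upgrade, the `PowerPath.service` unit name, and the `run` and `fix_powerpath_and_restart_crs` helper names are assumptions for illustration. The sketch only prints the commands by default; set `DRY_RUN=0` and run it as root to execute them.

```shell
# Hypothetical wrapper: print each command unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumed sequence, not taken verbatim from the article.
fix_powerpath_and_restart_crs() {
    rpm_file=$1
    run crsctl stop crs -f                  # stop the half-started stack on this node
    run rpm -Uvh "$rpm_file"                # upgrade to the SP5 build of PowerPath
    run systemctl start PowerPath.service   # assumed unit name; check the vendor guide
    run powermt display dev=all             # should now list devices, not an error
    run crsctl start crs                    # restart Clusterware on this node
}
```

A dry run such as `fix_powerpath_and_restart_crs DellEMCPower.LINUX-7.1.0.00.00-075.SLES12SP5.x86_64.rpm` lets the sequence be reviewed before anything is changed on the node.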
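Summary
Any change to the base environment of a production database must be tested rigorously before it is implemented. For operations such as OS and database upgrades, consider the compatibility of the other software installed on the server.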
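One way to act on this before a service pack upgrade is to scan the installed packages for names that embed a service pack tag different from the target, as the PowerPath package did here (`suse12sp4` against an SP5 system). The sketch below is a rough heuristic under that assumption: the function names `pkg_sp` and `find_sp_mismatches` are hypothetical, and only packages whose names carry a `suseNNspN` or `SLESNNSPN` tag are detected.

```shell
# Hypothetical helper: print the service pack number embedded in a
# package name such as "...suse12sp4..." or "...SLES12SP5...".
# Prints nothing when the name carries no such tag.
pkg_sp() {
    printf '%s\n' "$1" \
        | grep -Eio '(suse|sles)[0-9]+sp[0-9]+' \
        | sed 's/.*[Ss][Pp]//'
}

# Read package names on stdin and print those whose embedded
# service pack differs from the target given as $1.
find_sp_mismatches() {
    target_sp=$1
    while IFS= read -r pkg; do
        sp=$(pkg_sp "$pkg")
        if [ -n "$sp" ] && [ "$sp" != "$target_sp" ]; then
            echo "$pkg"
        fi
    done
}
```

On a node about to move to SP5, `rpm -qa | find_sp_mismatches 5` would list the packages to check against the vendor's support matrix first.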
