Troubleshooting a Database Migration to an Exadata X7

Source: 这里教程网  Date: 2026-03-03 13:58:52

Database version: Oracle 12.1.2, migrated to the same version.

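The X4 runs RHEL 5.10 and the X7 runs RHEL 7.4. The database parameter file from the X4 was copied to the X7 and modified, and the database started to the NOMOUNT stage normally. When an SPFILE was then created in an ASM disk group, the instance terminated abnormally. The alert log showed: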

ORACLE_BASE from environment = /u01/app/oracle

Wed Jul 10 15:13:08 2019

WARNING: unknown state for DB spfile location resource, Return Value: 3

The spfile name is ?/dbs/spfile@.ora

Wed Jul 10 15:13:15 2019

DSKM process appears to be hung. Initiating system state dump.

Wed Jul 10 15:13:15 2019

System state dump requested by (instance=1, osid=305061 (GEN0)), summary=[system state dump request (ksz_check_ds)].

System State dumped to trace file /u01/app/oracle/diag/rdbms/gncdb/gncdb1/trace/gncdb1_diag_305067_20190710151315.trc

Wed Jul 10 15:13:17 2019

Decreasing number of real time LMS from 3 to 0

Wed Jul 10 15:13:45 2019

Errors in file /u01/app/oracle/diag/rdbms/gncdb/gncdb1/trace/gncdb1_dskm_305077.trc:

ORA-56867: Cannot connect to Master Diskmon on pipe "default pipe"

ORA-27300: OS system dependent operation:connect failed with status: 2

ORA-27301: OS failure message: No such file or directory

ORA-27302: failure occurred at: skgznpcon6

Wed Jul 10 15:13:45 2019

USER (ospid: 305077): terminating the instance due to error 56867

Wed Jul 10 15:13:45 2019

System state dump requested by (instance=1, osid=305077 (DSKM)), summary=[abnormal instance termination].

System State dumped to trace file /u01/app/oracle/diag/rdbms/gncdb/gncdb1/trace/gncdb1_diag_305067_20190710151345.trc

Wed Jul 10 15:13:45 2019

Dumping diagnostic data in directory=[cdmp_20190710151345], requested by (instance=1, osid=305077 (DSKM)), summary=[abnormal instance termination].

Wed Jul 10 15:13:46 2019

Instance terminated by USER, pid = 305077

Wed Jul 10 15:15:43 2019

WARNING: unknown state for DB spfile location resource, Return Value: 3

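According to the alert log, the DSKM process hung, which triggered a system state dump. Oracle describes DSKM as follows: "This process is active only if Exadata Storage is used. DSKM performs operations related to Exadata I/O fencing and Exadata cell failure handling." The DSKM trace file shows that the process hit ORA-56867 because it could not connect to the Master Diskmon, and the alert log shows that DSKM (ospid 305077) then terminated the instance while DIAG wrote the state dumps.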
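The reading of the alert log above can be reproduced with a few grep calls. The following is a minimal sketch, not an Oracle tool: the function name summarize_alert_log and its output layout are made up for illustration, and it assumes the alert log is a plain text file like the excerpt above.

```shell
# Summarize an Oracle alert log: count the ORA- error codes it contains and
# show the line that reports which process terminated the instance.
# summarize_alert_log is a hypothetical helper name, not part of any Oracle kit.
summarize_alert_log() {
    local log_file="$1"
    if [ ! -r "$log_file" ]; then
        echo "cannot read: $log_file" >&2
        return 1
    fi
    echo "== ORA- errors (count, code) =="
    # Pull out every ORA-NNNNN code, then count how often each one occurs.
    grep -o 'ORA-[0-9]\{5\}' "$log_file" | sort | uniq -c | sort -rn || true
    echo "== instance termination =="
    # This line names the ospid of the process that killed the instance.
    grep 'terminating the instance due to error' "$log_file" || true
    return 0
}
```

Run against the alert log shown above, for example summarize_alert_log alert_gncdb1.log (the file name is an assumption), the termination line carries ospid 305077, which the surrounding lines identify as DSKM.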

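ORA-27300 and its companion errors are most often caused by operating-system resource limits. A check of system resources and the system messages log found no related errors, and because another database was already running on the same machine, a resource shortage could be ruled out.

Directories could be created in the ASM disk group, so the parameter file was copied into the disk group and a startup was attempted from it. The instance terminated again with the same errors as before.

A search on MOS for these errors turned up bug 29164963, which looked similar. The bug affects Exadata Storage Server Software 19, and the collected logs showed that this Exadata was running version 19, so it fell within the affected range. The system was adjusted according to the MOS note, as follows:

On database servers perform the following steps: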

1. Add the following lines to the tmpfiles.d(5) configuration file /usr/lib/tmpfiles.d/tmp.conf :

x /tmp/.oracle*

x /var/tmp/.oracle*

x /usr/tmp/.oracle*

2. Restart systemd-tmpfiles-clean.timer service by running the following command as the root user:

# systemctl restart systemd-tmpfiles-clean.timer

3. If the system has already been affected by one of the errors described in the Symptoms section above, then restart clusterware.

4. Review open Advanced Intrusion Detection Environment (AIDE) alerts.

The change to /usr/lib/tmpfiles.d/tmp.conf must be registered in the AIDE database so critical software alerts are not generated as a result of the change.  Before updating the AIDE database, review and resolve open AIDE alerts by running the following DBMCLI command:

DBMCLI> list alerthistory where alertDescription like '.*AIDE.*' and endTime = null;

For details about AIDE see Security Guide for Exadata Database Machine.

5. Update the AIDE database by running the following command as the root user:

# /opt/oracle.SupportTools/exadataAIDE -update  
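Step 1 above can be checked and applied idempotently, so that re-running it does not duplicate lines. The sketch below is only an illustration under stated assumptions: the helper names missing_excludes and ensure_excludes are hypothetical, the configuration path is passed in as an argument so the logic can be tried on a copy first, and writing to the real /usr/lib/tmpfiles.d/tmp.conf requires root. Steps 2 to 5 still have to be carried out afterwards.

```shell
# The three exclusion lines from step 1 of the MOS procedure.
ORACLE_TMP_EXCLUDES='x /tmp/.oracle*
x /var/tmp/.oracle*
x /usr/tmp/.oracle*'

# Print each exclusion line that is not yet present in the given file.
# (hypothetical helper; matches whole lines literally, so '*' is not a glob)
missing_excludes() {
    local conf="$1" line
    if [ ! -r "$conf" ]; then
        echo "cannot read: $conf" >&2
        return 1
    fi
    printf '%s\n' "$ORACLE_TMP_EXCLUDES" | while IFS= read -r line; do
        grep -qxF -- "$line" "$conf" || printf '%s\n' "$line"
    done
}

# Append only the missing lines; does nothing if all three are present.
ensure_excludes() {
    local conf="$1" missing
    missing=$(missing_excludes "$conf") || return 1
    if [ -n "$missing" ]; then
        # Make sure the file ends with a newline before appending.
        if [ -n "$(tail -c1 "$conf")" ]; then
            echo >> "$conf"
        fi
        printf '%s\n' "$missing" >> "$conf"
    fi
}
```

A cautious way to use it is to run missing_excludes /usr/lib/tmpfiles.d/tmp.conf first to see what would be added, and only then run ensure_excludes on the same path as root.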

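After these changes the CRS cluster was restarted, but the database that was already running on the system then failed to start, with the following errors:

ORA-00210: cannot open control file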

ORA-00202: error in writing '+RECODG/utsdb/controlfile/current.256.732754521'

ORA-17503: ksfdopn: 2 Failed to open file +RECODG/utsdb/controlfile/current.256.732754521

ORA-15001: diskgroup "RECODG" does not exist or is not mounted

ORA-15055: unable to connect to ASM instance

ORA-27140: attach to post/wait facility failed

ORA-27300: OS system dependent operation:invalid_euid failed with status: 1

ORA-27301: OS failure message: Not owner

ORA-27302: failure occurred at: skgpwinit5

ORA-27303: additional information: startup euid = 100 (grid), current euid = 101 (oracle)

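These errors usually point to wrong permissions on the oracle binary in the Grid home. The file's permissions were -rwxrwxr-x; after they were corrected and the CRS cluster was restarted, the database returned to normal.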

The fix was: chmod 6751 oracle
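Mode 6751 corresponds to -rwsr-s--x, that is, the setuid and setgid bits are set, whereas the -rwxrwxr-x seen here is mode 775 with neither bit. The following sketch compares a file's octal mode with the expected value before anything is changed. The function name check_binary_mode is hypothetical, and it assumes GNU stat (as shipped with Linux), which supports the -c '%a' format.

```shell
# Compare a file's octal permission mode with an expected value.
# check_binary_mode is a hypothetical helper; expected mode defaults to 6751.
# Returns 0 on match, 1 on mismatch, 2 if the file cannot be examined.
check_binary_mode() {
    local file="$1" expected="${2:-6751}" actual
    # GNU stat prints the mode in octal, e.g. 775 or 6751.
    actual=$(stat -c '%a' "$file") || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file has mode $actual"
    else
        echo "MISMATCH: $file has mode $actual, expected $expected"
        return 1
    fi
}
```

It could be called as check_binary_mode "$GRID_HOME/bin/oracle", where GRID_HOME standing for the Grid Infrastructure home is an assumption about the environment; a mismatch is the cue to apply the chmod shown above.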

With the existing database back up, the original task was resumed, and creating the SPFILE in the ASM disk group from the PFILE succeeded.

References

(EX50) Exadata 19.1 / Oracle Linux 7 systemd-tmpfiles cleanup can cause database startup/connection failure, or clusterware connection failure (Doc ID 2498572.1)

Startup Instance Failed with ORA-27140 ORA-27300 ORA-27301 ORA-27302 and ORA-27303 on skgpwinit6 (Doc ID 1274030.1)
