一次his数据库无监听故障处理

来源:这里教程网 时间:2026-03-03 21:38:14 作者:

故障描述

8 4 号晚上 2 2 ,运维侧反馈有系统报错,提示无监听程序,需要我们帮忙查下原因,这是一套 RAC 数据库

登录到服务器上面 ,为了尽快恢复实例,手工启动实例发现节点 1 crs 都已经 hang 住了,手工强制重启 crs ,之后启动数据库实例,实例可以正常启动。

问题详细诊断过程

检查数据库alert 日志 发现 ORA-07445 ORA-00600 [ 17147 ] 报错 Sun Aug 01 22:41:29 2021 Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x250E876, qertbRestoreFetchState2()+60] [flags: 0x0, count: 1] Errors in file /u01/app/oracle/diag/rdbms/zlemr/zlemr1/trace/zlemr1_ora_20043.trc  (incident=96939): ORA-07445: 鍑虹幇寮傚父閿欒 : 鏍稿績杞偍 [qertbRestoreFetchState2()+60] [SIGSEGV] [ADDR:0x0] [PC:0x250E876] [SI_KERNEL(general_protection)] [] Incident details in: /u01/app/oracle/diag/rdbms/zlemr/zlemr1/incident/incdir_96939/zlemr1_ora_20043_i96939.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Sun Aug 01 22:41:29 2021 Dumping diagnostic data in directory=[cdmp_20210801224129], requested by (instance=1, osid=20043), summary=[incident=96939]. Sun Aug 01 22:41:30 2021 Sweep [inc][96939]: completed Sweep [inc2][96939]: completed Sun Aug 01 22:42:08 2021 Errors in file /u01/app/oracle/diag/rdbms/zlemr/zlemr1/trace/zlemr1_pmon_22489.trc  (incident=96017): ORA-00600: internal error code, arguments: [17147], [0xB0BF733F8], [], [], [], [], [], [], [], [], [], [] Incident details in: /u01/app/oracle/diag/rdbms/zlemr/zlemr1/incident/incdir_96017/zlemr1_pmon_22489_i96017.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Errors in file /u01/app/oracle/diag/rdbms/zlemr/zlemr1/trace/zlemr1_pmon_22489.trc: ORA-00600: internal error code, arguments: [17147], [0xB0BF733F8], [], [], [], [], [], [], [], [], [], [] PMON (ospid: 22489): terminating the instance due to error 472 Sun Aug 01 22:42:11 2021 System state dump requested by (instance=1, osid=22489 (PMON)), summary=[abnormal instance termination]. System State dumped to trace file /u01/app/oracle/diag/rdbms/zlemr/zlemr1/trace/zlemr1_diag_22544_20210801224211.trc Instance terminated by PMON, pid = 22489 Wed Aug 04 22:52:02 2021

从上面的日志可以看出 数据库从8 1 号实例 1 就已经宕掉了 ,检查操作系统日志, crs 日志都没有发现其他报错信息。

查看对应的 trace 文件,可以判断是由于 plsql 程序执行了一个查询操作, sql 如下: select * from (Select Extractvalue(a.content,                      'zlxml//document/element[@title=" 住院号 "]/text()') as 住院号 ,        Extractvalue(a.content,                      'zlxml//document/element[@title=" 姓名 "]/text()') as 姓名 ,         extractvalue(b.detail,'root/ic_list/icode[@title=" 病人 ID"]') AS 病人 id,         extractvalue(b.detail,'root/ic_list/icode[@title=" 住院次数 "]') AS 主页 id,         replace(b_Doc_Base.f_get_content(a.actlog_id, ' 主诉 ', a.title),' ','') 主诉 ,       ----  length(b_Doc_Base.f_get_content(a.actlog_id, ' 主诉 ', a.title)) as aaa,         replace(b_Doc_Base.f_get_content(a.actlog_id, ' 现病史 ', a.title),' ','') 现病史 ,        a.title   From bz_doc_log a, bz_masters b,bz_act_log c   where b.id=c.master_id      --- and a.edit_time = (select max(edit_time) from bz_doc_log x where x.actlog_id = a.actlog_id and x.antetype_id = a.antetype_id)       and c.id=a.actlog_id        and a.title in (' XXXX ',' XXXX ',' XXXX ',' XXXX ',' XXXX ',' XXXX )       and a.edit_time > sysdate -1 )  D

检查trace 文件 zlemr1_pmon_22489_i96017.trc 发现 ,存在大量的 报错信息 信息如下    ………………………

Chunk        b0bf6f778 sz=     3072    free      "               "   Chunk        b0bf70378 sz=     4224    freeable  "SQLA^63c299a9  "  ds=0x107343770   Chunk        b0bf713f8 sz=     4096  NEXT NEXT CHUNK'S PREVIOUS POINTER NOT POINTING TO NEXT CHUNK        000000000   freeable  "SQLA^27cec08c  "  ds=0x106780760   Two suspect chunks will be dumped with their memory contents The first chunk (above) may have corrupted the size of the second chunk More likely is that the second chunk has corrupted the following chunk ……………

通过这个 trace 文件内容可以判断, plsql 程序执行查询语句,触发了共享池内存冲突,

问题定位 :通过官网对报错进行匹配,确认 遇到了 Bug 22243719 。这个 bug 补丁已经包含在比较早的数据库 P SU 里面 所以 怀疑这套库没打任何补丁,通知检查数据库补丁的情况,确实数据库没有打任何补丁。

解决办法和建议

1 . 避免使用 plsql 执行上面的语句

2. 打补丁,至少要打 11.2.0.4.161018 (Oct 2016) ,综合考虑风险等因素,打补丁 Patch:   22243719

最后经过客户确定,还是采用打补丁方式解决这个问题,当天晚上停应用,停库,打补丁。

补丁验证 [oracle@emrdb1 OPatch]$ ./opatch  lsinv Oracle Interim Patch Installer version 11.2.0.3.29 Copyright (c) 2021, Oracle Corporation.  All rights reserved. Oracle Home       : /u01/app/oracle/product/11.2.0/db_1 Central Inventory : /u01/app/oraInventory    from           : /u01/app/oracle/product/11.2.0/db_1/oraInst.loc OPatch version    : 11.2.0.3.29 OUI version       : 11.2.0.4.0 Log file location : /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch2021-08-05_22-29-58PM_1.log   Lsinventory Output file location : /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/lsinv/lsinventory2021-08-31_15-50-26PM.txt ------------------------------------------------------------------------- Local Machine Information:: Hostname: emrdb1 ARU platform id: 226 ARU platform description:: Linux x86-64 Installed Top-level Products (1): Oracle Database 11g                                                  11.2.0.4.0 There are 1 products installed in this Oracle Home. Interim patches (1) :   Patch  22243719     : applied on Thu Aug 05 22:27:56 CST 2021 Unique Patch ID:  19972949    Created on 12 Mar 2016, 04:02:16 hrs PST8PDT    Bugs fixed:      22243719 ------------------------------------------------------------------------- OPatch succeeded.

相关推荐