1 在处理ORA-32701前,需要先了解以下数据库的hang管理器,参考Oracle的官方文档(Oracle Hang Manager (文档 ID 1534591.1))。内容如下,可以认为如果数据库发生hang,一般情况数据库会自动处理的,但有时由于情况复杂,不能处理: DETAILS Overview Hang Manager has existed since Oracle Database 10.2.0.1. The main goal of Hang Manager is to reliably detect and, if hang resolution is enabled, resolve hangs in a timely manner. Over various releases, Hang Manager has been enhanced along with the wait event infrastructure on which it relies. However, it is only in Oracle Database 11.2.0.2 that Hang Manager actually resolves any hangs by terminating sessions and/or processes. This remains the default operation in Oracle Database 12.1.0.1. Hang Manager is only active when Oracle RAC is enabled. You can determine whether an Oracle RAC database is used / enabled by querying the CLUSTER_DATABASE. If this parameter is set to "TRUE", Oracle RAC is enabled on this database. Terminology Cross Boundary Hang This is a hang which spans the ASM/DB boundary. In 12.1.0.1, it is possible for Hang Manager to detect hangs which span the boundary between a database and ASM or ASM and a database. Deadlock or Closed Chain A deadlock is a number of sessions making no progress which are in a closed chain or cycle. All sessions have a waiter and a blocker. The only way to break the chain is to cause one of the sessions to either progress or terminate. Hang or Open Chain From Hang Manager's perspective, a hang is a number of sessions making no progress which are in an open chain configuration. This is not a cycle or deadlock. An open chain or hang has a root or final blocker which is blocking all other sessions in the chain. It also contains a final waiter which is at the end of the chain and is blocked by all other sessions in the chain. Immediate Waiter The session in an open chain that is being blocked by the root of the hang. Quality of Service (QoS) Management Oracle Database QoS Management is an automated, policy-based product that monitors the workload requests for an entire system. Oracle Database QoS Management manages the resources that are shared across applications, and adjusts the system configuration to keep the applications running at the performance levels needed by your business. Root (also known as Victim or Final Blocker) This is the session in an open chain or hang that is blocking all other sessions in the chain. Self-Resolved Hang This is a hang or deadlock that was detected by Hang Manager but no longer exists. This could either be because the root or immediate waiter have progressed or either or both of them have been terminated but not by Hang Manager. Hang Manager in 12.1.0.1 Since Oracle Database 11.2.0.2, Hang Manager may resolve hangs it detects by terminating the root session or process. By default, Hang Manager will not terminate an instance or evict a node. Also, Hang Manager will not resolve all hangs that it detects. For example, hangs which involve a possible application issue are left to the user to determine the correct course of action. Also, if Hang Manager determines that the instance on which the root of a hang resides is experiencing high CPU or IO load, hang resolution will be delayed. This gives some time for the root to progress and the hang to resolve itself. Finally, Hang Manager currently does not resolve any hangs on ASM. Starting with Oracle Database 12.1.0.1 and if QoS is active on the cluster, Hang Manager will use additional information provided by QoS to determine if a hang should be ignored or resolved. If QoS would like Hang Manager to resolve a hang, hang resolution may occur more quickly than Hang Manager's normal detection and resolution times or hang resolution may be delayed. Previous to 12.1.0.1, hangs were detected only within the database or ASM cluster. However, if a session on ASM or the database was being blocked by a session on the other side of the ASM/Database boundary (a Cross Boundary Hang), there was no way to detect it. In Oracle Database 12.1.0.1, Hang Manager was enhanced to detect hangs which cross the ASM/Database boundary and outputs information to its various trace files so it is easier to troubleshoot these hangs. 同hang管理器相关的视图: V$ ViewsHang Manager exposes some information via various V$ views. These are now documented in the Oracle Database Reference. A summary of the information in these views is below. For a more detailed description of these views, please go to the appropriate section in the Oracle Database Reference. V$HANG_INFOThis view contains the active hangs and deadlocks detected by Hang Manager. It does not include any hangs which have been resolved by Hang Manager or have self-resolved. Only the latest 32 active hangs are in this view.The root or victim of the hang is included in the information for each active hang in the view. V$HANG_SESSION_INFOThis view contains the sessions in the main chain for all of the hangs included in the V$HANG_INFO view. Only the first 20 sessions of the main chain, including the root of the hang, for each hang are kept in this view. (G)V$HANG_STATISTICSThis view contains various Hang Manager statistics regarding hangs or deadlocks which it has detected. These statistics include: the number of hangs detected, the number of deadlocks detected, the number of hangs resolved by Hang Manager, the number of hangs ignored broken down by the reason for ignoring the hang, the number of hangs which self-resolved, etc. 2 数据库的告警,查看发生hang的相关信息 节点1: Sun Sep 27 00:31:23 2020 Errors in file /u01/app/oracle/diag/rdbms/test/test1/trace/test1_dia0_25054.trc (incident=920985): ORA-32701: Possible hangs up to hang ID=1 detected --hang ID=1 说明数据库自动诊断出Hang Incident details in: /u01/app/oracle/diag/rdbms/test/test1/incident/incdir_920985/test1_dia0_25054_i920985.trc DIA0 requesting termination of session sid:1244 with serial # 75 (ospid:88984) on instance 3 --请求终止实例3的会话,操作系统PID:88984 due to a LOCAL, HIGH confidence hang with ID=1. Hang Resolution Reason: Although the number of affected sessions did not justify automatic hang resolution initially, this previously ignored hang was automatically resolved. DIA0: Examine the alert log on instance 3 for session termination status of hang with ID=1. --发现实例3的会话阻塞住会话,检查实例3的alert日志 节点3: Sun Sep 27 00:31:23 2020 Errors in file /u01/app/oracle/diag/rdbms/test/test3/trace/test3_dia0_22856.trc (incident=880256): ORA-32701: Possible hangs up to hang ID=1 detected --hang ID=1 说明数据库自动诊断出Hang Incident details in: /u01/app/oracle/diag/rdbms/test/test3/incident/incdir_880256/test3_dia0_22856_i880256.trc DIA0 terminating blocker (ospid: 88984 sid: 1244 ser#: 75) of hang with ID = 1 --本节点Hang住了,故终止了此会话 requested by master DIA0 process on instance 1 --请求的进程来自于实例1 Hang Resolution Reason: Although the number of affected sessions did not justify automatic hang resolution initially, this previously ignored hang was automatically resolved. by terminating session sid:1244 with serial # 75 (ospid:88984) --终止了操作系统进程 3 首先根据hang ID=1 和 hang ID=2 查看如下视图,此视图会解释每个代码的含义: SYS@test3>select * from V$HANG_STATISTICS; STATISTIC# NAME VALUE ---------- --------------------------------------------- ---------- 0 number of deadlocks detected and ignored 0 1 number of hangs detected 0 2 number of local hangs 0 3 number of global hangs 0 4 number of transient hangs 0 5 hangs ignored due to high CPU on root's node 0 6 hangs ignored due to high IO on root's node 0 7 hangs ignored due to application contention 0 8 hangs ignored due to long running operations 0 9 hangs monitored due to archiving issues 0 10 hangs ignored due to archiving issues 0 11 hangs ignored, blocked by remote database 0 12 hangs ignored due to SQL parsing 0 13 hangs ignored due to dumping system state 0 14 hangs ignored, instance termination required 0 15 hangs ignored, only one active instance 0 16 number of explicitly resolved hangs 0 17 number of self-resolved hangs 0 18 total self-resolved hang time in seconds 0 19 minimum self-resolved hang time in seconds 0 20 maximum self-resolved hang time in seconds 0 21 number of HSC matched hangs 0 22 hangs resolved due to instance termination 0 23 rows selected. 4 分析数据库生成的trace文件,找出报错的根本原因 节点1 查看test1_dia0_25054_i920985.trc Dump file /u01/app/oracle/diag/rdbms/test/test1/incident/incdir_920985/test1_dia0_25054_i920985.trc Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP, Data Mining and Real Application Testing options ORACLE_HOME = /u01/app/oracle/product/11.2.0.4/db_1 System name: Linux Node name: h0783 Release: 2.6.32-642.el6.x86_64 Version: #1 SMP Wed Apr 13 00:51:26 EDT 2016 Machine: x86_64 Instance name: test1 Redo thread mounted by this instance: 1 Oracle process number: 10 Unix process pid: 25054, image: oracle@h0783 (DIA0) *** 2020-09-27 00:31:23.807 *** SESSION ID:(201.1) 2020-09-27 00:31:23.807 *** CLIENT ID:() 2020-09-27 00:31:23.807 *** SERVICE NAME:(SYS$BACKGROUND) 2020-09-27 00:31:23.807 *** MODULE NAME:() 2020-09-27 00:31:23.807 *** ACTION NAME:() 2020-09-27 00:31:23.807 Dump continued from file: /u01/app/oracle/diag/rdbms/test/test1/trace/test1_dia0_25054.trc ORA-32701: Possible hangs up to hang ID=1 detected ========= Dump for incident 920985 (ORA 32701) ======== ----- Beginning of Customized Incident Dump(s) ----- One or more resolvable hangs have been detected on your system. An attempt will be made to resolve these hangs. The stack included in this incident file is not the cause of the problem. It is merely pointing to the module that triggered the incident. The short stacks of the root or final blocked of the hang and its waiter can be found in the DIA0 trace files on their respective local instances. 'Resolvable Hangs' below indicate one or more hangs that were found and identify the final blocking session and instance on which they occurred. Since the current hang resolution state is 'PROCESS', any hangs requiring session or process termination will be automatically resolved. The following information will assist Oracle Support Services in further analysis of the root cause of the hang. *** 2020-09-27 00:31:23.809 Resolvable Hangs in the System Root Chain Total Hang Hang Hang Inst Root #hung #hung Hang Hang Resolution ID Type Status Num Sess Sess Sess Conf Span Action ----- ---- -------- ---- ----- ----- ----- ------ ------ ------------------- 1 HANG RSLNPEND 3 1244 2 2 HIGH LOCAL Terminate Process --此处说明Hang的会话在节点3上,会话的ID为1244 Hang Resolution Reason: Although the number of affected sessions did not justify automatic hang resolution initially, this previously ignored hang was automatically resolved. inst# SessId Ser# OSPID PrcNm Event ----- ------ ----- --------- ----- ----- 3 2562 353 28939 M001 enq: WF - contention 3 1244 75 88984 M000 not in wait --此处说明会话1244 等待是事件为 not in wait The incident file on instance 3 may contain information about session 1244 with serial number 75 and operating system process ID 88984. The incident file may contain a short stack and a process state dump. Victim Information Ignored HangID Inst# Sessid Ser Num OSPID Fatal BG Previous Hang Count ------ ----- ------ ------- --------- -------- ------------- ------- 1 3 1244 75 88984 N Existing Hang 1 *** 2020-09-27 00:31:23.809 Wait-For-Graphs collected at 09/27/20 00:25:31) --以下信息说明搜集的图形化阻塞的相关信息 =============================================================================== Non-intersecting chains: ------------------------------------------------------------------------------- Chain 1: --Chain 1 说明会话2562等待的事件为:'enq: WF - contention' ,被1244会话阻塞住, 1244会话的等待事件为 not int wait,但等待:event: 'gc current grant 2-way' ------------------------------------------------------------------------------- Oracle session identified by: { instance: 3 (test.test3) os id: 28939 process id: 512, oracle@h0785 (M001) session id: 2562 session serial #: 353 } is waiting for 'enq: WF - contention' with wait info: { p1: 'name|mode'=0x57460006 p2: '0'=0x38 p3: '0'=0x0 time in wait: 1 min 53 sec timeout after: never wait id: 73008 blocking: 0 sessions wait history: * time between current wait and wait #1: 0.003403 sec 1. event: 'db file sequential read' time waited: 0.000226 sec wait id: 73007 p1: 'file#'=0x2 p2: 'block#'=0x1eeff p3: 'blocks'=0x1 * time between wait #1 and #2: 0.000390 sec 2. event: 'db file sequential read' time waited: 0.000207 sec wait id: 73006 p1: 'file#'=0x2 p2: 'block#'=0xe178 p3: 'blocks'=0x1 * time between wait #2 and #3: 0.002593 sec 3. event: 'db file scattered read' time waited: 0.000276 sec wait id: 73005 p1: 'file#'=0x2 p2: 'block#'=0x1eefb p3: 'blocks'=0x2 } and is blocked by => Oracle session identified by: { instance: 3 (test.test3) os id: 88984 process id: 446, oracle@h0785 (M000) session id: 1244 session serial #: 75 } which is not in a wait: { last wait: 1165 min 7 sec ago blocking: 1 session wait history: 1. event: 'gc current grant 2-way' --此类等待事件为数据库从内存中读取数据块,发现没有, 然后从磁盘中读到内存中。故认为此等待事件正常。结合实际查询,可以认为2562的会话被1224会话阻塞住, 但由于1224会话正在从磁盘大量的读取数据块到内存中,导致的阻塞。经细查,2562会话当时操作的SQL为: delete from WRH$_SYSMETRIC_HISTORY tab where (:beg_snap <= tab.snap_id and tab.snap_id <= :end_snap and dbid = :dbid) and not exists (select 1 from WRM$_BASELINE b where (tab.dbid = b.dbid) and (tab.snap_id >= b.start_snap_id) and (tab.snap_id <= b.end_snap_id)) , 1224会话操作的SQL为: insert into wrh$_sql_bind_metadata ,两个会话都是操作AWR相关的表,故有可能导致阻塞。 time waited: 0.000321 sec wait id: 1878 p1: ''=0x2 p2: ''=0x2a1a6 p3: ''=0x2010001 * time between wait #1 and #2: 0.000204 sec 2. event: 'gc current grant 2-way' time waited: 0.000384 sec wait id: 1877 p1: ''=0x2 p2: ''=0x39862 p3: ''=0x2010001 * time between wait #2 and #3: 0.000523 sec 3. event: 'gc current block 3-way' time waited: 0.000692 sec wait id: 1876 p1: ''=0x2 p2: ''=0x29dea p3: ''=0x1 } Chain 1 Signature: <not in a wait><='enq: WF - contention' Chain 1 Signature Hash: 0xee4c6e02 ------------------------------------------------------------------------------- =============================================================================== Sessions in an involuntary wait or not in a wait: 查看实例3的test3_dia0_22856_i880256.trc Dump file /u01/app/oracle/diag/rdbms/test/test3/incident/incdir_880256/test3_dia0_22856_i880256.trc Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP, Data Mining and Real Application Testing options ORACLE_HOME = /u01/app/oracle/product/11.2.0.4/db_1 System name: Linux Node name: h0785 Release: 2.6.32-642.el6.x86_64 Version: #1 SMP Wed Apr 13 00:51:26 EDT 2016 Machine: x86_64 Instance name: test3 Redo thread mounted by this instance: 3 Oracle process number: 10 Unix process pid: 22856, image: oracle@h0785 (DIA0) *** 2020-09-27 00:31:23.982 *** SESSION ID:(201.1) 2020-09-27 00:31:23.982 *** CLIENT ID:() 2020-09-27 00:31:23.982 *** SERVICE NAME:(SYS$BACKGROUND) 2020-09-27 00:31:23.982 *** MODULE NAME:() 2020-09-27 00:31:23.982 *** ACTION NAME:() 2020-09-27 00:31:23.982 Dump continued from file: /u01/app/oracle/diag/rdbms/test/test3/trace/test3_dia0_22856.trc ORA-32701: Possible hangs up to hang ID=1 detected ========= Dump for incident 880256 (ORA 32701) ======== ----- Beginning of Customized Incident Dump(s) ----- There are resolvable hangs on your system. Hang Manger will attempt to resolve these hangs. Some information about these is output below. Complete information is available in an incident trace file on instance 1. If the hang is to be resolved by terminating the session or process that is the root or victim of the hang, additional information will be output on the local instance of that session or process. Below are the hangs for which resolution will be attempted. *** 2020-09-27 00:31:23.983 Resolvable Hangs in the System Root Chain Total Hang Hang Hang Inst Root #hung #hung Hang Hang Resolution ID Type Status Num Sess Sess Sess Conf Span Action ----- ---- -------- ---- ----- ----- ----- ------ ------ ------------------- 1 HANG RSLNPEND 3 1244 2 2 HIGH LOCAL Terminate Process Hang Resolution Reason: Although the number of affected sessions did not justify automatic hang resolution initially, this previously ignored hang was automatically resolved. inst# SessId Ser# OSPID PrcNm Event ----- ------ ----- --------- ----- ----- 3 2562 353 28939 M001 enq: WF - contention 3 1244 75 88984 M000 not in wait Dumping process info of pid[446.88984] (sid:1244, ser#:75) requested by master DIA0 process on instance 1. *** 2020-09-27 00:31:23.983 Process diagnostic dump for oracle@h0785 (M000), OS id=88984, pid: 446, proc_ser: 32, sid: 1244, sess_ser: 75 ------------------------------------------------------------------------------- os thread scheduling delay history: (sampling every 1.000000 secs) 0.000000 secs at [ 00:31:23 ] NOTE: scheduling delay has not been sampled for 0.647808 secs 0.000000 secs from [ 00:31:19 - 00:31:24 ], 5 sec avg 0.000000 secs from [ 00:30:24 - 00:31:24 ], 1 min avg 0.000000 secs from [ 00:26:24 - 00:31:24 ], 5 min avg loadavg : 3.04 3.01 2.86 Memory (Avail / Total) = 363436.51M / 516750.80M Swap (Avail / Total) = 65536.00M / 65536.00M F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD 0 R oracle 88984 1 99 80 0 - 315843 ? Sep26 ? 19:28:37 ora_m000_test3 Short stack dump: --此处为trace生成的二进制错误的堆键信息 ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-kghalf()+247<-kqlfbctc()+645<-kgligi()+31<-kglic_cbk()+345<-kglic0()+751<-kglic()+74 <-kqlfbct()+376<-qerfxFetch()+3072<-qerjotRowProc()+359<-qerhjInnerProbeHashTable()+491<-qerfxFetch()+1106<-rwsfcd()+103<-qerhjFetch()+621<-qerjotFetch()+2025 <-rwsfcd()+103<-qerltcFetch()+1223<-insexe()+691<-opiexe()+5632<-kpoal8()+2380<-opiodr()+917<-kpoodrc()+36<-rpiswu2()+1776<-kpoodr()+636<-upirtrc()+2436<-kpurcsc()+98 <-kpuexec()+10790<-OCIStmtExecute()+39<-kewrose_oci_stmt_exec()+55<-kewrgwxf1_gwrsql_exft_1()+317<-kewrgwxf_gwrsql_exft()+496<-kewrews_execute_wr_sql()+52 <-kewrftbs_flush_table_by_sql()+180<-kewrfabt_flush_attribute_table()+997<-kewrfsb_flush_sqlbind()+34<-kewrft_flush_table()+397<-kewrftec_flush_table_ehdlcx()+766 <-kewrftsq_flush_topsql()+768<-kewrft_flush_table()+397<-kewrftec_flush_table_ehdlcx()+766<-kewrfat_flush_all_tables()+1406<-kewrfsr_flush_snapshot_r()+191 <-kewrrfs_remote_flush_slave()+1169<-kebm_slave_main()+586<-ksvrdp()+1766<-opirip()+674<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201 <-__libc_start_main()+253 ------------------------------------------------------------------------------- Process diagnostic dump actual duration=0.340000 sec (max dump time=15.000000 sec) *** 2020-09-27 00:31:24.319 --以下为报错是执行想相关SQL,相关的表为 insert into wrh$_sql_bind_metadata current sql: insert into wrh$_sql_bind_metadata (snap_id, dbid, sql_id, name, position, dup_position, datatype, datatype_string, character_sid, precision, scale, max_length) SELECT /*+ ordered use_nl(bnd) index(bnd sql_id) */ :lah_snap_id, :dbid, bnd.sql_id, name, position, dup_position, datatype, dataty ---------------------------------------- SO: 0x1ec14370a8, type: 2, owner: (nil), flag: INIT/-/-/0x00 if: 0x3 c: 0x3 proc=0x1ec14370a8, name=process, file=ksu.h LINE:12721, pg=0 (process) Oracle pid:446, ser:32, calls cur/top: 0x1d32cfb3c8/0x1d33002b20 flags : (0x2) SYSTEM flags2: (0x30), flags3: (0x10) intr error: 0, call error: 0, sess error: 0, txn error 0 intr queue: empty ksudlp FALSE at location: 0 5 根据第二步的会话ID,查询在故障时间点执行的SQL,根据如下信息,可以确定在故障点2562会话确实被1244会话阻塞住: SYS@test3>select SESSION_ID,SESSION_SERIAL#,SAMPLE_TIME,WAIT_TIME,BLOCKING_SESSION,SQL_ID from dba_hist_active_sess_history where INSTANCE_NUMBER=3 and SESSION_ID=1244 and SAMPLE_TIME between to_timestamp('2020-09-27 00:20:00','yyyy-mm-dd HH24:MI:SS') and to_timestamp('2020-09-27 00:40:00','yyyy-mm-dd HH24:MI:SS') order by 3; SESSION_ID SESSION_SERIAL# SAMPLE_TIME WAIT_TIME BLOCKING_SESSION SQL_ID ---------- --------------- ------------------------------------ ---------- ---------------- ------------- 1244 75 27-SEP-20 12.20.08.580 AM 321 82h7w3w07k63z 1244 75 27-SEP-20 12.20.18.620 AM 321 82h7w3w07k63z 1244 75 27-SEP-20 12.20.28.640 AM 321 82h7w3w07k63z ........................... 1244 75 27-SEP-20 12.27.40.744 AM 321 82h7w3w07k63z 1244 75 27-SEP-20 12.27.50.854 AM 321 82h7w3w07k63z 1244 75 27-SEP-20 12.28.00.894 AM 321 82h7w3w07k63z 1244 75 27-SEP-20 12.28.10.931 AM 321 82h7w3w07k63z 1244 75 27-SEP-20 12.28.20.971 AM 321 82h7w3w07k63z 1244 75 27-SEP-20 12.28.31.021 AM 321 82h7w3w07k63z ........................... 1244 75 27-SEP-20 12.32.42.270 AM 321 82h7w3w07k63z 1244 75 27-SEP-20 12.32.52.370 AM 321 82h7w3w07k63z 77 rows selected. SYS@test3>select SESSION_ID,SESSION_SERIAL#,SAMPLE_TIME,WAIT_TIME,BLOCKING_SESSION,SQL_ID from dba_hist_active_sess_history where INSTANCE_NUMBER=3 and SESSION_ID=2562 and SAMPLE_TIME between to_timestamp('2020-09-27 00:20:00','yyyy-mm-dd HH24:MI:SS') and to_timestamp('2020-09-27 00:40:00','yyyy-mm-dd HH24:MI:SS') order by 3; SESSION_ID SESSION_SERIAL# SAMPLE_TIME WAIT_TIME BLOCKING_SESSION SQL_ID ---------- --------------- --------------------------------- ---------- ---------------- ------------- 2562 347 27-SEP-20 12.20.28.640 AM 3 0v9v22h2nuc0q 2562 353 27-SEP-20 12.22.29.241 AM 0 7gfhbhxrxx8wf 2562 353 27-SEP-20 12.22.39.291 AM 408 9v9n97qj8z1dg 2562 353 27-SEP-20 12.22.49.421 AM 0 bzscyq07w79ab 2562 353 27-SEP-20 12.22.59.451 AM 0 bzscyq07w79ab 2562 353 27-SEP-20 12.23.09.511 AM 0 bzscyq07w79ab 2562 353 27-SEP-20 12.23.19.551 AM 0 2whm2vvjb98k7 2562 353 27-SEP-20 12.23.29.581 AM 186 acc988uzvjmmt 2562 353 27-SEP-20 12.23.39.621 AM 0 1244 2562 353 27-SEP-20 12.23.49.731 AM 0 1244 2562 353 27-SEP-20 12.23.59.751 AM 0 1244 2562 353 27-SEP-20 12.24.09.810 AM 0 1244 ..................... 2562 353 27-SEP-20 12.32.32.240 AM 0 1244 2562 353 27-SEP-20 12.32.42.270 AM 0 1244 2562 353 27-SEP-20 12.32.52.370 AM 0 1244 2562 353 27-SEP-20 12.33.02.420 AM 149 18yuw9npdrxwc 2562 353 27-SEP-20 12.33.12.440 AM 0 9wygvu6cx2npy 2562 353 27-SEP-20 12.33.22.480 AM 0 9wygvu6cx2npy 2562 353 27-SEP-20 12.33.32.510 AM 0 gkbrnf9xhv80g 68 rows selected. acc988uzvjmmt 执行的SQL: delete from WRH$_SYSMETRIC_HISTORY tab where (:beg_snap <= tab.snap_id and tab.snap_id <= :end_snap and dbid = :dbid) and not exists (select 1 from WRM$_BASELINE b where (tab.dbid = b.dbid) and (tab.snap_id >= b.start_snap_id) and (tab.snap_id <= b.end_snap_id)) 6 查询数据库的AWR生成信息,发现在故障点不能生成AWR:如下 Listing the last 3 days of Completed Snapshots Snap Instance DB Name Snap Id Snap Started Level ------------ ------------ --------- ------------------ ----- test3 test 46313 26 Sep 2020 00:00 1 46314 26 Sep 2020 01:00 1 46315 26 Sep 2020 02:00 1 46316 26 Sep 2020 03:00 1 46317 26 Sep 2020 04:00 1 --26日的4点 直接到27日的7:00 46325 27 Sep 2020 07:00 1 46326 27 Sep 2020 08:00 1 46327 27 Sep 2020 09:00 1 46328 27 Sep 2020 10:00 1 46329 27 Sep 2020 11:00 1 46330 27 Sep 2020 12:00 1 46331 27 Sep 2020 13:00 1 46332 27 Sep 2020 14:00 1 7 查询x$KQLFBC, 发现等待2小时都无法出来结果,通过追踪,找到执行计划,发现此表由1.4亿(138*1024*1024)条数据, 故导致运行2小时无法查询出结果 SYS@test3> select count(*) from x$kqlfbc; Plan hash value: 4012607404 ---------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ---------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 18 (100)| | | 1 | SORT AGGREGATE | | 1 | | | | 2 | FIXED TABLE FULL| X$KQLFBC | 138K| 18 (100)| 00:00:01 | ---------------------------------------------------------------------- 14 rows selected. SYS@test3>select count(*) from x$ksmsp; COUNT(*) ---------- 3102652 8 由于ORA-32701故障发生在wrh$_sql_bind_metadata及 x$kqlfbc表上,是由于系统有大量的绑定变量引起的, 故查询如下参数,发现如下参数设置太小,需要调整。 SYS@test1>show parameter cursor NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ cursor_bind_capture_destination string memory+disk cursor_sharing string EXACT cursor_space_for_time boolean FALSE open_cursors integer 300 session_cached_cursors integer 500 PARAMETER VALUE USAGE ---------------------- ---------- ----- session_cached_cursors 500 258% open_cursors 300 432% HIGHEST_OPEN_CUR MAX_OPEN_CUR ---------------- -------------------- 246 300 PARAMETER VALUE USAGE -------------------------------------------------------- --------------- session_cached_cursors 500 212% open_cursors 300 356% HIGHEST_OPEN_CUR MAX_OPEN_CUR ---------------- -------------------- 236 300 PARAMETER VALUE USAGE ---------------------- ---------- ----- session_cached_cursors 500 249% open_cursors 300 418% 9 ORA-32701问题的解决方法,以下的解决方法,是根据所有的Oracle官方文档总结出来的, 需要根据实际情况采用不同的方法,一般情况下一种方法不行,就使用第二种,直到解决问题。 一般情况下重启数据库实例就可以临时的解决此问题,但生产环境一般是不允许重启实例的,故需要 使用不重启实例的方法解决。 9.1 搜集如下表的统计信息 exec dbms_stats.gather_table_stats('SYS', 'X$KEWRATTRNEW'); exec dbms_stats.gather_table_stats('SYS', 'X$KEWRTSQLPLAN'); exec dbms_stats.gather_table_stats('SYS', 'X$KEWRSQLIDTAB'); exec dbms_stats.gather_fixed_objects_stats; exec dbms_stats.gather_schema_stats('SYS'); 查询相关表的统计信息是否过期: select OWNER, TABLE_NAME, LAST_ANALYZED from dba_tab_statistics where table_name='X$KEWRATTRNEW'; select OWNER, TABLE_NAME, LAST_ANALYZED from dba_tab_statistics where table_name='X$KEWRTSQLPLAN'; select OWNER, TABLE_NAME, LAST_ANALYZED from dba_tab_statistics where table_name='X$KEWRSQLIDTAB'; 9.2 有时此表x$KQLFBC的数据量会特别大,会导致查询不出来或不能生成AWR,故需要通过设置隐含参数和刷共享池来解决 。 SYS@test1>select count(1) from x$KQLFBC; COUNT(1) ---------- 113714 select count(*) from X$KQLFBC; 刷共享池 alter system flush shared_pool; 设置隐含参数 alter system set "_awr_disabled_flush_tables"='wrh$_sql_bind_metadata'; 部署自动刷新shared的脚本 10 8,18 * * * /home/oracle/database_check_sxwy/scripts/flush_shared.sh #!/bin/bash export ORACLE_SID=test1 sqlplus / as sysdba <<EOF alter system flush shared_pool; exit; EOF 9.3 第三种禁用方法 先查询AWR Table Info (KEW layer)使用下面的SQL: select table_id_kewrtb, table_name_kewrtb from x$kewrtb order by 1; TABLE_ID_KEWRTB TABLE_NAME_KEWRTB --------------- ---------------------------------------------------------------- 0 WRM$_DATABASE_INSTANCE 1 WRM$_SNAPSHOT 2 WRM$_BASELINE 3 WRM$_WR_CONTROL 4 WRH$_STAT_NAME 。。。。。。。。。。 53 WRH$_TABLESPACE_SPACE_USAGE 54 WRH$_METRIC_NAME 55 WRH$_SYSMETRIC_SUMMARY 56 WRH$_SQL_BIND_METADATA 57 WRH$_JAVA_POOL_ADVICE 58 WRH$_THREAD 59 WRH$_SESSMETRIC_HISTORY 60 WRH$_OSSTAT 61 WRH$_OSSTAT_NAME 62 WRH$_SYS_TIME_MODEL 63 wri$_optstat_opr 64 WRH$_OPTIMIZER_ENV 65 WRM$_SNAP_ERROR 66 WRH$_SERVICE_WAIT_CLASS 67 WRH$_CR_BLOCK_SERVER 。。。。。。。。。。。 115 WRM$_WR_USAGE 116 WRH$_MVPARAMETER 117 WRH$_PERSISTENT_QMN_CACHE 118 WRH$_TABLESPACE 119 WRM$_SNAPSHOT_DETAILS 120 WRI$_SQLTEXT_REFCOUNT 121 rows selected. 从上面的查询中得到table_id用于下面的level 值,比如我们这里表wrh$_sql_bind_metadata对应的是56 alter session set events ‘immediate trace name awr_flush_table_off level 56’; 后期如果想打开flushing to awr 如果第二种方法禁用,打开使用 alter session set events ‘immediate trace name awr_flush_table_on level 56’ off; 9.4 第四种解决方法为数据库打相关的补丁。
ORA-32701错误原因分析及处理方法
来源:这里教程网
时间:2026-03-03 17:01:34
作者:
编辑推荐:
下一篇:
相关推荐
-
雷神推出 MIX PRO II 迷你主机:基于 Ultra 200H,玻璃上盖 + ARGB 灯效
2 月 9 日消息,雷神 (THUNDEROBOT) 现已宣布推出基于英
-
制造商 Musnap 推出彩色墨水屏电纸书 Ocean C:支持手写笔、第三方安卓应用
2 月 10 日消息,制造商 Musnap 现已在海外推出一款 Oce
热文推荐
- RAC11g搭建-centos7+openfiler+multipath+udev
- ORACLE 11g rac for linux升级到19c后台进程Space Manager:slave idle wait过多
- 荣耀手机反弹的法门
荣耀手机反弹的法门
26-03-03 - oracle19c安装 单实例 系统centos7 非cdb
oracle19c安装 单实例 系统centos7 非cdb
26-03-03 - 字节跳动再启音乐梦
字节跳动再启音乐梦
26-03-03 - 【SQL】Oracle SQL处理的流程
【SQL】Oracle SQL处理的流程
26-03-03 - 【SQL】Oracle SQL共享池检查
【SQL】Oracle SQL共享池检查
26-03-03 - Oracle 21C下载和安装
Oracle 21C下载和安装
26-03-03 - oracle11g安装 单实例 系统centos7
oracle11g安装 单实例 系统centos7
26-03-03 - Oracle 19c- 19.8应用32242453补丁
Oracle 19c- 19.8应用32242453补丁
26-03-03
