[20250908]模拟oradebug request completion等待事件.txt

来源:这里教程网 时间:2026-03-03 22:45:06 作者:

[20250908]模拟oradebug request completion等待事件.txt --//昨天看了2篇文章: https://nenadnoveljic.com/blog/oradebug-request-completion/ https://nenadnoveljic.com/blog/perilous-new-undocumented-tracing-feature-in-oracle-21c/ --//按照作者的说法: A performance problem with log switches can trigger database writers to generate debug information. This includes taking thousands of samples of /proc/PID/stat for the checkpoint process. Since there's a 1 ms sleep between each sample, it can take seconds to collect all the debugging information. This non-DBWR activity distracts database writers from their main work – writing dirty blocks to files. The consequences can be fatal – we observed a database grinded to a halt on several occasions. 日志交换机的性能问题可能会触发数据库编写器来生成调试信息。这包括为检查点进程获取数千个/proc/PID/stat样本。由于每个样本之 间有一个1 ms的睡眠时间,所以收集所有的调试信息可能需要几秒钟。这种非DBWR活动分散了数据库编写人员的主要工作内容——为文件 编写脏块。其后果可能是致命的——我们观察到一个数据库有好几次陷入瘫痪。 The collection of diagnostic information is instrumented by the wait event "oradebug request completion". If you see that the database writers are spending too much time in it, you can try increasing the undocumented parameter _log_switch_tracing_timeout to a much higher value, see my previous blog post. 诊断信息的收集由等待事件和警报请求完成完成。如果您看到数据库编写者在上面花费了太多时间,您可以尝试将未记录的参数 _log_switch_tracing_timeout增加到更高的值,请参阅我以前的博客文章。 --//dbwN进程太多,理论讲dbwN写入太慢会出现该等待。按照文中作者解析cktp取样出现1ms的sleep,而且次数太多,结果触发dbwN进程 --//大量调用堆栈转储到跟踪文件中。 SYS@book> @ hidez _log_switch_tracing_timeout SYS@book> @ pr ============================== NUM                           : 1944 N_HEX                         :   798 CON_ID                        : 0 NAME                          : _log_switch_tracing_timeout DESCRIPTION                   : dbwr to trace after logswitch timeout in secs DEFAULT_VALUE                 : TRUE SESSION_VALUE                 : 200 SYSTEM_VALUE                  : 200 ISSES_MODIFIABLE              : FALSE ISSYS_MODIFIABLE              : IMMEDIATE PL/SQL procedure successfully completed. --//200秒。 --//不过作者提到使用SIGUSR2: We previously established that the stack is printed upon a received signal. In particular, it was the SIGUSR2 signal. 我们之前确定了堆栈是在接收到的信号上打印的。特别是,它是SIGUSR2信号。 $ kill -l  1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL       5) SIGTRAP  6) SIGABRT      7) SIGBUS       8) SIGFPE       9) SIGKILL     10) SIGUSR1 11) SIGSEGV     12) SIGUSR2     13) SIGPIPE     14) SIGALRM     15) SIGTERM 16) SIGSTKFLT   17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP 21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU     25) SIGXFSZ 26) SIGVTALRM   27) SIGPROF     28) SIGWINCH    29) SIGIO       30) SIGPWR 31) SIGSYS      34) SIGRTMIN    35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3 38) SIGRTMIN+4  39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8 43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13 48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12 53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7 58) SIGRTMAX-6  59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2 63) SIGRTMAX-1  64) SIGRTMAX --//这样只要抓取信号SIGUSR2,另外一个会话对该进程发送oradebug short_stack,就应该能模拟该等待事件。 1.环境: SYS@book> @ ver2 ============================== PORT_STRING                   : x86_64/Linux 2.4.xx VERSION                       : 21.0.0.0.0 BANNER                        : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production BANNER_FULL                   : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production Version 21.3.0.0.0 BANNER_LEGACY                 : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production CON_ID                        : 0 PL/SQL procedure successfully completed. 2.测试: --//session 1: SYS@book> @ spid ============================== SID                           : 149 SERIAL#                       : 16566 PROCESS                       : 4428 SERVER                        : DEDICATED SPID                          : 4429 PID                           : 49 P_SERIAL#                     : 10 KILL_COMMAND                  : alter system kill session '149,16566' immediate; PL/SQL procedure successfully completed. --//window 1: $ gdb -f -p 4429 .... (gdb) handle SIGUSR2 Signal        Stop      Print   Pass to program Description SIGUSR2       Yes       Yes     Yes             User defined signal 2 (gdb) catch signal SIGUSR2 Catchpoint 1 (signal SIGUSR2) --//注:直接写catch signal 12 对应的SIGSYS,不知道为什么? (gdb) c Continuing. Catchpoint 1 (signal SIGUSR2), 0x00007f12237c9480 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:81 /usr/src/debug/glibc-2.17-c758a686/sysdeps/unix/syscall-template.S:81:3374:beg:0x7f12237c9480 --//session 2: SYS@book> oradebug setospid 4429 Oracle pid: 49, Unix process pid: 4429, NID: 4026531836, image: oracle@centtest (TNS V1-V3) SYS@book> oradebug short_stack --//挂起!! --//window 1: (gdb) c Continuing. --//session 2: SYS@book> oradebug short_stack ksedsts()+418<-ksdxcb()+2187<-sspuser()+213<-__sighandler()<-read()+14<-read()+85<-sntpread()+28<-ntpfprd()+126<-nsbasic_brc()+357<-nioq rc()+495<-opikndf2()+1078<-opitsk()+920<-opiino()+1008<-opiodr()+1256<-opidrv()+1067<-sou2o()+165<-opimai_real()+400<-ssthrdmain()+412<- main()+292<-__libc_start_main()+245 $ ./lookup.awk ksedsts ksedsts : kernel service error debug dump the call stack short stack --//session 3: SYS@book> @ ash/ashtop sid,event 1=1  "'2025-09-08 17:02:09'" sysdate     Total                                                                                                  Distinct Distinct    Distinct   Seconds     AAS %This          SID EVENT                       FIRST_SEEN          LAST_SEEN           Execs Seen  Tstamps Execs Seen1 --------- ------- ------- ---------- --------------------------- ------------------- ------------------- ---------- -------- -----------        38      .3   33% |        133                             2025-09-08 17:02:11 2025-09-08 17:04:24          1       38          38        37      .3   32% |        271 oradebug request completion 2025-09-08 17:02:09 2025-09-08 17:02:45          1       37          37         9      .1    8% |        257 log file parallel write     2025-09-08 17:03:00 2025-09-08 17:03:46          1        9           9         7      .1    6% |          1                             2025-09-08 17:02:34 2025-09-08 17:04:00          1        7           7         4      .0    4% |        137                             2025-09-08 17:03:05 2025-09-08 17:03:46          1        4           4         3      .0    3% |        385 control file parallel write 2025-09-08 17:02:54 2025-09-08 17:03:16          1        3           3         3      .0    3% |        388                             2025-09-08 17:02:15 2025-09-08 17:03:52          1        3           3         2      .0    2% |          4                             2025-09-08 17:03:36 2025-09-08 17:03:39          1        2           2         2      .0    2% |        132                             2025-09-08 17:03:44 2025-09-08 17:03:50          1        2           2         2      .0    2% |        398                             2025-09-08 17:03:12 2025-09-08 17:03:17          1        2           2         1      .0    1% |          3                             2025-09-08 17:03:31 2025-09-08 17:03:31          1        1           1         1      .0    1% |          5                             2025-09-08 17:04:24 2025-09-08 17:04:24          1        1           1         1      .0    1% |        130 db file async I/O submit    2025-09-08 17:03:34 2025-09-08 17:03:34          1        1           1         1      .0    1% |        257                             2025-09-08 17:03:05 2025-09-08 17:03:05          1        1           1         1      .0    1% |        384                             2025-09-08 17:03:36 2025-09-08 17:03:36          1        1           1         1      .0    1% |        385                             2025-09-08 17:03:19 2025-09-08 17:03:19          1        1           1         1      .0    1% |        387                             2025-09-08 17:03:03 2025-09-08 17:03:03          1        1           1 17 rows selected. --//基本模拟oradebug request completion等待事件。

相关推荐