[20250204]21c library cache mutex的深入探究7(_mutex_wait_scheme=2).txt

来源:这里教程网 时间:2026-03-03 21:33:19 作者:

[20250204]21c library cache mutex的深入探究7(_mutex_wait_scheme=2).txt --//探究library cache mutex阻塞导致的相关等待事件,分析_mutex_wait_scheme=2的情况的相关细节。 1.环境: SYS@book> @ver2 ============================== PORT_STRING                   : x86_64/Linux 2.4.xx VERSION                       : 21.0.0.0.0 BANNER                        : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production BANNER_FULL                   : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production Version 21.3.0.0.0 BANNER_LEGACY                 : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production CON_ID                        : 0 PL/SQL procedure successfully completed. SYS@book> @ hidez ^_mutex  NUM N_HEX NAME               DESCRIPTION       DEFAULT_VALUE SESSION_VALUE SYSTEM_VALUE ISSES ISSYS_MOD ---- ----- ------------------ ----------------- ------------- ------------- ------------ ----- --------- 3553   DE1 _mutex_wait_time   Mutex wait time   TRUE          1             1            FALSE IMMEDIATE 3554   DE2 _mutex_spin_count  Mutex spin count  TRUE          255           255          FALSE IMMEDIATE 3555   DE3 _mutex_wait_scheme Mutex wait scheme TRUE          2             2            FALSE IMMEDIATE --//缺省_mutex_wait_time=1,时间单位与_mutex_wait_scheme相关,_mutex_wait_scheme=2时时间单位是厘秒,而 --//_mutex_wait_scheme=0,1时,单位时毫秒。 --//_mutex_wait_scheme =2时,_mutex_wait_time>1时sleeps的时间会出现指数回退. --//缺省_mutex_wait_scheme =2. --//网上找了一段资料: * _mutex_spin_count (Integer) - This sets the number of times to spin before yielding/waiting. * _mutex_wait_scheme (Integer) - In 11.2 this controls which wait scheme to use. It can be set to one of the three wait schemes described above thus: _mutex_wait_scheme = 0                        – Always YIELD _mutex_wait_scheme = 1 & _mutex_wait_time = t – Always SLEEP for t milli-seconds _mutex_wait_scheme = 2 & _mutex_wait_time = t – EXP BACKOFF with maximum sleep (default) 2.测试前准备: SYS@book> oradebug setmypid Statement processed. SYS@book> oradebug dump library_cache 4 Statement processed. $ grep "^Bucket:" /u01/app/oracle/diag/rdbms/book/book/trace/book_ora_10770.trc | head -4 Bucket: #=0 Mutex=0x6cfa1400(1125281431552, 9, 0, 6) Bucket: #=5 Mutex=0x6cfa14f0(1125281431552, 4, 0, 6) Bucket: #=17 Mutex=0x6cfa1730(1125281431552, 8, 0, 6) Bucket: #=39 Mutex=0x6cfa1b50(1125281431552, 4, 0, 6) --//以上是前面测试的结果,直接找Bucket: #=0 Mutex=0x6cfa1400的语句测试。 --//该sql语句select /*+ 9 */ count(*) from dept where deptno = 93834;的bucket=0. --//Bucket: #=0 Mutex=0x6cfa1400 --//0x6cfa1400-0x10 = 0x6cfa13f0 3.测试: --//session 2: SYS@book>  @opeek 0x6cfa13f0 48 0 [06CFA13F0, 06CFA1420) = 6CFA13F0 00000000 6CFA13F0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 --//前面2个8字节等于0x6cfa13f0,说明没有对象使用该library cache mutex。 --//session 1: SCOTT@book01p> @ spid ============================== SID                           : 281 SERIAL#                       : 36437 PROCESS                       : 3397 SERVER                        : DEDICATED SPID                          : 3399 PID                           : 42 P_SERIAL#                     : 5 KILL_COMMAND                  : alter system kill session '281,36437' immediate; PL/SQL procedure successfully completed. SCOTT@book01p> alter session  set session_cached_cursors=0 ; Session altered. --//主要目的避免光标缓存,保证每次执行library cache mutex gets增加。 SCOTT@book01p> select /*+ 9 */ count(*) from dept where deptno = 93834;   COUNT(*) ----------          0 --//执行多次。 SCOTT@book01p> @ hash HASH_VALUE SQL_ID        CHILD_NUMBER KGL_BUCKET PLAN_HASH_VALUE HASH_HEX   SQL_EXEC_START      SQL_EXEC_ID ---------- ------------- ------------ ---------- --------------- ---------- ------------------- ----------- 3403546624 as3g00v5dw000            0          0      2236899148  cade0000  2025-02-04 14:49:53    16777216 --//session 2: SYS@book> @opeek 0x6cfa13f0 48 0 [06CFA13F0, 06CFA1420) = 611AFD80 00000000 611AFD80 00000000 00000000 00000000 00000014 00000000 00000000 00000000 611B9E10 00000000 SYS@book> @ sharepool/shp4z as3g00v5dw000 -1 SYS@book> @ pr ============================== HANDLE_TYPE                   : parent handle address KGLHDADR                      : 00000000611AFD80 KGLHDPAR                      : 00000000611AFD80 C40                           : select /*+ 9 */ count(*) from dept where KGLHDLMD                      : 1 KGLHDPMD                      : 0 KGLHDIVC                      : 0 KGLOBHD0                      : 00000000610A1760 KGLOBHD6                      : 00 KGLOBHS0                      : 4064 KGLOBHS6                      : 0 KGLOBT16                      : 0 N0_6_16                       : 4064 N20                           : 4064 KGLNAHSH                      : 3403546624 KGLOBT03                      : as3g00v5dw000 KGLOBT09                      : 65535 PL/SQL procedure successfully completed. --//父游标句柄地址00000000611AFD80已经写入mutex地址-0x10处。 SYS@book> @opeek 0x6cfa13f0 48 0 [06CFA13F0, 06CFA1420) = 611AFD80 00000000 611AFD80 00000000 00000000 00000000 00000018 00000000 00000000 00000000 611B9E10 00000000 --//可以发现实际上执行shp4z.sql脚本2次,导致gets增加4次,有时候又没有不确定.. 4.继续测试: --//由于目前版本21c不支持oradebug poke内存信息,采用gdb操作。 --//session 3: SYS@book> @ spid ============================== SID                           : 263 SERIAL#                       : 53123 PROCESS                       : 3404 SERVER                        : DEDICATED SPID                          : 3405 PID                           : 86 P_SERIAL#                     : 2 KILL_COMMAND                  : alter system kill session '263,53123' immediate; PL/SQL procedure successfully completed. --//window 1: $ rlgdb -f -p 3405 ... (gdb) x /8wx 0x6cfa1400 0x6cfa1400:     0x00000000      0x00000000      0x00000018      0x00000000 0x6cfa1410:     0x00000000      0x00000000      0x611b9e10      0x00000000 --//当前gets=0x18. (gdb) set *(int *)0x6cfa1404=0x7777 --//注意intel CPU的字节顺序,修改地址是0x6cfa1404。 (gdb) x /8wx 0x6cfa1400 0x6cfa1400:     0x00000000      0x00007777      0x00000018      0x00000000 0x6cfa1410:     0x00000000      0x00000000      0x611b9e10      0x00000000 --//windows 2,跟踪session 1进程: $ strace -Ttt -y -f -p 3399 2>&1 | tee  mutex0204.txt --//session 1: SCOTT@book01p> select /*+ 9 */ count(*) from dept where deptno = 93834; --//挂起,等3X秒。 --//window 1: (gdb) set *(int *)0x6cfa1404=0x0 SYS@book> @opeek 0x6cfa13f0 48 0 [06CFA13F0, 06CFA1420) = 611AFD80 00000000 611AFD80 00000000 00000000 00000000 00000019 000009A2 00000000 00000000 611B9E10 00000000 --//对比前面的gets从0x18->0x19. 增加1次,sleeps次数从0x0->0x9a2 = 2466,增加2466次。 --//session 1: SCOTT@book01p> select /*+ 9 */ count(*) from dept where deptno = 93834;   COUNT(*) ----------          0 --//执行完成。 5.分析: --//session 2: SYS@book> @ ashtop event,p1,p1raw,p2raw,p3raw 1=1 &5min     Total                                                                                                                                                                       Distinct Distinct    Distinct   Seconds     AAS %This   EVENT                            P1 P1RAW             P2RAW             P3RAW             FIRST_SEEN          LAST_SEEN           Execs Seen  Tstamps Execs Seen1 --------- ------- ------- ----------------------------- ----- ----------------- ----------------- ----------------- ------------------- ------------------- ---------- -------- -----------        67      .2   42% |                                   3                                                       2025-02-04 14:58:00 2025-02-04 15:02:34          1       67          67        48      .2   30% | library cache: bucket mutex X     0 0000000000000000  0000777700000000  000000000000003E  2025-02-04 14:59:32 2025-02-04 15:00:19          1       48          48        13      .0    8% |                                  20                                                       2025-02-04 14:57:47 2025-02-04 15:01:22          1       13          13        11      .0    7% |                                 100                                                       2025-02-04 14:57:43 2025-02-04 15:02:11          1       11          11         5      .0    3% | LGWR all worker groups            0 0000000000000000  0000000000000000  0000000000000000  2025-02-04 14:59:37 2025-02-04 15:00:26          1        5           5         4      .0    3% |                                 300                                                       2025-02-04 14:58:50 2025-02-04 14:59:15          1        4           4         3      .0    2% |                                   0                                                       2025-02-04 14:59:39 2025-02-04 15:00:24          1        3           3         2      .0    1% | control file parallel write       2 0000000000000002  0000000000000003  0000000000000002  2025-02-04 15:02:30 2025-02-04 15:02:36          1        2           2         1      .0    1% | direct path read                  1 0000000000000001  000000000000B880  0000000000000040  2025-02-04 15:02:32 2025-02-04 15:02:32          1        1           1         1      .0    1% | log file parallel write           1 0000000000000001  0000000000000001  0000000000000001  2025-02-04 14:59:09 2025-02-04 14:59:09          1        1           1         1      .0    1% | log file parallel write           1 0000000000000001  000000000000000F  0000000000000001  2025-02-04 14:59:37 2025-02-04 14:59:37          1        1           1         1      .0    1% |                                  63                                                       2025-02-04 14:59:05 2025-02-04 14:59:05          1        1           1         1      .0    1% |                                  84                                                       2025-02-04 15:02:23 2025-02-04 15:02:23          1        1           1         1      .0    1% |                                 291                                                       2025-02-04 14:59:24 2025-02-04 14:59:24          1        1           1         1      .0    1% |                               65536                                                       2025-02-04 15:00:54 2025-02-04 15:00:54          1        1           1 15 rows selected. --//P2RAW=0000777700000000. SYS@book> @ ev_namezpr "library cache: bucket mutex X" ============================== EVENT#                        : 379 EVENT_ID                      : 2601513493 NAME                          : library cache: bucket mutex X PARAMETER1                    : idn PARAMETER2                    : value PARAMETER3                    : where WAIT_CLASS_ID                 : 3875070507 WAIT_CLASS#                   : 4 WAIT_CLASS                    : Concurrency DISPLAY_NAME                  : library cache: bucket mutex X CON_ID                        : 0 PL/SQL procedure successfully completed. --//p1 hash=0等于bucket桶号,P2 对应mutex值,注意intel cpu的字节顺序问题,00007777在前面。p2 0x3e = 62 --//window 2: --//按ctrl+c中断strace跟踪。 $ egrep "sched_yield|getrusage" mutex0204.txt 14:59:31.787265 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 332819}, ru_stime={0, 27126}, ...}) = 0 <0.000063> 14:59:31.787491 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 332902}, ru_stime={0, 27133}, ...}) = 0 <0.000015> 14:59:31.787549 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 332911}, ru_stime={0, 27133}, ...}) = 0 <0.000013> 14:59:31.787671 sched_yield()           = 0 <0.000227> 14:59:31.788037 sched_yield()           = 0 <0.000062> 14:59:33.279533 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 333892}, ru_stime={0, 31204}, ...}) = 0 <0.000035> 14:59:35.291641 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 339971}, ru_stime={0, 32628}, ...}) = 0 <0.000179> 14:59:37.291983 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 344284}, ru_stime={0, 36240}, ...}) = 0 <0.000077> 14:59:39.285997 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 347436}, ru_stime={0, 38249}, ...}) = 0 <0.000076> 14:59:41.301019 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 350147}, ru_stime={0, 40320}, ...}) = 0 <0.000015> 14:59:43.312911 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 353924}, ru_stime={0, 45159}, ...}) = 0 <0.000015> 14:59:45.307157 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 360457}, ru_stime={0, 46785}, ...}) = 0 <0.000031> 14:59:47.310026 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 365011}, ru_stime={0, 49095}, ...}) = 0 <0.000033> 14:59:49.335822 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 367872}, ru_stime={0, 52553}, ...}) = 0 <0.000029> 14:59:51.329267 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 372229}, ru_stime={0, 53946}, ...}) = 0 <0.000028> 14:59:53.331265 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 374290}, ru_stime={0, 57499}, ...}) = 0 <0.000037> 14:59:55.336138 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 376766}, ru_stime={0, 59718}, ...}) = 0 <0.000081> 14:59:57.343453 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 378673}, ru_stime={0, 62024}, ...}) = 0 <0.000028> 14:59:59.340053 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 379916}, ru_stime={0, 63137}, ...}) = 0 <0.000028> 15:00:01.344914 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 384742}, ru_stime={0, 64305}, ...}) = 0 <0.000015> 15:00:03.361060 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 386351}, ru_stime={0, 68952}, ...}) = 0 <0.000044> 15:00:05.369793 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 388195}, ru_stime={0, 73472}, ...}) = 0 <0.000028> 15:00:07.365355 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 388906}, ru_stime={0, 79941}, ...}) = 0 <0.000062> 15:00:09.366864 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 391246}, ru_stime={0, 82596}, ...}) = 0 <0.000026> 15:00:11.368288 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 393473}, ru_stime={0, 86106}, ...}) = 0 <0.000251> 15:00:13.381993 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 394210}, ru_stime={0, 90971}, ...}) = 0 <0.000040> 15:00:15.390105 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 401516}, ru_stime={0, 92994}, ...}) = 0 <0.000028> 15:00:17.390237 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 403435}, ru_stime={0, 96736}, ...}) = 0 <0.000028> 15:00:19.390711 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 403794}, ru_stime={0, 99023}, ...}) = 0 <0.000232> 15:00:20.383125 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 404769}, ru_stime={0, 100092}, ...}) = 0 <0.000026> 15:00:20.384382 getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 404975}, ru_stime={0, 100143}, ...}) = 0 <0.000016> --//注意看时间间隔,开始sched_yield执行2次之后2秒调用1次getrusage。 $  awk '{print $2}' mutex0204.txt |egrep "semtimedop|getrusage|sched_yield|select" | uniq -c|head       3 getrusage(0x1       2 sched_yield()      83 semtimedop(98304,       1 getrusage(0x1     115 semtimedop(98304,       1 getrusage(0x1     123 semtimedop(98304,       1 getrusage(0x1      86 semtimedop(98304,       1 getrusage(0x1 --//可以看出规律,开始2次sched_yield,接着调用semtimedop许多次(每次0.01秒),2秒之后调用1次getrusage,如此重复。 $ grep "semtimedop" mutex0204.txt | head -4 14:59:31.788149 semtimedop(98304, {{46, -1, 0}}, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.020113> 14:59:31.808359 semtimedop(98304, {{46, -1, 0}}, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.009714> 14:59:31.818220 semtimedop(98304, {{46, -1, 0}}, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.010634> 14:59:31.828961 semtimedop(98304, {{46, -1, 0}}, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.017562> $ grep semtimedop mutex0204.txt | awk '{print $8}'| uniq -c    2464 10000000}) --//调用semtimedop 2464次,加上2次sched_yield(),sleeps合计2466次,与opeek看到的结果0x000009A2=2466一致。 --//{0, 10000000} 前面参数的时间单位是秒,后面参数的时间单位是纳秒。10000000/10^9 = .01,相当于10毫秒,与后面的执行时间一 --//致。感觉虚拟机器有点不稳定,以前我在11g下测试基本都是18X次。 SYS@book> @ mutexprofz id,loc "ts>=trunc(sysdate)+14/24+59/1440+30/86400" -- MutexProf by Tanel Poder (http://www.tanelpoder.com) -- Showing profile of top 20 sleeps... -- column info : id idn hash hash_value=>hash_value ts=>sleep_timestamp --               req=>requesting_session blk=>blocking_session val=>mutex_value maddr=>mutex_addr SUM_SLEEPS GETS_DIFF MUTEX_TYPE              ID GET_LOCATION              OBJECT_NAME ---------- --------- --------------- ---------- ------------------------- --------------------------------------------------------------------------------       2466           Library Cache            0 kglhdgn1  62              (name not found)          3           Cursor Parent   2276283790 kksLoadChild [KKSPRTLOC4] SELECT /*+ NO_STATEMENT_QUEUING RESULT_CACHE (SYSOBJ=TRUE) */                                                                           "OBJ#","CLASS_OID","JOB_FLAGS","RUN_TIME","PRIORITY","JOB_STATUS","RUNNING_INSTA                                                                           NCE","SCHEDULE_LIMIT","JOB_WEIGHT","INSTANCE_ID","AFFINITY","DATABASE_ROLE"                                                                           FROM "SYS"."SCHEDULER$_JOB_REFRESH" "SCHEDULER$_JOB_REFRESH" WHERE 1=1          1           Library Cache   2276283790 kglhdgn2 106              SELECT /*+ NO_STATEMENT_QUEUING RESULT_CACHE (SYSOBJ=TRUE) */                                                                           "OBJ#","CLASS_OID","JOB_FLAGS","RUN_TIME","PRIORITY","JOB_STATUS","RUNNING_INSTA                                                                           NCE","SCHEDULE_LIMIT","JOB_WEIGHT","INSTANCE_ID","AFFINITY","DATABASE_ROLE"                                                                           FROM "SYS"."SCHEDULER$_JOB_REFRESH" "SCHEDULER$_JOB_REFRESH" WHERE 1=1 --//记录的sleeps次数也是2466次。GET_LOCATION 对应的数字62 与前面ashtop 看到的p2 0x3e对应,注意id=0表示bucket桶号。 $ ./lookup.awk  kglhdgn1 (kglhdgn)1 : kernel generic library cache management object handle get handle, if it doesn't exist, create it ?? 4.小结: --//缺省_mutex_wait_time=1,_mutex_wait_scheme=2的情况下semtimedop的时间是10毫秒。 --//_mutex_wait_scheme=2模式下开始sched_yield 2次,接着调用semtimedop许多次(每次0.01秒),2秒之后调用1次getrusage,如此重复。 --//这次测试_mutex_wait_time=1的情况,单位厘秒.调用semtimedop每次0.01秒,当大于1时sleeps的时间会出现指数回退,另外写一篇 --//验证这样的情况.另外有机会测试_mutex_wait_scheme=0,1的情况。 --//另外关于_mutex_spin_count缺省255,按照前面理解,在yielding/waiting之前应该存在255次spin,不知道理解是否正确,如何验 --//证不是很清楚. * _mutex_spin_count (Integer) - This sets the number of times to spin before yielding/waiting. --//我做了如下尝试,从时间上看不出来。但是可以通过top -p spid发现CPU使用增加。 SYS@book> alter system set "_mutex_spin_count"=65535 scope=memory; System altered.

相关推荐