[20240510]SQL语句存在问题与共享池内存分配.txt

来源:这里教程网 时间:2026-03-03 20:02:01 作者:

[20240510]SQL语句存在问题与共享池内存分配.txt --//五一前遇到的问题,生产系统应用程序升级,但是3个表忘记建立,而编写的程序可能存在问题,导致频繁调用这些根本不可能执行的sql --//语句.很奇怪的是应用前台根本不报错,真不知道开发如何写代码,难道没做例外处理以及记录吗? --//测试看看如果sql语句存在问题,是否消耗共享池以及parse的情况以及对性能相关问题. 1.环境: SCOTT@test01p> @ver1 PORT_STRING                    VERSION        BANNER                                                                               CON_ID ------------------------------ -------------- -------------------------------------------------------------------------------- ---------- IBMPC/WIN_NT64-9.1.0           12.2.0.1.0     Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production              0 2.测试: SCOTT@test01p> select sql_id from v$sql where sql_text='select count(1) from deptxxx'; no rows selected --//相关语句不在共享池,并且deptxxx表不存在. SCOTT@test01p> select count(1) from deptxxx; select count(1) from deptxxx                      * ERROR at line 1: ORA-00942: table or view does not exist --//表deptxxx不存在!! SCOTT@test01p> @ hash HASH_VALUE SQL_ID        CHILD_NUMBER KGL_BUCKET PLAN_HASH_VALUE HASH_HEX   SQL_EXEC_START      SQL_EXEC_ID ---------- ------------- ------------ ---------- --------------- ---------- ------------------- ----------- 3744802176 07v29cmgma9c0            0      75136       903671040  df352580  2024-05-10 22:36:28    16777216 SCOTT@test01p> @ sql_id 07v29cmgma9c0 --SQL_ID = 07v29cmgma9c0 select sql_id from v$sql where sql_text='select count(1) from deptxxx'; --//是前一次正确执行的sql_id,这样无法获得无法执行sql语句的sql_id. SCOTT@test01p> select sql_id from v$sqlarea where sql_text='select count(1) from deptxxx'; no rows selected SCOTT@test01p> select sql_id from v$sql where sql_text='select count(1) from deptxxx'; no rows selected --//查询v$sqlarea,v$sql视图根本不能发现对应sql_id. --//使用我写的脚本计算sql_id与hash_value: $ ./sql_idx.sh 'select count(1) from deptxxx' sql_text = select count(1) from deptxxx\0 full_hash_value(16) = AA0B8A5E997323CE2D65F9B7AF91ED4F hash_value(10) = 2945576271 sql_id(32) = 2utgtqyrt3vag sql_id(32) = 2utgtqyrt3vag sql_id(32) = 2utgtqyrt3vag SYS@test> @ sharepool/shp4x 2utgtqyrt3vag 0 TEXT                  KGLHDADR         KGLHDPAR         C40                          KGLHDLMD KGLHDPMD KGLHDIVC KGLOBHD0         KGLOBHD6 KGLOBHS0 KGLOBHS6 KGLOBT16 N0_6_16  N20   KGLNAHSH KGLOBT03      KGLOBT09 --------------------- ---------------- ---------------- ---------------------------- -------- -------- -------- ---------------- -------- -------- -------- -------- ------- ---- ---------- ------------- -------- child handle address  000007FF121630F0 000007FF16537BC8 select count(1) from deptxxx        0        0        3 00               00              0        0     3165    3165 3165 2945576271 2utgtqyrt3vag        0 parent handle address 000007FF16537BC8 000007FF16537BC8 select count(1) from deptxxx        1        0        3 000007FF12E82EA0 00           4072        0        0    4072 4072 2945576271 2utgtqyrt3vag    65535 --//可以发现这种情况一样建立父子光标,即使sql语句执行错误.不过子光标的堆0,堆6不存在罢了. --//注:可以查询v$db_object_cache视图,比如name(对应sql文本)里面的特征字符,不过如果是生产系统,我估计会很慢!! --//说明即使无法正确执行的sql语句也会消耗共享池内存. 3.看看parse的情况: --//session 1: SCOTT@test01p(265,16062)> @ spid SID SERIAL# PROCESS   SERVER    SPID PID  P_SERIAL# C50 --- ------- --------- --------- ---- --- ---------- -------------------------------------------------- 265   16062 9104:6104 DEDICATED 9516  59          8 alter system kill session '265,16062' immediate; --//session 2: SYS@test> @ ses2 265 "parse count" SID NAME                   VALUE --- ---------------------- ----- 265 parse count (total)      729 265 parse count (hard)       266 265 parse count (failures)    31 --//session 1: --//执行3次,输出略.前面parse count (failures)=31,说明我已经重复执行多次. select count(1) from deptxxx; --//session 2: SYS@test> @ ses2 265 "parse count" SID NAME                   VALUE --- ---------------------- ----- 265 parse count (total)      732 265 parse count (hard)       269 265 parse count (failures)    34 --//可以发现parse count (failures),parse count (hard),parse count (total)各自增加3次. --//可以想象,如果密集的执行这些sql语句将是一场"灾难",如果生产系统这些语句执行频繁,至少出现大量分析,并且因为hash_value一样, --//会在相同bucket上出现争用,出现大量library cache: mutex X,shared pool latch争用,但是不明白生产系统还会出现library --//cache lock. --//注:事后测试发现,情况并不是那样,没有出现大量shared pool latch争用,另外写blog分析. --//生产系统的当时的情况: > @ dashtop event "event like 'library cache%'" "timestamp'2024-04-26 08:00:57'" "timestamp'2024-04-26 10:30:00'"   Total Seconds     AAS %This   EVENT                  FIRST_SEEN          LAST_SEEN ------- ------- ------- ---------------------- ------------------- ------------------- 8164610   913.0   76%   library cache lock     2024-04-26 08:23:22 2024-04-26 09:17:44 2587860   289.4   24%   library cache: mutex X 2024-04-26 08:23:22 2024-04-26 09:17:44      10      .0    0%   library cache pin      2024-04-26 10:00:40 2024-04-26 10:00:40 --//问题出现在2024-04-26 08:23:22,在2024-04-26 09:17:44问题解决. --//同事分析缺乏经验,实际上知道sql_id,如果能知道sql语句,随便执行看看或者查看对象就可以定位问题. --//当然定位不是查询gv$sqlarea,gv$sql视图,而是查询x$kglob底层结构. 4.继续: --//session 1: SCOTT@test01p(265,16062)> select count(1) from; select count(1) from                    * ERROR at line 1: ORA-00903: invalid table name --//执行多次,看看这样不完整的sql语句的情况. $ ./sql_idx.sh 'select count(1) from' sql_text = select count(1) from\0 full_hash_value(16) = 17A4F71D1DD4E12BAB0FAF270B2672EB hash_value(10) = 187069163 sql_id(32) = aq3xg4w5kcwrb sql_id(32) = aq3xg4w5kcwrb sql_id(32) = aq3xg4w5kcwrb SYS@test> @ sharepool/shp4x aq3xg4w5kcwrb 0 TEXT                  KGLHDADR         KGLHDPAR         C40                  KGLHDLMD KGLHDPMD KGLHDIVC KGLOBHD0         KGLOBHD6 KGLOBHS0 KGLOBHS6 KGLOBT16 N0_6_16  N20  KGLNAHSH KGLOBT03      KGLOBT09 --------------------- ---------------- ---------------- -------------------- -------- -------- -------- ---------------- -------- -------- -------- -------- ------- ---- --------- ------------- -------- child handle address  000007FF008F2D88 000007FF12F38968 select count(1) from        0        0        4 00               00              0        0     3157    3157 3157 187069163 aq3xg4w5kcwrb        0 parent handle address 000007FF12F38968 000007FF12F38968 select count(1) from        1        0        4 000007FF12B33EE8 00           4072        0        0    4072 4072 187069163 aq3xg4w5kcwrb    65535 --//情况与上面测试类似. --//做这个测试主要原因在生产系统发现如下: --//主要原因在生产系统看到如下: > @ dashtop event,p1,p3 "event like 'library cache%'" "timestamp'2024-04-26 08:00:57'" "timestamp'2024-04-26 09:30:00'"   Total Seconds    AAS %This   EVENT                            P1              P3 FIRST_SEEN          LAST_SEEN ------- ------ ------- ---------------------- ------------ --------------- ------------------- ------------------- 8163900 1528.0   76%   library cache lock      81867324184         5373954 2024-04-26 08:23:22 2024-04-26 09:17:44 1545660  289.3   14%   library cache: mutex X   3802446058              82 2024-04-26 08:23:22 2024-04-26 09:17:44 1041590  194.9   10%   library cache: mutex X   3802446058             119 2024-04-26 08:23:33 2024-04-26 09:17:44 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~     700     .1    0%   library cache lock      81867324184         5373955 2024-04-26 08:23:22 2024-04-26 09:17:44     490     .1    0%   library cache: mutex X   3802446058             106 2024-04-26 08:25:31 2024-04-26 09:17:44      40     .0    0%   library cache: mutex X   3771887287               1 2024-04-26 09:17:11 2024-04-26 09:17:44      30     .0    0%   library cache: mutex X   3802446058             124 2024-04-26 08:46:19 2024-04-26 08:58:18      20     .0    0%   library cache: mutex X   3802446058              85 2024-04-26 09:17:11 2024-04-26 09:17:22      10     .0    0%   library cache lock      81864522800 400827822964738 2024-04-26 08:43:39 2024-04-26 08:43:39      10     .0    0%   library cache: mutex X        37799              49 2024-04-26 08:36:47 2024-04-26 08:36:47      10     .0    0%   library cache: mutex X       106013              49 2024-04-26 08:47:04 2024-04-26 08:47:04      10     .0    0%   library cache: mutex X       121048              49 2024-04-26 08:36:37 2024-04-26 08:36:37 12 rows selected. > @ sharepool/shp4x 0 3802446058 > @ pr ============================== TEXT        : parent handle address KGLHDADR    : 00000013077E9500 KGLHDPAR    : 00000013077E9500 C40         : select count(1) from KGLHDLMD    : 0 KGLHDPMD    : 0 KGLHDIVC    : 0 KGLOBHD0    : 00 KGLOBHD6    : 00 KGLOBHS0    : 0 KGLOBHS6    : 0 KGLOBT16    : 0 N0_6_16     : 0 N20         : 0 KGLNAHSH    : 3802446058 ~~~~~~~~~~~~~~~~~~~~~~ KGLOBT03    :            --//没有值.   KGLOBT09    : 65535 PL/SQL procedure successfully completed. --//小心,在生产系统访问x$kglob可能存在严重性能问题,不行快速kill相关进程.包括下面测试使用的fchaz.sql脚本. --//我的查询可以使用索引估计问题不大!! --//参数1 sql_id 参数2 hash_value. > select replace(kglnaobj,chr(13),' ')  c100 ,length(kglnaobj)  n10 from x$kglob where KGLHDPAR=hextoraw('00000013077E9500'); C100                  N10 --------------------- --- select count(1) from   20 --//长度确实是20,但是KGLNAHSH=3802446058,完全与我的测试对不上,我的测试是KGLNAHSH=187069163.这个问题先放一放. 5.继续看看对象deptxxx: --//继续昨天的测试,在scottt用户下执行select count(1) from deptxxx;多次. SYS@test> select * from v$open_cursor where sid=263 and sql_text like '%count%'; no rows selected --//语句执行有问题,光标不会缓存. SYS@test> select * from v$db_object_cache where name = 'DEPTXXX'   2  @ prxx ============================== OWNER                         : SCOTT NAME                          : DEPTXXX DB_LINK                       : NAMESPACE                     : TABLE/PROCEDURE TYPE                          : CURSOR SHARABLE_MEM                  : 0 LOADS                         : 1 EXECUTIONS                    : 0 LOCKS                         : 0 PINS                          : 0 KEPT                          : NO CHILD_LATCH                   : 77291 INVALIDATIONS                 : 0 HASH_VALUE                    : 1772563947 LOCK_MODE                     : NONE PIN_MODE                      : NONE STATUS                        : UNKOWN TIMESTAMP                     : PREVIOUS_TIMESTAMP            : LOCKED_TOTAL                  : 4 PINNED_TOTAL                  : 4 PROPERTY                      : FULL_HASH_VALUE               : ef603aaa09d90710c71ba16b69a72deb CON_ID                        : 3 CON_NAME                      : TEST01P ADDR                          : 000007FF14929138 EDITION                       : ============================== OWNER                         : PUBLIC NAME                          : DEPTXXX DB_LINK                       : NAMESPACE                     : TABLE/PROCEDURE TYPE                          : CURSOR SHARABLE_MEM                  : 0 LOADS                         : 1 EXECUTIONS                    : 0 LOCKS                         : 0 PINS                          : 0 KEPT                          : NO CHILD_LATCH                   : 104220 INVALIDATIONS                 : 0 HASH_VALUE                    : 2989463324 LOCK_MODE                     : NONE PIN_MODE                      : NONE STATUS                        : UNKOWN TIMESTAMP                     : PREVIOUS_TIMESTAMP            : LOCKED_TOTAL                  : 4 PINNED_TOTAL                  : 4 PROPERTY                      : FULL_HASH_VALUE               : bee0db68379be71263a53e5fb22f971c CON_ID                        : 3 CON_NAME                      : TEST01P ADDR                          : 000007FEFFF8AF58 EDITION                       : PL/SQL procedure successfully completed. --//可以发现即使对象不存在,也会加载在共享池.一个owner=SCOTT,另外一个PUBLIC.oracle这样的目的是表示对象是否存在. SYS@test> @ sharepool/shp4x 0 1772563947 TEXT                  KGLHDADR         KGLHDPAR         C40      KGLHDLMD   KGLHDPMD   KGLHDIVC KGLOBHD0         KGLOBHD6           KGLOBHS0   KGLOBHS6   KGLOBT16   N0_6_16        N20   KGLNAHSH KGLOBT03        KGLOBT09 --------------------- ---------------- ---------------- ------- --------- ---------- ---------- ---------------- ---------------- ---------- ---------- ---------- --------- ---------- ---------- ------------- ---------- parent handle address 000007FF14929138 000007FF14929138 DEPTXXX         0          0          0 00               00                        0          0          0         0          0 1772563947                        0 SYS@test> @ sharepool/shp4x 0 2989463324 TEXT                  KGLHDADR         KGLHDPAR         C40      KGLHDLMD   KGLHDPMD   KGLHDIVC KGLOBHD0         KGLOBHD6           KGLOBHS0   KGLOBHS6   KGLOBT16   N0_6_16        N20   KGLNAHSH KGLOBT03        KGLOBT09 --------------------- ---------------- ---------------- ------- --------- ---------- ---------- ---------------- ---------------- ---------- ---------- ---------- --------- ---------- ---------- ------------- ---------- parent handle address 000007FEFFF8AF58 000007FEFFF8AF58 DEPTXXX         0          0          0 00               00                        0          0          0         0          0 2989463324                        0 --//实际上v$db_object_cache视图来源就是x$kglob,v$db_object_cache.addr=KGLHDADR. --//1772563947%2^17 = 77291  对应CHILD_LATCH = 77291 --//2989463324%2^17 = 104220 对应CHILD_LATCH = 104220 SYS@test> @ fchaz 000007FF14929138 LOC KSMCHPTR           KSMCHIDX   KSMCHDUR KSMCHCOM           KSMCHSIZ KSMCHCLS   KSMCHTYP KSMCHPAR         KSMCHPTR_END --- ---------------- ---------- ---------- ---------------- ---------- -------- ---------- ---------------- ----------------- SGA 000007FF14929108          1          1 KGLHD                   816 recr             80 00               000007FF14929437 SYS@test> @ fchaz 000007FEFFF8AF58 LOC KSMCHPTR           KSMCHIDX   KSMCHDUR KSMCHCOM           KSMCHSIZ KSMCHCLS   KSMCHTYP KSMCHPAR         KSMCHPTR_END --- ---------------- ---------- ---------- ---------------- ---------- -------- ---------- ---------------- ----------------- SGA 000007FEFFF8AF28          1          1 KGLHD                   816 recr             80 00               000007FEFFF8B257 --//2个对象消耗816字节. 6.结论: --//可以看出几个特点即使sql语句存在问题,oracle还是消耗一定的共享池内存,建立父子光标,仅仅子光标堆0,堆6不存在. --//并且每次执行都会执行产生1次硬分析. --//相关表也会加载到共享池中. --//下次测试环境下模拟生产系统大量执行这类sql语句时出现的情况. 7.附上shp4x.sql脚本: $ cat shp4x.sql column N0_6_16 format 99999999 SELECT /*+ USE_CONCAT(@"SEL$1" 8 OR_PREDICATES(1)) */ DECODE (kglhdadr,                kglhdpar, 'parent handle address',                'child handle address')     text,        kglhdadr,        kglhdpar,        substr(kglnaobj,1,40) c40,        KGLHDLMD,        KGLHDPMD,        kglhdivc,        kglobhd0,        kglobhd6,        kglobhs0,kglobhs6,kglobt16,        kglobhs0+kglobhs6+kglobt16 N0_6_16,        kglobhs0+kglobhs1+kglobhs2+kglobhs3+kglobhs4+kglobhs5+kglobhs6+kglobt16 N20,        kglnahsh,        kglobt03,        kglobt09     FROM x$kglob  WHERE kglobt03 = lower('&1') or KGLNAHSH= &2; --//fchaz.sql脚本来源tpt fcha.sql,我仅仅注解了里面的提示信息,增加一个显示字段KSMCHPTR_END. --//TO_CHAR(TO_NUMBER(KSMCHPTR,'XXXXXXXXXXXXXXXX') + KSMCHSIZ - 1,'FM0XXXXXXXXXXXXXXX') KSMCHPTR_END --//注:在生产系统执行要小心!! -- Copyright 2018 Tanel Poder. All rights reserved. More info at http://tanelpoder.com -- Licensed under the Apache License, Version 2.0. See LICENSE.txt for terms & conditions. -------------------------------------------------------------------------------- -- -- File name:   fcha.sql (Find CHunk Address) v0.2 -- Purpose:     Find in which heap (UGA, PGA or Shared Pool) a memory address resides --               -- Author:      Tanel Poder -- Copyright:   (c) http://blog.tanelpoder.com | @tanelpoder --               -- Usage:       @fchaz <addr_hex> -- --              @fchaz F6A14448  -- -- -- Other:       This would only report an UGA/PGA chunk address if it belongs --              to *your* process/session (x$ksmup and x$ksmpp do not see other --              session/process memory) --               -------------------------------------------------------------------------------- --prompt Find in which heap (UGA, PGA or Shared Pool) the memory address &1 resides... --prompt --prompt WARNING!!! This script will query X$KSMSP, which will cause heavy shared pool latch contention --prompt in systems under load and with large shared pool. This may even completely hang --prompt your instance until the query has finished! You probably do not want to run this in production! --prompt --pause  Press ENTER to continue, CTRL+C to cancel... select     'SGA' LOC,     KSMCHPTR,     KSMCHIDX,     KSMCHDUR,     KSMCHCOM,     KSMCHSIZ,     KSMCHCLS,     KSMCHTYP,     KSMCHPAR,     TO_CHAR(TO_NUMBER(KSMCHPTR,'XXXXXXXXXXXXXXXX') + KSMCHSIZ - 1,'FM0XXXXXXXXXXXXXXX') KSMCHPTR_END from     x$ksmsp where     to_number(substr('&1', instr(lower('&1'), 'x')+1) ,'XXXXXXXXXXXXXXXX')     between         to_number(ksmchptr,'XXXXXXXXXXXXXXXX')     and to_number(ksmchptr,'XXXXXXXXXXXXXXXX') + ksmchsiz - 1 union all select     'UGA',     KSMCHPTR,     null,     null,     KSMCHCOM,     KSMCHSIZ,     KSMCHCLS,     KSMCHTYP,     KSMCHPAR,     TO_CHAR(TO_NUMBER(KSMCHPTR,'XXXXXXXXXXXXXXXX') + KSMCHSIZ - 1,'FM0XXXXXXXXXXXXXXX') KSMCHPTR_END from     x$ksmup where     to_number(substr('&1', instr(lower('&1'), 'x')+1) ,'XXXXXXXXXXXXXXXX')     between         to_number(ksmchptr,'XXXXXXXXXXXXXXXX')     and to_number(ksmchptr,'XXXXXXXXXXXXXXXX') + ksmchsiz - 1 union all select     'PGA',     KSMCHPTR,     null,     null,     KSMCHCOM,     KSMCHSIZ,     KSMCHCLS,     KSMCHTYP,     KSMCHPAR,     TO_CHAR(TO_NUMBER(KSMCHPTR,'XXXXXXXXXXXXXXXX') + KSMCHSIZ - 1,'FM0XXXXXXXXXXXXXXX') KSMCHPTR_END from     x$ksmpp where     to_number(substr('&1', instr(lower('&1'), 'x')+1) ,'XXXXXXXXXXXXXXXX')     between         to_number(ksmchptr,'XXXXXXXXXXXXXXXX')     and to_number(ksmchptr,'XXXXXXXXXXXXXXXX') + ksmchsiz - 1 /

相关推荐