[20180928]避免表达式在sql语句中.txt

来源:这里教程网 时间:2026-03-03 12:02:21 作者:

[20180928]避免表达式在sql语句中.txt --//在sql语句中避免表达式是很困难的,但是链接https://blog.jooq.org/2016/11/01/why-you-should-avoid-expressions-in-sql-predicates/ --//提到这么大的差异,我个人还是非常怀疑的,还是以自己的测试为准: 1.环境与建立测试: SCOTT@test> @ &r/ver1 PORT_STRING                    VERSION        BANNER ------------------------------ -------------- ---------------------------------------------------------------- x86_64/Linux 2.4.xx            10.2.0.4.0     Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi CREATE TABLE payment (   id            NOT NULL PRIMARY KEY,   payment_date  NOT NULL,   text ) AS SELECT   level,   SYSDATE - dbms_random.value(1, 500),   LPAD ('a', 500, 'a') --    dbms_random.string('a', 500) FROM dual CONNECT BY level <= 50000 ORDER BY dbms_random.value; --//执行有点慢,可以修改dbms_random.string('a', 500)=>lpad('a',500,'a'),这样可以快一些,不影响测试结果.   --//CREATE INDEX i_payment_date ON payment(payment_date); --//先不建立索引,实际上索引没用,查询范围很大.因为作者使用ORDER BY dbms_random.value,不按日期排序. --//走全表扫描更能看出问题在那里,因为这样每行都要比较.   EXEC dbms_stats.gather_table_stats('SCOTT', 'PAYMENT'); 2.建立测试脚本: $ cat expr.txt SET SERVEROUTPUT ON DECLARE   v_ts TIMESTAMP WITH TIME ZONE;   v_repeat CONSTANT NUMBER := 500;   v_range CONSTANT NUMBER := 470;  -- v_date CONSTANT DATE := SYSDATE - v_range;   v_date  DATE ; BEGIN   v_ts := SYSTIMESTAMP;   -- Original query with inline expression   FOR i IN 1..v_repeat LOOP     FOR rec IN (       SELECT *       FROM payment       WHERE payment_date < SYSDATE - v_range     ) LOOP       NULL;     END LOOP;   END LOOP;   dbms_output.put_line('Statement 1 : ' || (SYSTIMESTAMP - v_ts));   v_ts := SYSTIMESTAMP;   -- Pre-calculated PL/SQL local variable   FOR i IN 1..v_repeat LOOP     v_date  := SYSDATE - v_range;     FOR rec IN (       SELECT *       FROM payment       WHERE payment_date < v_date     ) LOOP       NULL;     END LOOP;   END LOOP;   dbms_output.put_line('Statement 2 : ' || (SYSTIMESTAMP - v_ts));   v_ts := SYSTIMESTAMP;   -- Magical 11g scalar subquery caching   FOR i IN 1..v_repeat LOOP     FOR rec IN (       SELECT *       FROM payment       WHERE payment_date < (SELECT SYSDATE - v_range FROM dual)     ) LOOP       NULL;     END LOOP;   END LOOP;   dbms_output.put_line('Statement 3 : ' || (SYSTIMESTAMP - v_ts)); END; / --//测试前说明一下,v_repeat原来定义100,我修改500.这样测试差距明显一点点. --//注:我修改计算v_date应该重复500次.这样测试公平一些. 3.测试: --//第1次执行使用表达式: SELECT * FROM payment WHERE payment_date < SYSDATE - v_range --//第2次执行使用变量.感觉这里有问题,v_date仅仅计算1次.我个人认为作为比较,应该每次要重新计算v_date. SELECT * FROM payment WHERE payment_date < v_date --//第3次执行使用标量子查询. SELECT * FROM payment WHERE payment_date < (SELECT SYSDATE - v_range FROM dual) --//执行计划应该都是一样的(我没有建立索引,仅仅标量子查询有一点点不同).测试结果: SCOTT@test> @expr.txt Statement 1 : +000000000 00:00:16.639160000 Statement 2 : +000000000 00:00:07.997651000 Statement 3 : +000000000 00:00:09.716406000 PL/SQL procedure successfully completed. SCOTT@test> @expr.txt Statement 1 : +000000000 00:00:17.143361000 Statement 2 : +000000000 00:00:08.381461000 Statement 3 : +000000000 00:00:10.362664000 PL/SQL procedure successfully completed. SCOTT@test> @expr.txt Statement 1 : +000000000 00:00:18.866096000 Statement 2 : +000000000 00:00:09.923500000 Statement 3 : +000000000 00:00:11.905165000 PL/SQL procedure successfully completed. --//没想到在10g下存在这么大的差异.可以发现使用表达式最慢,而标量子查询能缓存结果,这样看上去第2,第3执行时间相差感觉还是有点大. --//差别在于多执行500次的SELECT SYSDATE - v_range FROM dual. 感觉这里将近2秒的差距还是有点大. --//可以看出第2种直接使用变量快一些. --//不过并不像作者测试那样,使用标量子查询最快,估计三种方式可能有一些使用索引. --//建立索引重复测试: SCOTT@test> CREATE INDEX i_payment_date ON payment(payment_date); Index created. SCOTT@test> alter system flush shared_pool; System altered. SCOTT@test> @expr.txt Statement 1 : +000000000 00:00:15.888974000 Statement 2 : +000000000 00:00:07.082431000 Statement 3 : +000000000 00:00:04.654471000 PL/SQL procedure successfully completed. SCOTT@test> @expr.txt Statement 1 : +000000000 00:00:15.862751000 Statement 2 : +000000000 00:00:07.081248000 Statement 3 : +000000000 00:00:04.524595000 PL/SQL procedure successfully completed. SCOTT@test> @expr.txt Statement 1 : +000000000 00:00:15.854954000 Statement 2 : +000000000 00:00:07.104939000 Statement 3 : +000000000 00:00:04.487967000 PL/SQL procedure successfully completed. --//建立索引后结果与作者测试相近,标量子查询最快.实际上作者没有注意执行计划发生了变化: SCOTT@test> @ &r/dpc  bnf6jkmr2sh8c '' PLAN_TABLE_OUTPUT ------------------------------------- SQL_ID  bnf6jkmr2sh8c, child number 0 ------------------------------------- SELECT * FROM PAYMENT WHERE PAYMENT_DATE < SYSDATE - :B1 Plan hash value: 684176532 ------------------------------------------------------------------------------ | Id  | Operation         | Name    | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | ------------------------------------------------------------------------------ |*  1 |  TABLE ACCESS FULL| PAYMENT |   3002 |  1503K|   864   (1)| 00:00:11 | ------------------------------------------------------------------------------ SCOTT@test> @ &r/dpc 5tqgbsqwxsjtt '' PLAN_TABLE_OUTPUT ------------------------------------- SQL_ID  5tqgbsqwxsjtt, child number 0 ------------------------------------- SELECT * FROM PAYMENT WHERE PAYMENT_DATE < :B1 Plan hash value: 684176532 ------------------------------------------------------------------------------ | Id  | Operation         | Name    | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | ------------------------------------------------------------------------------ |*  1 |  TABLE ACCESS FULL| PAYMENT |   3002 |  1503K|   862   (1)| 00:00:11 | ------------------------------------------------------------------------------ SCOTT@test> @ &r/dpc bumyan15pbchp '' PLAN_TABLE_OUTPUT ------------------------------------- SQL_ID  bumyan15pbchp, child number 0 ------------------------------------- SELECT * FROM PAYMENT WHERE PAYMENT_DATE < (SELECT SYSDATE - :B1 FROM DUAL) Plan hash value: 2871123539 ----------------------------------------------------------------------------------------------- | Id  | Operation                   | Name           | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | ----------------------------------------------------------------------------------------------- |   1 |  TABLE ACCESS BY INDEX ROWID| PAYMENT        |   2500 |  1252K|   453   (0)| 00:00:06 | |*  2 |   INDEX RANGE SCAN          | I_PAYMENT_DATE |    450 |       |     3   (0)| 00:00:01 | |   3 |    FAST DUAL                |                |      1 |       |     2   (0)| 00:00:01 | ----------------------------------------------------------------------------------------------- --//450/50000 = .009,真不知道这个比率如何确定的. --//只有标量子查询的语句走了索引,问题在于估算的返回记录450,这样比较的次数减少许多,所以出现了使用标量子查询CPU消耗时间更 --//少,出现标量子查询更快的假象. --//不过给承认在10g下确实存在存在表达式执行比较慢的情况. 4.继续11g下测试: SCOTT@book> @ ver1 PORT_STRING                    VERSION        BANNER ------------------------------ -------------- -------------------------------------------------------------------------------- x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production SCOTT@book> alter session set "_serial_direct_read"=never; Session altered. --//避免采用直接路径读. --//重复测试脚本略,看前面: --//先不建立索引看看: SCOTT@book> @ expr.txt Statement 1 : +000000000 00:00:04.520582000 Statement 2 : +000000000 00:00:03.725714000 Statement 3 : +000000000 00:00:04.309029000 PL/SQL procedure successfully completed. SCOTT@book> @ expr.txt Statement 1 : +000000000 00:00:05.054053000 Statement 2 : +000000000 00:00:04.325762000 Statement 3 : +000000000 00:00:05.000122000 PL/SQL procedure successfully completed. SCOTT@book> @ expr.txt Statement 1 : +000000000 00:00:06.489944000 Statement 2 : +000000000 00:00:05.142213000 Statement 3 : +000000000 00:00:06.300671000 PL/SQL procedure successfully completed. --//还是可以发现使用表达式比较慢,不过这个需要量的累积. --//建立索引重复测试: CREATE INDEX i_payment_date ON payment(payment_date); Index created. SCOTT@book> CREATE INDEX i_payment_date ON payment(payment_date); Index created. --//退出再登录: SCOTT@book> alter system flush shared_pool; System altered. SCOTT@book> alter session set "_serial_direct_read"=never; Session altered. SCOTT@book> @ expr.txt Statement 1 : +000000000 00:00:05.703140000 Statement 2 : +000000000 00:00:04.711262000 Statement 3 : +000000000 00:00:05.617969000 PL/SQL procedure successfully completed. SCOTT@book> @ expr.txt Statement 1 : +000000000 00:00:05.700170000 Statement 2 : +000000000 00:00:04.738476000 Statement 3 : +000000000 00:00:05.615570000 PL/SQL procedure successfully completed. SCOTT@book> @ expr.txt Statement 1 : +000000000 00:00:05.719636000 Statement 2 : +000000000 00:00:04.709207000 Statement 3 : +000000000 00:00:05.611807000 PL/SQL procedure successfully completed. --//执行计划全部是全表扫描,不再贴出,不像10g那样出现标量子查询语句走索引的情况 5.测试11g使用提示使用索引的情况: $ cat expr1.txt SET SERVEROUTPUT ON DECLARE   v_ts TIMESTAMP WITH TIME ZONE;   v_repeat CONSTANT NUMBER := 500;   v_range CONSTANT NUMBER := 470; --  v_date CONSTANT DATE := SYSDATE - v_range;   v_date  DATE ; BEGIN   v_ts := SYSTIMESTAMP;   -- Original query with inline expression   FOR i IN 1..v_repeat LOOP     FOR rec IN (       SELECT /*+ index(payment i_payment_date) */ *       FROM payment       WHERE payment_date < SYSDATE - v_range     ) LOOP       NULL;     END LOOP;   END LOOP;   dbms_output.put_line('Statement 1 : ' || (SYSTIMESTAMP - v_ts));   v_ts := SYSTIMESTAMP;   -- Pre-calculated PL/SQL local variable   FOR i IN 1..v_repeat LOOP   v_date   := SYSDATE - v_range;     FOR rec IN (       SELECT /*+ index(payment i_payment_date) */ *       FROM payment       WHERE payment_date < v_date     ) LOOP       NULL;     END LOOP;   END LOOP;   dbms_output.put_line('Statement 2 : ' || (SYSTIMESTAMP - v_ts));   v_ts := SYSTIMESTAMP;   -- Magical 11g scalar subquery caching   FOR i IN 1..v_repeat LOOP     FOR rec IN (       SELECT /*+ index(payment i_payment_date) */ *       FROM payment       WHERE payment_date < (SELECT SYSDATE - v_range FROM dual)     ) LOOP       NULL;     END LOOP;   END LOOP;   dbms_output.put_line('Statement 3 : ' || (SYSTIMESTAMP - v_ts)); END; / SCOTT@book> alter system flush shared_pool; System altered. SCOTT@book> alter session set "_serial_direct_read"=never; Session altered. SCOTT@book> @ expr1.txt Statement 1 : +000000000 00:00:02.385992000 Statement 2 : +000000000 00:00:02.257972000 Statement 3 : +000000000 00:00:02.249990000 PL/SQL procedure successfully completed. SCOTT@book> @ expr1.txt Statement 1 : +000000000 00:00:02.197627000 Statement 2 : +000000000 00:00:02.002312000 Statement 3 : +000000000 00:00:01.987825000 PL/SQL procedure successfully completed. SCOTT@book> @ expr1.txt Statement 1 : +000000000 00:00:02.279598000 Statement 2 : +000000000 00:00:01.991588000 Statement 3 : +000000000 00:00:01.986305000 PL/SQL procedure successfully completed. --//你可以发现比较次数少量,3者差距很小. 6.在12c下测试如下: SCOTT@test01p> @ver1 PORT_STRING                    VERSION        BANNER                                                                               CON_ID ------------------------------ -------------- -------------------------------------------------------------------------------- ---------- IBMPC/WIN_NT64-9.1.0           12.1.0.1.0     Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0 SCOTT@test01p> @ expr.txt Statement 1 : +000000000 00:00:06.851000000 Statement 2 : +000000000 00:00:06.048000000 Statement 3 : +000000000 00:00:06.812000000 PL/SQL procedure successfully completed. SCOTT@test01p> @ expr.txt Statement 1 : +000000000 00:00:06.803000000 Statement 2 : +000000000 00:00:06.187000000 Statement 3 : +000000000 00:00:07.083000000 PL/SQL procedure successfully completed. --//注:建立不建立索引,在12c先全部是全表扫描. --//3者差距很小.不过使用表达式还是有点慢,不是很明显. 7.总结: 1.可以看出在sql语句出现表达式最慢的,重点还是在于比较的数量上.不管10g,11g,12c.特别注意的情况是10g差距明显.有2倍的差距. 2.500次,大约有1秒的差距(11g的情况,12c小一些),也就是1000/500=2ms,每次2ms的差距,如果语句大量累积还是很可观的. 3.另外作者10g下测试我猜测一定是标量子查询的语句使用索引,这样比较次数明显减少. 4.给人一种感觉走索引更快,当然我仅仅一个用户执行. 5.我自己还有1个疑问,我在10g,11g的测试环境硬件环境相似.全表扫描执行时间上为什么存在这么大的差异.

相关推荐