PostgreSQL执行计划变化

来源:这里教程网 时间:2026-03-14 20:35:55 作者:

PostgreSQL执行计划变化,当PostgreSQL使用绑定变量时,由于优化器无法预估表达式返回的行数,就默认选择统计信息收集行数的5%,评估的行数会远远低于实际执行时返回的行数。在表关联时,这种评估会对关联方式产生影响,驱动表返回行数少,优化器更倾向于使用nested loop关联方式,执行效率会下降。下面是测试案例:

 

创建测试关联表。

create table tym_p1 (

id varchar(32) not null default sys_guid()

, tym_no varchar(32) not null

, tym_date timestamp(0) not null

, primary key (id)

)

;

 

create index tym_p1_no on tym_p1 (tym_no);

 

创建测试关联子表。

create table tym_c1 (

id varchar(32) not null default sys_guid()

, tym_no varchar(32) not null

, tym_date timestamp(0) not null

, tym_user varchar(32) not null

, primary key (id)

)

;

 

create index tym_c1_no on tym_c1 (tym_no);

 

生成测试数据

for i in `seq 100000`; do psql -d phroc -c "insert into tym_p1 (tym_no,tym_date) values(to_char($i,'999999'),now())";done

 

insert into tym_c1 select a.*, tym_no from tym_p1 a;

 

准备语句

prepare tym_prep04 (

varchar(50),varchar(50)

) as

select * from tym_p1 p where

p.tym_date

BETWEEN to_timestamp($1, 'yyyy-MM-dd hh24:mi:ss')

AND to_timestamp($2, 'yyyy-MM-dd hh24:mi:ss')

;

 

explain analyze execute tym_prep04(

'2022-03-02 00:00:00','2022-03-09 23:59:59');

 

执行结果,第一次至第五次的执行计划,从执行计划预估行数是返回 99980 行,实际执行时 100000 行。

explain analyze execute tym_prep04(

'2022-03-02 00:00:00','2022-03-09 23:59:59');

                                                                                          QUERY PLAN                                                        

-------------------------------------------------------------------------------------------------------------------------------------------------------------

 Seq Scan on tym_p1 p  (cost=0.00..3031.00 rows= 99980 width=49) (actual time=0.011..148.967 rows=100000 loops=1)

   Filter: ((tym_date >= to_timestamp('2022-03-02 00:00:00'::text, 'yyyy-MM-dd hh24:mi:ss'::text)) AND (tym_date <= to_timestamp('2022-03-09 23:59:59'::text,

 Planning Time: 0.119 ms

 Execution Time: 152.085 ms

(4 rows)

 

执行结果,第六次的执行计划,,从执行计划预估行数是返回 500 行,实际执行时 100000 行。

explain analyze execute tym_prep04(

'2022-03-02 00:00:00','2022-03-09 23:59:59');

                                                                         QUERY PLAN                                                                         

-------------------------------------------------------------------------------------------------------------------------------------------------------------

 Seq Scan on tym_p1 p  (cost=0.00..3031.00 rows= 500 width=49) (actual time=0.041..151.032 rows=100000 loops=1)

   Filter: ((tym_date >= to_timestamp(($1)::text, 'yyyy-MM-dd hh24:mi:ss'::text)) AND (tym_date <= to_timestamp(($2)::text, 'yyyy-MM-dd hh24:mi:ss'::text)))

 Planning Time: 0.122 ms

 Execution Time: 154.185 ms

(4 rows)

 

如果是几张表关联时,这种评估会对连接方式产生影响。

prepare tym_prep05 (

varchar(50),varchar(50)

) as

select * from tym_p1 p, tym_c1 c where

p.tym_no = c.tym_no and

p.tym_date

BETWEEN to_timestamp($1, 'yyyy-MM-dd hh24:mi:ss')

AND to_timestamp($2, 'yyyy-MM-dd hh24:mi:ss')

;

 

explain analyze execute tym_prep05(

'2022-03-02 00:00:00','2022-03-09 23:59:59');

 

第一次到第五次执行,由于预估返回表tym_p1的行数是99980,优先采用了hash join的方式。

 

explain analyze execute tym_prep05(

'2022-03-02 00:00:00','2022-03-09 23:59:59');

                                                                                             QUERY PLAN                                                     

-------------------------------------------------------------------------------------------------------------------------------------------------------------

 Hash Join  (cost=3387.00..7792.72 rows=99980 width=106) (actual time=35.425..229.255 rows=100000 loops=1)

   Hash Cond: ((p.tym_no)::text = (c.tym_no)::text)

   ->  Seq Scan on tym_p1 p  (cost=0.00..3031.00 rows= 99980 width=49) (actual time=0.016..152.250 rows=100000 loops=1)

         Filter: ((tym_date >= to_timestamp('2022-03-02 00:00:00'::text, 'yyyy-MM-dd hh24:mi:ss'::text)) AND (tym_date <= to_timestamp('2022-03-09 23:59:59':

   ->  Hash  (cost=2137.00..2137.00 rows=100000 width=57) (actual time=34.740..34.741 rows=100000 loops=1)

         Buckets: 131072  Batches: 1  Memory Usage: 10399kB

         ->  Seq Scan on tym_c1 c  (cost=0.00..2137.00 rows=100000 width=57) (actual time=0.006..12.159 rows=100000 loops=1)

 Planning Time: 0.434 ms

 Execution Time: 234.067 ms

(9 rows)

 

Time: 234.852 ms

 

第六次执行,由于预估返回表tym_p1的行数是294,优先采用了nested loop的方式。

explain analyze execute tym_prep05(

'2022-03-02 00:00:00','2022-03-09 23:59:59');

                                                                               QUERY PLAN                                                                   

-------------------------------------------------------------------------------------------------------------------------------------------------------------

 Gather  (cost=1000.42..4474.63 rows=500 width=106) (actual time=0.509..538.023 rows=100000 loops=1)

   Workers Planned: 1

   Workers Launched: 1

   ->  Nested Loop  (cost=0.42..3424.63 rows=294 width=106) (actual time=0.086..496.033 rows=50000 loops=2)

         ->  Parallel Seq Scan on tym_p1 p  (cost=0.00..2207.47 rows= 294 width=49) (actual time=0.035..85.307 rows=50000 loops=2)

               Filter: ((tym_date >= to_timestamp(($1)::text, 'yyyy-MM-dd hh24:mi:ss'::text)) AND (tym_date <= to_timestamp(($2)::text, 'yyyy-MM-dd hh24:mi:s

         ->  Index Scan using tym_c1_no on tym_c1 c  (cost=0.42..4.13 rows=1 width=57) (actual time=0.008..0.008 rows=1 loops=100000)

               Index Cond: ((tym_no)::text = (p.tym_no)::text)

 Planning Time: 0.380 ms

 Execution Time: 543.929 ms

(10 rows)

解决方案:

    不使用绑定变量,优化器可以根据统计信息的直方图评估较为接近的行数,生成相对较优的执行计划。

    使用pg hint插件,通过插件固化执行计划。

相关推荐