执行计划存储 如果有同样的SQL要执行很多遍,且每次都是同样的执行计划,每次都发生硬解析,则会消耗大量时间,类似于Oracle的存放执行计划的library cache,PG也有一个类似的plan_cache概念,但实际上,PG提供的是预备语言(preparedstatement),它实际是要求应用给这个语句进行标识,之后应用再通过这标识请求服务端执行,并且由应用负责回收。 对于无参的预备语句,在第一次在执行的时候就会生成执行计划,之后再执行则会使用这个执行计划。对于有参的预备语句,最优的执行计划会因为变量的实际值而不一样,因此,在PG里,前5次执行预备语句,每一次都产生新的执行计划,叫做custom plan,第6次再执行时,会生成一个不依赖于参数的执行计划并保存下来,叫做generic plan。之后每一次执行一个预备语句,也都会善生一个相对应的custom plan,如果generic plan小于之前执行过的custom plan的平均值的1.1倍,则使用generic plan,否则使用当前产生的对应的custom plan。
custom plan是指对于preapre语句,在执行execute的时候,把execute语句中的参数嵌套到语句之后生成的计划。 custom plan会根据execute语句中具体的参数生成计划,这种方案的优点是每次都按照具体的参数生成优选计划,执行性能比较好; 缺点是每次执行前都需要重新生成计划,存在大量的重复的优化器开销。 generic plan是指对于preapre语句生成计划,该计划策略会在执行execute语句的时候把参数bind到plan中,然后执行计划。 这种方案的优点是每次执行可以省去重复的优化器开销;缺点是当bind参数字段上数据存在倾斜时该计划可能不是最优的, 部分bind参数场景下执行性能较差。
可以根据pg_prepared_statements视图显示当前会话所有可用的预备语句
postgres=# \d pg_prepared_statements View "pg_catalog.pg_prepared_statements" Column | Type | Collation | Nullable | Default -----------------+--------------------------+-----------+----------+--------- name | text | | | statement | text | | | prepare_time | timestamp with time zone | | | parameter_types | regtype[] | | | from_sql | boolean | | |
plan_cache_mode参数可以影响prepare语句选择生成执行计划的策略
auto表示按照默认的方式选择custom plan或者generic plan force_generic_plan表示强制走generic plan force_custom_plan表示强制走custom plan
此参数只对prepare语句生效,一般用在prepare语句中参数化字段存在比较严重的数据倾斜的场景下
通常情况,我们可以通过 explain,explain analyze,explain verbose来获取执行计划。 但是explain查询的当前缓存的执行计划, 在实际中估算的成本可能是不准确的,因为很可能估算的成本和你实际运行的成本不一致。而,explain analyze,explain verbose则会实际执行sql,在某些场景不会允许。
(可以尝试采用开启一个事务后,explain analyze,explain verbose查看执行计划,最后rollback)
pg_show_plans模块
接下来的主题则是一个供PostgreSQL数据库查询当前执行中sql的执行计划的模块—pg_show_plans,它可以动态查找当前正在执行中的sql的执行计划。 pg_show_plans 是一个显示所有当前运行的SQL语句的查询计划的模块。它在plan结束位置, 截获并存储当前plan tree. 从而其他会话可以打印存储的plan tree。此模块支持从9.5到12的PostgreSQL版本。它会在共享内存上创建一个哈希表,以便临时存储查询计划。哈希表大小不能更改,因此如果哈希表已满,则不会存储计划。
安装及使用介绍
1.进到数据库对应的contrib目录下
[postgres@t1ysl opt]$ cd /opt/postgresql-12.1/contrib/
2.获取pg_show_plans扩展包
[postgres@t1ysl contrib]$ git clone https://github.com/cybertec-postgresql/pg_show_plans.git Cloning into 'pg_show_plans'... remote: Enumerating objects: 70, done. remote: Counting objects: 100% (2/2), done. remote: Compressing objects: 100% (2/2), done. remote: Total 70 (delta 0), reused 0 (delta 0), pack-reused 68 Unpacking objects: 100% (70/70), done.
3.编译安装
[postgres@t1ysl contrib]$ cd pg_show_plans/ [postgres@t1ysl pg_show_plans]$ make make -C ../../src/backend generated-headers make[1]: Entering directory `/opt/postgresql-12.1/src/backend' make -C catalog distprep generated-header-symlinks make[2]: Entering directory `/opt/postgresql-12.1/src/backend/catalog' make[2]: Nothing to be done for `distprep'. make[2]: Nothing to be done for `generated-header-symlinks'. make[2]: Leaving directory `/opt/postgresql-12.1/src/backend/catalog' make -C utils distprep generated-header-symlinks make[2]: Entering directory `/opt/postgresql-12.1/src/backend/utils' make[2]: Nothing to be done for `distprep'. make[2]: Nothing to be done for `generated-header-symlinks'. make[2]: Leaving directory `/opt/postgresql-12.1/src/backend/utils' make[1]: Leaving directory `/opt/postgresql-12.1/src/backend' gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I. -I../../src/include -D_GNU_SOURCE -c -o pg_show_plans.o pg_show_plans.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -shared -o pg_show_plans.so pg_show_plans.o -L../../src/port -L../../src/common -Wl,--as-needed -Wl,-rpath,'/opt/pg12/lib',--enable-new-dtags [postgres@t1ysl pg_show_plans]$ make install make -C ../../src/backend generated-headers make[1]: Entering directory `/opt/postgresql-12.1/src/backend' make -C catalog distprep generated-header-symlinks make[2]: Entering directory `/opt/postgresql-12.1/src/backend/catalog' make[2]: Nothing to be done for `distprep'. make[2]: Nothing to be done for `generated-header-symlinks'. make[2]: Leaving directory `/opt/postgresql-12.1/src/backend/catalog' make -C utils distprep generated-header-symlinks make[2]: Entering directory `/opt/postgresql-12.1/src/backend/utils' make[2]: Nothing to be done for `distprep'. make[2]: Nothing to be done for `generated-header-symlinks'. make[2]: Leaving directory `/opt/postgresql-12.1/src/backend/utils' make[1]: Leaving directory `/opt/postgresql-12.1/src/backend' /bin/mkdir -p '/opt/pg12/lib/postgresql' /bin/mkdir -p '/opt/pg12/share/postgresql/extension' /bin/mkdir -p '/opt/pg12/share/postgresql/extension' /bin/install -c -m 755 pg_show_plans.so '/opt/pg12/lib/postgresql/pg_show_plans.so' /bin/install -c -m 644 ./pg_show_plans.control '/opt/pg12/share/postgresql/extension/' /bin/install -c -m 644 ./pg_show_plans--1.0.sql '/opt/pg12/share/postgresql/extension/'
4.在postgresql.conf文件的shared_preload_libraries里增加pg_show_plans,并重启数据库生效
vi postgresql.conf 增加 shared_preload_libraries = 'pg_show_plans' [postgres@t1ysl ~]$ pg_ctl restart waiting for server to shut down.... done server stopped waiting for server to start....2021-07-25 08:52:08.402 CST [2990] LOG: starting PostgreSQL 12.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit 2021-07-25 08:52:08.402 CST [2990] LOG: listening on IPv4 address "0.0.0.0", port 6000 2021-07-25 08:52:08.402 CST [2990] LOG: listening on IPv6 address "::", port 6000 2021-07-25 08:52:08.406 CST [2990] LOG: listening on Unix socket "/tmp/.s.PGSQL.6000" 2021-07-25 08:52:08.434 CST [2990] LOG: redirecting log output to logging collector process 2021-07-25 08:52:08.434 CST [2990] HINT: Future log output will appear in directory "/opt/pg_log6000". done server started
5.创建EXTENSION
postgres=# CREATE EXTENSION pg_show_plans; CREATE EXTENSION
6.通过pg_show_plans表可查看当前正在执行中的sql的执行计划
postgres=# \d pg_show_plans View "public.pg_show_plans" Column | Type | Collation | Nullable | Default --------+--------+-----------+----------+--------- pid | bigint | | | level | bigint | | | userid | oid | | | dbid | oid | | | plan | text | | |
模拟使用场景 1.开启两个session 一个session执行一条较慢sql(便于获取到其执行计划) 一个session在sql执行过程获取其执行计划
2.这里我举例的sql为对346MB的一张表的全表扫描
session1: postgres=# \dt+ t1_ysl List of relations Schema | Name | Type | Owner | Size | Description --------+--------+-------+----------+--------+------------- public | t1_ysl | table | postgres | 346 MB | (1 row) postgres=# select * from t1_ysl ; id --------- 3511203 5877715 7284053 4522491 3815961 6454179 2712063 ...
通过pg_show_plans和pg_stat_activity联合查询出当前执行中sql的执行计划。
session2: postgres=# SELECT * FROM pg_show_plans; pid | level | userid | dbid | plan ------+-------+--------+-------+------------------------------------------------------------------ ----- 1812 | 0 | 10 | 13593 | Function Scan on pg_show_plans (cost=0.00..10.00 rows=1000 width =56) 1899 | 0 | 10 | 13593 | Seq Scan on t1_ysl (cost=0.00..144247.77 rows=9999977 width=4) (2 rows) postgres=# SELECT p.pid, p.level, p.plan, a.query FROM pg_show_plans p LEFT JOIN pg_stat_activity a ON p.pid = a.pid AND p.level = 0 ORDER BY p.pid, p.level; pid | level | plan | query ------+-------+----------------------------------------------------------------------------------- -------------+---------------------------------------------------------------- 1812 | 0 | Sort (cost=72.08..74.58 rows=1000 width=80) +| SELECT p.pid, p.level, p.plan, a.query FROM pg_show_plans p + | | Sort Key: pg_show_plans.pid, pg_show_plans.level +| LEFT JOIN pg_stat_activity a + | | -> Hash Left Join (cost=2.25..22.25 rows=1000 width=80) +| ON p.pid = a.pid AND p.level = 0 ORDER BY p.pid, p.level; | | Hash Cond: (pg_show_plans.pid = s.pid) +| | | Join Filter: (pg_show_plans.level = 0) +| | | -> Function Scan on pg_show_plans (cost=0.00..10.00 rows=1000 width=48) +| | | -> Hash (cost=1.00..1.00 rows=100 width=44) +| | | -> Function Scan on pg_stat_get_activity s (cost=0.00..1.00 rows=1 00 width=44) | 1899 | 0 | Seq Scan on t1_ysl (cost=0.00..144247.77 rows=9999977 width=4) | select * from t1_ysl ; (2 rows)
相关参数
pg_show_plans.enable 是否可以显示计划。 pg_show_plans.plan_format 它控制查询计划的输出格式。可以选择文本或json。默认为文本。 pg_show_plans.max_plan_length 它设置查询计划的最大长度。默认值为8192[字节]。此参数必须设置为整数。
