This article is based on PG 13.1.

Parallel query has been supported since PG 9.6. Since PG 11, CREATE TABLE … AS, SELECT INTO and CREATE MATERIALIZED VIEW can also use parallel query.

Conclusion first: replace INSERT INTO ... SELECT with CREATE TABLE AS, with SELECT INTO, or with an export followed by an import.

First, trace the execution plan of the following query:

select count(*) from test t1,test1 t2 where t1.id = t2.id ;

postgres=# explain analyze select count(*) from test t1,test1 t2 where t1.id = t2.id ;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=34244.16..34244.17 rows=1 width=8) (actual time=683.246..715.324 rows=1 loops=1)
   ->  Gather  (cost=34243.95..34244.16 rows=2 width=8) (actual time=681.474..715.311 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=33243.95..33243.96 rows=1 width=8) (actual time=674.689..675.285 rows=1 loops=3)
               ->  Parallel Hash Join  (cost=15428.00..32202.28 rows=416667 width=0) (actual time=447.799..645.689 rows=333333 loops=3)
                     Hash Cond: (t1.id = t2.id)
                     ->  Parallel Seq Scan on test t1  (cost=0.00..8591.67 rows=416667 width=4) (actual time=0.025..74.010 rows=333333 loops=3)
                     ->  Parallel Hash  (cost=8591.67..8591.67 rows=416667 width=4) (actual time=260.052..260.053 rows=333333 loops=3)
                           Buckets: 131072  Batches: 16  Memory Usage: 3520kB
                           ->  Parallel Seq Scan on test1 t2  (cost=0.00..8591.67 rows=416667 width=4) (actual time=0.032..104.804 rows=333333 loops=3)
 Planning Time: 0.420 ms
 Execution Time: 715.447 ms
(13 rows)

The plan launched two workers.

Now look at the same query wrapped in INSERT INTO ... SELECT:

postgres=# explain analyze insert into va select count(*) from test t1,test1 t2 where t1.id = t2.id ;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Insert on va  (cost=73228.00..73228.02 rows=1 width=4) (actual time=3744.179..3744.187 rows=0 loops=1)
   ->  Subquery Scan on "*SELECT*"  (cost=73228.00..73228.02 rows=1 width=4) (actual time=3743.343..3743.352 rows=1 loops=1)
         ->  Aggregate  (cost=73228.00..73228.01 rows=1 width=8) (actual time=3743.247..3743.254 rows=1 loops=1)
               ->  Hash Join  (cost=30832.00..70728.00 rows=1000000 width=0) (actual time=1092.295..3511.301 rows=1000000 loops=1)
                     Hash Cond: (t1.id = t2.id)
                     ->  Seq Scan on test t1  (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.030..421.537 rows=1000000 loops=1)
                     ->  Hash  (cost=14425.00..14425.00 rows=1000000 width=4) (actual time=1090.078..1090.081 rows=1000000 loops=1)
                           Buckets: 131072  Batches: 16  Memory Usage: 3227kB
                           ->  Seq Scan on test1 t2  (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.021..422.768 rows=1000000 loops=1)
 Planning Time: 0.511 ms
 Execution Time: 3745.633 ms
(11 rows)

This plan has no Gather node and no Workers lines, so parallel query was not used, and execution time rose from about 715 ms to about 3746 ms.

Even with forced parallel mode turned on, the statement still does not get a parallel plan:

postgres=# set force_parallel_mode =on;
SET
postgres=# explain analyze insert into va select count(*) from test t1,test1 t2 where t1.id = t2.id ;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Insert on va  (cost=73228.00..73228.02 rows=1 width=4) (actual time=3825.042..3825.049 rows=0 loops=1)
   ->  Subquery Scan on "*SELECT*"  (cost=73228.00..73228.02 rows=1 width=4) (actual time=3824.976..3824.984 rows=1 loops=1)
         ->  Aggregate  (cost=73228.00..73228.01 rows=1 width=8) (actual time=3824.972..3824.978 rows=1 loops=1)
               ->  Hash Join  (cost=30832.00..70728.00 rows=1000000 width=0) (actual time=1073.587..3599.402 rows=1000000 loops=1)
                     Hash Cond: (t1.id = t2.id)
                     ->  Seq Scan on test t1  (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.034..414.965 rows=1000000 loops=1)
                     ->  Hash  (cost=14425.00..14425.00 rows=1000000 width=4) (actual time=1072.441..1072.443 rows=1000000 loops=1)
                           Buckets: 131072  Batches: 16  Memory Usage: 3227kB
                           ->  Seq Scan on test1 t2  (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.022..400.624 rows=1000000 loops=1)
 Planning Time: 0.577 ms
 Execution Time: 3825.923 ms
(11 rows)

The reason is given in the official documentation ("When Can Parallel Query Be Used?"): a query that writes data does not get a parallel plan, and CREATE TABLE … AS, SELECT INTO and CREATE MATERIALIZED VIEW are the documented exceptions.
There are three workarounds: use CREATE TABLE AS, use SELECT INTO, or export the result and import it again.
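A minimal sketch of the three workarounds as a psql session, assuming the tables test, test1 and va from the plans above. The names va_stage, va_stage2, the column alias cnt and the path /tmp/va.csv are hypothetical, and reading "export/import" as a pair of \copy commands is an assumption. Whether the planner actually chooses a parallel plan still depends on table sizes and on settings such as max_parallel_workers_per_gather, so it is worth checking with EXPLAIN.

```sql
-- Check the plan first: EXPLAIN (without ANALYZE) accepts CREATE TABLE AS
-- and does not create the table. Look for Gather / Workers Planned.
EXPLAIN
CREATE TABLE va_stage AS
SELECT count(*) AS cnt
FROM test t1, test1 t2
WHERE t1.id = t2.id;

-- 1. CREATE TABLE AS: builds a new table (va_stage is a hypothetical name).
CREATE TABLE va_stage AS
SELECT count(*) AS cnt
FROM test t1, test1 t2
WHERE t1.id = t2.id;

-- Optional: move the small result into the existing table. This INSERT is
-- itself serial, but it only reads the already computed staging rows.
INSERT INTO va SELECT cnt FROM va_stage;

-- 2. SELECT INTO: the same idea in the older syntax (va_stage2 is hypothetical).
SELECT count(*) AS cnt
INTO va_stage2
FROM test t1, test1 t2
WHERE t1.id = t2.id;

-- 3. Export, then import (assumed reading: psql \copy, one line per command).
\copy (select count(*) from test t1, test1 t2 where t1.id = t2.id) to '/tmp/va.csv' with (format csv)
\copy va from '/tmp/va.csv' with (format csv)
```

The first two variants create a new table, so they fit best when the target does not exist yet or when a staging table is acceptable; the third keeps the existing table va at the cost of a round trip through a file.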