Postgresql
當其他子查詢使用非並行聚合時,為什麼子查詢不使用並行聚合?
我定義了 2 個等效聚合。第一個支持並行模式,但第二個不支持。我用索引定義了 2 個相同的表。
create table jsfield(j jsonb); insert into jsfield select jsonb_build_object('c', s) from generate_series(1, 100000) s; create index jsfield_idx on jsfield(j); create table jsfield2 as select * from jsfield; create index jsfield2_idx on jsfield2(j); create or replace function shallow_jsonb_object_merge(st jsonb, a jsonb) returns jsonb immutable language sql as $$ select st || a ; $$ ; create or replace aggregate "Semigroup_seq" (jsonb) ( sfunc = shallow_jsonb_object_merge, stype = jsonb ); create or replace aggregate "Semigroup_par" (jsonb) ( sfunc = shallow_jsonb_object_merge, stype = jsonb, combinefunc = shallow_jsonb_object_merge, parallel = safe, initcond = '{}' );
當單獨使用聚合時,explain analyze 會產生預期的計劃:
explain analyze select "Semigroup_seq"(j) from jsfield ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Aggregate (cost=26737.04..26737.05 rows=1 width=32) (actual time=122.797..122.797 rows=1 loops=1) -> Seq Scan on jsfield (cost=0.00..1736.04 rows=100004 width=26) (actual time=0.015..6.934 rows=100004 loops=1) Planning Time: 0.420 ms Execution Time: 122.847 ms explain analyze select "Semigroup_par"(j) from jsfield2 ; QUERY PLAN --------------------------------------------- Finalize Aggregate (cost=10559.84..10559.85 rows=1 width=32) (actual time=61.475..64.994 rows=1 loops=1) -> Gather (cost=10558.78..10559.09 rows=3 width=32) (actual time=61.363..64.884 rows=4 loops=1) Workers Planned: 3 Workers Launched: 3 -> Partial Aggregate (cost=9558.78..9558.79 rows=1 width=32) (actual time=42.316..42.316 rows=1 loops=4) -> Parallel Index Only Scan using jsfield2_idx on jsfield2 (cost=0.42..1494.03 rows=32259 width=26) (actual time=0.030..2.781 rows=25001 loops=4) Heap Fetches: 0 Planning Time: 0.363 ms Execution Time: 65.065 ms
並行聚合執行肯定更快,但是當我將兩個查詢作為子查詢同時執行時,解釋顯示兩個查詢的順序掃描!
explain analyze select (select "Semigroup_par"(j) from jsfield2), (select "Semigroup_seq"(j) from jsfield) ; QUERY PLAN -------------------------------------------------- Result (cost=53474.10..53474.11 rows=1 width=64) (actual time=271.377..271.378 rows=1 loops=1) InitPlan 1 (returns $0) -> Aggregate (cost=26737.04..26737.05 rows=1 width=32) (actual time=174.772..174.773 rows=1 loops=1) -> Seq Scan on jsfield2 (cost=0.00..1736.04 rows=100004 width=26) (actual time=0.007..7.586 rows=100004 loops=1) InitPlan 2 (returns $1) -> Aggregate (cost=26737.04..26737.05 rows=1 width=32) (actual time=96.599..96.599 rows=1 loops=1) -> Seq Scan on jsfield (cost=0.00..1736.04 rows=100004 width=26) (actual time=0.006..5.362 rows=100004 loops=1) Planning Time: 0.581 ms Execution Time: 271.407 ms
為什麼規劃器更喜歡 seq 掃描而不是並行掃描?是因為兩個子查詢的成本相同並且並行模式的記憶體要貴得多嗎?雖然刨床如何知道它是否僅由成本參數指導。
這可能是一個錯誤嗎?它是可調節的行為嗎?
server_version ------------------------------------- 14.2 (Ubuntu 14.2-1.pgdg22.04+1+b1)
不,這是按預期工作的。如果在查詢中的任何地方使用了並行不安全函式,則該查詢不能使用並行性,句點。用文件的話來說:
當查詢包含任何並行不安全的內容時,該查詢將完全禁用並行查詢。