Postgresql

提高 postgres 中的查詢性能

  • March 29, 2022

我有一個表地址fabric_v2_2021_12

CREATE TABLE IF NOT EXISTS public.addressfabric_v2_2021_12(
country character varying(10) COLLATE pg_catalog."default",
areaname1 character varying(10) COLLATE pg_catalog."default",
areaname2 character varying(100) COLLATE pg_catalog."default",
areaname3 character varying(100) COLLATE pg_catalog."default",
areaname4 character varying(100) COLLATE pg_catalog."default",
postcode character varying(20) COLLATE pg_catalog."default",
pbkey character varying(50) COLLATE pg_catalog."default",
prop_type character varying(10) COLLATE pg_catalog."default",
postcode_ext character varying(10) COLLATE pg_catalog."default",
add_number character varying(20) COLLATE pg_catalog."default",
streetname character varying(60) COLLATE pg_catalog."default",
unit_type character varying(20) COLLATE pg_catalog."default",
unit_num character varying(50) COLLATE pg_catalog."default",
type character varying(3) COLLATE pg_catalog."default",
lat character varying(20) COLLATE pg_catalog."default",
lon character varying(20) COLLATE pg_catalog."default",
parent character varying(50) COLLATE pg_catalog."default",
geoid character varying(50) COLLATE pg_catalog."default",
geometry geometry,
CONSTRAINT enforce_addfb_srid_geometry_2021_12 CHECK (st_srid(geometry) = 4326))

我在以下查詢中遇到性能問題

select count(*) from addressfabric_v2_2021_12 as addr, (
SELECT ST_SubDivide('POLYGON((-74.4234467 40.7541435,-74.3904877 40.7489001,-74.380188 40.7587776,-74.3891144 40.7748904,-74.4179535 40.7790479,-74.4316864 40.7665746,-74.4241333 40.7566983,-74.4234467 40.7541435))'::geometry,20)) as
f(geom) where st_intersects(ST_SetSRID(f.geom,4326),addr.geometry);

在此之後我創建了以下索引

 CREATE INDEX IF NOT EXISTS addressfabric_v2_2021_12_geometry
ON public.addressfabric_v2_2021_12 USING gist
(geometry)  TABLESPACE pg_default;

性能有所提升,但仍不能令人滿意

Aggregate  (cost=591866320.40..591866320.41 rows=1 width=8) (actual 
       time=11.812..11.813 rows=1 loops=1)->Nested Loop  (cost=0.55..591346132.52 rows=208075152 width=0) (actual 
              time=0.219..11.476 rows=7072 loops=1)
   ->ProjectSet  (cost=0.00..30.02 rows=1000 width=32) (actual 
                    time=0.010..0.012 rows=1 loops=1)
         ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual 
                      time=0.001..0.001 rows=1 loops=1)
   ->  Index Scan using addressfabric_v2_2021_12_geomtry on 
       addressfabric_v2_2021_12 addr  (cost=0.55..591138.01 rows=20808 
       width=32) (actual time=0.207..10.893 rows=7072 loops=1)
         Index Cond: (geometry && st_setsrid((st_subdivide('01030000000100000008000000580630C0199B52C0D87E32C687604440603F1FC0FD9852C04EBC5EF5DB5F44403A950C00559852C0FC77D89F1F614440C5991540E79852C0FFFDCE9B2F634440EB6F09C0BF9A52C03D4272D7B76344403188FEBFA09B52C0DE78D21D1F6244404029FFFF249B52C02AB8697DDB604440580630C0199B52C0D87E32C687604440'::geometry, 20, '-1'::double precision)), 4326))
         Filter: st_intersects(st_setsrid((st_subdivide('01030000000100000008000000580630C0199B52C0D87E32C687604440603F1FC0FD9852C04EBC5EF5DB5F44403A950C00559852C0FC77D89F1F614440C5991540E79852C0FFFDCE9B2F634440EB6F09C0BF9A52C03D4272D7B76344403188FEBFA09B52C0DE78D21D1F6244404029FFFF249B52C02AB8697DDB604440580630C0199B52C0D87E32C687604440'::geometry, 20, '-1'::double precision)), 4326), geometry)
         Rows Removed by Filter: 1987
Planning Time: 0.685 ms
Execution Time: 11.900 ms 

任何幫助將不勝感激。

編輯

在使用 join 重寫查詢時,queryplan 還包括函式掃描

 select count(*) from addressfabric_v2_2022_1 as addr join ST_SubDivide('POLYGON((-74.4234467 40.7541435,-74.3904877 40.7489001,-74.380188 40.7587776,-74.3891144 40.7748904,-74.4179535 40.7790479,-74.4316864 40.7665746,-74.4241333 40.7566983,-74.4234467 40.7541435))'::geometry,20)as f(geom) on st_intersects(ST_SetSRID(f.geom,4326),addr.geometry);

查詢計劃

Aggregate  (cost=543972801.24..543972801.25 rows=1 width=8) (actual time=11.208..11.209 rows=1 loops=1)->Nested 
Loop(cost=25.55..543493555.50 rows=191698296 width=0) 
(actual time=0.114..10.856 rows=7073 loops=1)
->Function Scan on st_subdivide f  (cost=25.00..35.00 rows=1000 
width=32) (actual time=0.010..0.011 rows=1 loops=1)
->  Index Scan using addressfabric_v2_2022_1_geometry on 
addressfabric_v2_2022_1 addr(cost=0.55..543301.82 rows=19170 
width=32)   (actual time=0.101..10.220 rows=7073 loops=1)
Index Cond: (geometry && st_setsrid(f.geom, 4326))
Filter: st_intersects(st_setsrid(f.geom, 4326), geometry)
Rows Removed by Filter: 1987
Planning Time: 0.092 ms
Execution Time: 11.240 ms

有沒有辦法在新的查詢計劃中創建索引而不是函式掃描?

並行設置和大數據集後更新查詢計劃

Aggregate  (cost=10539438036.82..10539438036.83 rows=1 width=8) (actual time=740.148..740.149 rows=1 loops=1)
 ->  Nested Loop  (cost=10000000025.55..10538917273.50 rows=208305328 width=0) (actual time=0.173..717.950 rows=429963 loops=1)
       ->  Function Scan on st_subdivide f  (cost=25.00..35.00 rows=1000 width=32) (actual time=0.008..0.010 rows=1 loops=1)
       ->  Index Scan using addressfabric_v2_2022_1_geometry on addressfabric_v2_2022_1 addr  (cost=0.55..538708.93 rows=20831 width=32) (actual time=0.162..679.372 rows=429963 loops=1)
             Index Cond: (geometry && st_setsrid(f.geom, 4326))
             Filter: st_intersects(st_setsrid(f.geom, 4326), geometry)
             Rows Removed by Filter: 32726
Planning Time: 0.108 ms
Execution Time: 740.184 ms

通過更改以下參數解決了該問題

  • work_mem 從 4mb 更改為 50mb
  • random_Page_cost從 4 更改為 1
  • max_wal_size從 2 GB 減少到 128 MB
  • cpu_index_tuple_cost至 0.003

還按照@Gerad H.pille 的建議更改了以下內容

max_parallel__Workers_per_Gather 
max_parallel_maintainence_workers 

可能讓你低於半秒的唯一方法似乎是向它投入更多的 CPU。要麼你得到一個更快的系統,要麼你使用更多你所擁有的,即。並行性。使用 PostgreSQL 13,您應該是安全的。

在查詢之前,執行:

SET max_parallel_workers = 8;
SET max_parallel_workers_per_gather = 4;

請參閱:並行 postgis

引用自:https://dba.stackexchange.com/questions/306815