Postgresql

使用一個視圖時不同(更糟)的查詢計劃

  • April 19, 2021

我正在執行以下查詢:

SELECT m.entityid,
   m.alias::text AS alias,
   m.measurement_ts,
   m.volume_m3::double precision AS volume_m3,
   m.pressure_mca,
   m.battery_volt,
   (m.volume_m3 - lag(m.volume_m3) OVER (PARTITION BY m.entityid ORDER BY m.measurement_ts))::double precision / NULLIF(date_part('epoch'::text, m.measurement_ts - lag(m.measurement_ts) OVER (PARTITION BY m.entityid ORDER BY m.measurement_ts)), 0::double precision) * 3600::double precision AS flow,
   (m.volume_m3 - lag(m.volume_m3) OVER (PARTITION BY m.entityid ORDER BY m.measurement_ts))::double precision AS volumedelta,
   date_part('epoch'::text, m.measurement_ts - lag(m.measurement_ts) OVER (PARTITION BY m.entityid ORDER BY m.measurement_ts)) AS seconds_since_last
  FROM openiot_json.all_pde_data m
where measurement_ts BETWEEN '2021-01-18T20:00:31.69Z' AND '2021-04-18T19:00:31.69Z' and alias = '1002149751'
order by "measurement_ts" desc

該查詢執行良好,執行計劃如下:

Sort  (cost=936.08..936.67 rows=236 width=112) (actual time=64.371..64.749 rows=4334 loops=1)
 Sort Key: addt.measurement_ts DESC
 Sort Method: quicksort  Memory: 802kB
 ->  WindowAgg  (cost=912.03..926.78 rows=236 width=112) (actual time=51.539..62.264 rows=4334 loops=1)
       ->  Sort  (cost=912.03..912.62 rows=236 width=68) (actual time=51.514..51.867 rows=4334 loops=1)
             Sort Key: ((pts.sid)::text), addt.measurement_ts
             Sort Method: quicksort  Memory: 802kB
             ->  Nested Loop  (cost=13.03..902.73 rows=236 width=68) (actual time=1.992..45.831 rows=4334 loops=1)
                   ->  Index Scan using ppe_to_sigfox_alias_sid_start_date on ppe_to_sigfox pts  (cost=0.27..8.29 rows=1 width=56) (actual time=0.031..0.036 rows=1 loops=1)
                         Index Cond: ((alias)::text = '1002149751'::text)
                   ->  Bitmap Heap Scan on all_devices_data_table addt  (cost=12.76..892.07 rows=237 width=48) (actual time=1.955..43.490 rows=4334 loops=1)
                         Recheck Cond: ((entityid = (pts.sid)::text) AND (measurement_ts >= pts."startDate") AND (measurement_ts <= pts."endDate") AND (measurement_ts >= '2021-01-18 21:00:31.69+01'::timestamp with time zone) AND (measurement_ts <= '2021-04-18 21:00:31.69+02'::timestamp with time zone))
                         Heap Blocks: exact=3803
                         ->  Bitmap Index Scan on all_devices_data_table_etdesc  (cost=0.00..12.70 rows=237 width=0) (actual time=1.520..1.520 rows=4334 loops=1)
                               Index Cond: ((entityid = (pts.sid)::text) AND (measurement_ts >= pts."startDate") AND (measurement_ts <= pts."endDate") AND (measurement_ts >= '2021-01-18 21:00:31.69+01'::timestamp with time zone) AND (measurement_ts <= '2021-04-18 21:00:31.69+02'::timestamp with time zone))
Planning Time: 0.636 ms
Execution Time: 65.103 ms

到現在為止還挺好。現在,如果您查看原始查詢,您可以在SELECT列表中看到一些我不想繼續複製和粘貼的計算指標。所以我創建了一個VIEW為我做這個的:

CREATE OR REPLACE VIEW openiot_json.measurement
AS SELECT m.entityid,
   m.alias::text AS alias,
   m.measurement_ts,
   m.volume_m3::double precision AS volume_m3,
   m.pressure_mca,
   m.battery_volt,
   (m.volume_m3 - lag(m.volume_m3) OVER (PARTITION BY m.entityid ORDER BY m.measurement_ts))::double precision / NULLIF(date_part('epoch'::text, m.measurement_ts - lag(m.measurement_ts) OVER (PARTITION BY m.entityid ORDER BY m.measurement_ts)), 0::double precision) * 3600::double precision AS flow,
   (m.volume_m3 - lag(m.volume_m3) OVER (PARTITION BY m.entityid ORDER BY m.measurement_ts))::double precision AS volumedelta,
   date_part('epoch'::text, m.measurement_ts - lag(m.measurement_ts) OVER (PARTITION BY m.entityid ORDER BY m.measurement_ts)) AS seconds_since_last
  FROM openiot_json.all_pde_data m;

現在,如果我再次嘗試相同的查詢,但針對視圖:

SELECT
   *
FROM
   openiot_json.measurement
WHERE
 measurement_ts BETWEEN '2021-01-18T20:00:31.69Z' AND '2021-04-18T19:00:31.69Z' and alias = '1002149751'
ORDER by measurement_ts desc

我的性能要差得多,因為查詢計劃發生了變化:

Sort  (cost=123776.16..123777.61 rows=581 width=112) (actual time=18810.436..18810.910 rows=4334 loops=1)
 Sort Key: measurement.measurement_ts DESC
 Sort Method: quicksort  Memory: 802kB
 ->  Subquery Scan on measurement  (cost=104880.52..123749.48 rows=581 width=112) (actual time=18385.000..18808.447 rows=4334 loops=1)
       Filter: ((measurement.measurement_ts >= '2021-01-18 21:00:31.69+01'::timestamp with time zone) AND (measurement.measurement_ts <= '2021-04-18 21:00:31.69+02'::timestamp with time zone) AND (measurement.alias = '1002149751'::text))
       Rows Removed by Filter: 2152725
       ->  WindowAgg  (cost=104880.52..119621.90 rows=235862 width=112) (actual time=12030.287..18600.273 rows=2157059 loops=1)
             ->  Sort  (cost=104880.52..105470.18 rows=235862 width=68) (actual time=12030.232..13001.832 rows=2157059 loops=1)
                   Sort Key: ((pts.sid)::text), addt.measurement_ts
                   Sort Method: external merge  Disk: 160488kB
                   ->  Gather  (cost=1017.07..74155.19 rows=235862 width=68) (actual time=29.601..817.669 rows=2157059 loops=1)
                         Workers Planned: 2
                         Workers Launched: 2
                         ->  Hash Join  (cost=17.07..49568.99 rows=98276 width=68) (actual time=21.239..517.195 rows=719020 loops=3)
                               Hash Cond: (addt.entityid = (pts.sid)::text)
                               Join Filter: ((addt.measurement_ts >= pts."startDate") AND (addt.measurement_ts <= pts."endDate"))
                               Rows Removed by Join Filter: 3001
                               ->  Parallel Seq Scan on all_devices_data_table addt  (cost=0.00..32967.84 rows=884484 width=48) (actual time=0.041..147.070 rows=707600 loops=3)
                               ->  Hash  (cost=10.92..10.92 rows=492 width=56) (actual time=21.029..21.031 rows=496 loops=3)
                                     Buckets: 1024  Batches: 1  Memory Usage: 53kB
                                     ->  Seq Scan on ppe_to_sigfox pts  (cost=0.00..10.92 rows=492 width=56) (actual time=20.783..20.898 rows=496 loops=3)
Planning Time: 1.008 ms
JIT:
 Functions: 55
 Options: Inlining false, Optimization false, Expressions true, Deforming true
 Timing: Generation 10.280 ms, Inlining 0.000 ms, Optimization 3.027 ms, Emission 58.193 ms, Total 71.500 ms
Execution Time: 18882.123 ms

查看查詢計劃,第一個查詢首先過濾行,然後排序。但是,第二個查詢是首先排序(這需要很多時間,因為它在磁碟上排序)然後過濾。

為什麼會這樣,我該如何解決它,以便使用視圖獲得良好的查詢計劃和性能?

就像註釋一樣,openiot_json.all_pde_data原始查詢中使用的也是 a VIEW,它使用表:

CREATE OR REPLACE VIEW openiot_json.all_pde_data
AS SELECT pts.sid::text AS entityid,
   pts.alias,
   addt.recvtime,
   addt.measurement_ts,
   addt.volume_m3 + pts."offset" AS volume_m3,
   addt.pressure_mca,
   addt.battery_volt,
   pts."offset"
  FROM openiot_json.all_devices_data_table addt
    JOIN openiot_json.ppe_to_sigfox pts ON addt.entityid = pts.sid::text AND addt.measurement_ts >= pts."startDate" AND addt.measurement_ts <= pts."endDate";

我正在使用以下版本:

x86_64-pc-linux-gnu 上的 PostgreSQL 12.4 (Debian 12.4-1.pgdg100+1),由 gcc (Debian 8.3.0-6) 8.3.0 編譯,64 位

這兩個查詢是不等價的。

沒有視圖的第一個查詢在應用該子句執行視窗函式。視圖上的第二個查詢在應用子句之前 執行視窗函式。WHERE
WHERE

這可能導致不同的結果。並且(顯然)針對不同的查詢計劃。

您的第二個視圖peniot_json.all_pde_data不使用會產生類似效果的視窗功能(或任何其他功能)。

考慮查詢中的事件序列,SELECT如下所示:

引用自:https://dba.stackexchange.com/questions/290004