PostgreSQL

Given the following table/index/EXPLAIN plan, why is the query so slow?

  • December 19, 2020

I have the following tables:

CREATE TABLE trace
(
 uuid uuid NOT NULL,
 result text,
 ruleid uuid,
 previousruleresultid uuid,
 senderduns text,
 receiverduns text,
 eventid uuid,
 relatedeventid uuid,
 emailrecipients text,
 emailbody text,
 created timestamp with time zone,
 identifierkeyvalues text,
 failuremessage text,
 errorflag boolean DEFAULT false,
 active boolean,
 carrier text,
 modified timestamp with time zone,
 processingmode numeric,
 currentstatus text,
 origin text,
 destination text,
 ontimestatus text,
 shipper text,
 deliveryrequesteddate timestamp without time zone,
 eventdate timestamp without time zone,
 CONSTRAINT traceruleresult_pk PRIMARY KEY (uuid),
 CONSTRAINT tracerule_fk FOREIGN KEY (ruleid)
     REFERENCES rule (uuid) MATCH SIMPLE
     ON UPDATE NO ACTION ON DELETE NO ACTION
);

CREATE TABLE resultitem
(
 resultid uuid NOT NULL,
 type text NOT NULL,
 name text NOT NULL,
 created timestamp without time zone,
 CONSTRAINT shipmentitem_pkey PRIMARY KEY (resultid, type, name),
 CONSTRAINT resultitem_resultid_fkey FOREIGN KEY (resultid)
     REFERENCES trace (uuid) MATCH SIMPLE
     ON UPDATE NO ACTION ON DELETE NO ACTION
);

Here is my query:

explain analyze select
   distinct trace.*
from
   Trace trace 
left outer join
   ResultItem resultitem2_ 
       on trace.uuid=resultitem2_.resultId 
where
   trace.errorFlag=false 
   and (
       trace.deliveryRequestedDate>= '2017-08-03T14:01:45.555Z' 
       or trace.deliveryRequestedDate is null
   ) 
   and (
       trace.deliveryRequestedDate<= '2017-08-24T14:01:45.555Z'
       or trace.deliveryRequestedDate is null
   ) 
   and (
       trace.previousRuleResultId is null
   ) 
order by
   trace.deliveryRequestedDate asc limit 15

Here is the EXPLAIN ANALYZE output:

'Limit  (cost=56463.26..56464.24 rows=15 width=709) (actual time=28542.669..28542.755 rows=15 loops=1)'
'  ->  Unique  (cost=56463.26..58530.39 rows=31802 width=709) (actual time=28542.666..28542.723 rows=15 loops=1)'
'        ->  Sort  (cost=56463.26..56542.77 rows=31802 width=709) (actual time=28542.662..28542.679 rows=15 loops=1)'
'              Sort Key: trace.deliveryrequesteddate, trace.uuid, trace.result, trace.ruleid, trace.previousruleresultid, trace.senderduns, trace.receiverduns, trace.eventid, trace.relatedeventid, trace.emailrecipients, trace.emailbody, trace.created, trace.identifierkeyvalues, trace.failuremessage, trace.errorflag, trace.active, trace.carrier, trace.modified, trace.processingmode, trace.currentstatus, trace.origin, trace.destination, trace.ontimestatus, trace.shipper, trace.eventdate'
'              Sort Method: external merge  Disk: 88088kB'
'              ->  Hash Right Join  (cost=39049.07..44081.98 rows=31802 width=709) (actual time=566.923..1281.588 rows=72427 loops=1)'
'                    Hash Cond: (resultitem2_.resultid = trace.uuid)'
'                    ->  Seq Scan on resultitem resultitem2_  (cost=0.00..1140.41 rows=54141 width=16) (actual time=0.005..115.204 rows=54062 loops=1)'
'                    ->  Hash  (cost=35793.54..35793.54 rows=31802 width=709) (actual time=566.765..566.765 rows=26961 loops=1)'
'                          Buckets: 1024  Batches: 8  Memory Usage: 4075kB'
'                          ->  Seq Scan on trace  (cost=0.00..35793.54 rows=31802 width=709) (actual time=0.010..339.345 rows=26961 loops=1)'
'                                Filter: ((NOT errorflag) AND (previousruleresultid IS NULL) AND ((deliveryrequesteddate >= '2017-08-03 14:01:45.555'::timestamp without time zone) OR (deliveryrequesteddate IS NULL)) AND ((deliveryrequesteddate <= '2017-08-24 14:01:45.555'::timestamp without time zone) OR (deliveryrequesteddate IS NULL)))'
'                                Rows Removed by Filter: 147970'
'Planning time: 0.420 ms'
'Execution time: 28569.880 ms'

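I'm having a hard time seeing where the problem is. Clearly my left join is causing the slowdown, since removing that join makes the query return in a matter of milliseconds instead of nearly 30 seconds. I'm not sure whether this is an indexing issue, because I have every column in my WHERE clause indexed and the constraints are in place. I feel like I'm missing something fundamental here, but I'm not sure what. Any help would be greatly appreciated.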

Try adding the following indexes:

  • trace.errorFlag
  • trace.deliveryRequestedDate
  • trace.previousRuleResultId
  • resultitem.resultid
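
As a rough sketch, the indexes listed above could be created as follows. The table and column names come from the question; the index names are made up for illustration. Note that the existing primary key on resultitem (resultid, type, name) already has resultid as its leading column, so the last index may add little.

```sql
-- Index names below are hypothetical; pick whatever fits your naming scheme.
CREATE INDEX trace_errorflag_idx ON trace (errorflag);
CREATE INDEX trace_deliveryrequesteddate_idx ON trace (deliveryrequesteddate);
CREATE INDEX trace_previousruleresultid_idx ON trace (previousruleresultid);

-- Possibly redundant: the primary key index on (resultid, type, name)
-- can already serve lookups on resultid alone.
CREATE INDEX resultitem_resultid_idx ON resultitem (resultid);

-- Refresh planner statistics so the new indexes are costed with current data.
ANALYZE trace;
ANALYZE resultitem;
```

Re-running the query under EXPLAIN ANALYZE afterwards shows whether the planner actually picks any of them up.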

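The main problems at a glance:

i) Data types: there are a lot of text columns, which naturally makes the rows heavy.

ii) The query itself is not written optimally.

iii) Suitable indexes are missing.

iv) PRIMARY KEY (resultid, type, name): type and name are of unlimited size here, so that part of the key is not a good choice for a PK. Limit the size, change the data type, or find some alternative solution if you can.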
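
One way to act on point iv), sketched under assumptions: first measure how long the values in type and name really are, then cap the columns accordingly. The limits 64 and 255 below are placeholders, not values taken from the question.

```sql
-- Check the longest values actually stored before choosing a limit.
SELECT max(length(type)) AS max_type_len,
       max(length(name)) AS max_name_len
FROM resultitem;

-- Hypothetical limits; this fails if any existing value is longer,
-- and changing the type rebuilds the primary key index.
ALTER TABLE resultitem
    ALTER COLUMN type TYPE varchar(64),
    ALTER COLUMN name TYPE varchar(255);
```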

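In your query, the LEFT JOIN combined with trace.* is what hurts: the join multiplies the trace rows, and DISTINCT then has to sort on every column of trace. Since you don't need any columns from the resultitem table, why not use an EXISTS clause and eliminate the DISTINCT?

Here is my query so you can follow the idea; adapt it to PostgreSQL as needed: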

select
    trace.result, trace.previousRuleResultId  -- add any other trace columns you need here
from
   Trace trace 
where exists(select 1 from 
   ResultItem resultitem2_ 
       where  trace.uuid=resultitem2_.resultId )
and
   trace.errorFlag=false 
   and (
           ( trace.deliveryRequestedDate>= '2017-08-03T14:01:45.555Z' 
           and trace.deliveryRequestedDate<= '2017-08-24T14:01:45.555Z')
           or (  trace.deliveryRequestedDate is null ) 
       )

   and (
       trace.previousRuleResultId is null
   ) 
order by
   trace.deliveryRequestedDate asc limit 15
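
One caveat about the EXISTS form above: it returns only trace rows that have at least one matching resultitem row, whereas the LEFT OUTER JOIN in the question keeps every trace row. If the original behavior is what you want, a minimal sketch is to drop both the join and the DISTINCT, since no resultitem column is selected and trace.uuid is the primary key. This matches the asker's observation that the query returns in milliseconds without the join.

```sql
-- Sketch: same filters as the question, without the join or DISTINCT.
-- trace.uuid is the primary key, so rows from trace alone are already unique.
EXPLAIN ANALYZE
SELECT trace.*
FROM trace
WHERE trace.errorflag = false
  AND trace.previousruleresultid IS NULL
  AND (
        trace.deliveryrequesteddate BETWEEN '2017-08-03T14:01:45.555Z'
                                        AND '2017-08-24T14:01:45.555Z'
        OR trace.deliveryrequesteddate IS NULL
      )
ORDER BY trace.deliveryrequesteddate ASC
LIMIT 15;
```

In the resulting plan, the thing to look for is that the "Sort Method: external merge  Disk: ..." step over all 25 columns is gone.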

Source: https://dba.stackexchange.com/questions/183187