Postgresql

如何使用 OR 子句優化帶有時間戳的查詢?

  • March 24, 2021

我有下表:

create table myTable
(
   id        serial                                                               not null
   rowtype   integer                                                              not null
   starttime timestamp                                                            not null,
   endtime   timestamp,
   clientid  integer                                                              not null
   deletedat timestamp default '1980-01-01 00:00:00'::timestamp without time zone not null,

   constraint myTable_pkey primary key,
   constraint myTable_rowtype_fkey references rowtype (number),
   constraint myTable_clientid_fkey references client,
   constraint myTable_rowtype_starttime_endtime_clientid_deletedat_key unique (rowtype, starttime, endtime, clientid, deletedat)
);

我需要執行以下查詢:

select myTable.id as id, client.id as clientid, myTable.rowtype as rowtype, myTable.starttime as starttime, myTable.endtime as endtime
from myTable
   inner join client on myTable.clientid = client.id 
where myTable.rowtype in (1, 2, 3, 4)
 and ((myTable.starttime>'2021-03-04 19:33:26+00' and myTable.starttime<'2021-03-05 00:40:28+00')
       or (myTable.endtime>'2021-03-04 19:33:26+00' and myTable.endtime<'2021-03-05 00:40:28+00'))
 and myTable.deletedat<'2000-01-01 00:00:00'
 and client.deletedat<'2000-01-01 00:00:00';

解釋分析:

Nested Loop  (cost=0.84..15782.99 rows=1 width=28) (actual time=246.485..557.682 rows=2 loops=1)
   ->  Index Scan using client_unique_otherFk on client  (cost=0.28..8.30 rows=1 width=4) (actual time=0.047..0.052 rows=1 loops=1)
       Index Cond: ((otherFk = 10) AND (deletedat < '2000-01-01 00:00:00'::timestamp without time zone))
   ->  Index Scan using myTable_rowtype_starttime_endtime_clientid_deletedat_key on myTable  (cost=0.56..15774.63 rows=6 width=28) (actual time=246.460..557.643 rows=2 loops=1)
       Index Cond: ((rowtype = ANY ('{1,2,3,4}'::integer[])) AND (clientid = client.id) AND (deletedat < '2000-01-01 00:00:00'::timestamp without time zone))
       Filter: (((starttime > '2021-03-04 19:33:26'::timestamp without time zone) AND (starttime < '2021-03-05 00:40:28'::timestamp without time zone)) OR ((endtime > '2021-03-04 19:33:26'::timestamp without time zone) AND (endtime < '2021-03-05 00:40:28'::timestamp without time zone)))
       Rows Removed by Filter: 9423

通過解釋分析我看到子句for myTable.starttimeand myTable.endtimedo not use myTable_rowtype_starttime_endtime_clientid_deletedat_keyindex。我認為這是 OR 子句,對吧?

  1. 我的第一個問題是,在這種情況下,有一種方法可以使用 OR 子句查詢使用索引嗎?

但是,我僅使用starttime或僅endtime在 Where 子句中測試查詢並刪除 OR。在兩種情況下都使用 index myTable_rowtype_starttime_endtime_clientid_deletedat_key,但是當我只使用endtimequery 的執行時間執行時,它比我只使用 starttime 慢得多,即使兩者都使用相同的 index

查詢僅使用 endtime 範例:

select myTable.id as id, client.id as clientid, myTable.rowtype as rowtype, myTable.starttime as starttime, myTable.endtime as endtime
from myTable
   inner join client on myTable.clientid = client.id 
where myTable.rowtype in (1, 2, 3, 4)
 and (myTable.endtime>'2021-03-04 19:33:26+00' and myTable.endtime<'2021-03-05 00:40:28+00')
 and myTable.deletedat<'2000-01-01 00:00:00'
 and client.deletedat<'2000-01-01 00:00:00';

解釋分析:

Nested Loop  (cost=0.84..15232.04 rows=1 width=28) (actual time=276.628..568.299 rows=1 loops=1)
   ->  Index Scan using client_unique_otherFk on client  (cost=0.28..8.30 rows=1 width=4) (actual time=0.047..0.052 rows=1 loops=1)
       Index Cond: ((otherFk = 10) AND (deletedat < '2000-01-01 00:00:00'::timestamp without time zone))
   ->  Index Scan using myTable_rowtype_starttime_endtime_clientid_deletedat_key on myTable  (cost=0.56..15223.71 rows=3 width=28) (actual time=276.573..568.233 rows=1 loops=1)
       Index Cond: ((rowtype = ANY ('{1,2,3,4}'::integer[])) AND (endtime > '2021-03-04 19:33:26'::timestamp without time zone) AND (endtime < '2021-03-05 00:40:28'::timestamp without time zone) AND (clientid = client.id) AND (deletedat < '2000-01-01 00:00:00'::timestamp without time zone))
Planning time: 0.453 ms
Execution time: 568.409 ms

僅使用 starttime 而不是 endtime,執行時間平均為 0.449 毫秒。

  1. 我的第二個問題是關於starttimeendtime欄位的時間差,兩者都使用相同的索引。這個時差是否與 endtime 不是 null 但 starttime 是否有任何關係?如果不是,那會是什麼?
  2. 我能做些什麼來優化這個查詢?(使用帶有 OR 的 starttime 和 endtime 就像第一個範例查詢一樣)。

通常,最有效的方法是避免可怕OR並將其替換為UNION

  SELECT myTable.id, client.id, ...
  FROM myTable
      JOIN client ON myTable.clientid = client.id 
  WHERE myTable.rowtype in (1, 2, 3, 4)
    AND myTable.starttime > '2021-03-04 19:33:26+00'
    AND myTable.starttime < '2021-03-05 00:40:28+00'
    AND myTable.deletedat < '2000-01-01 00:00:00'
    AND client.deletedat < '2000-01-01 00:00:00';
UNION
  SELECT myTable.id, client.id, ...
  FROM myTable
      JOIN client ON myTable.clientid = client.id 
  WHERE myTable.rowtype in (1, 2, 3, 4)
    AND myTable.endtime > '2021-03-04 19:33:26+00'
    AND myTable.endtime < '2021-03-05 00:40:28+00'
    AND myTable.deletedat < '2000-01-01 00:00:00'
    AND client.deletedat < '2000-01-01 00:00:00';

如果SELECT列表中有兩個主鍵,這將產生相同的結果。如果您可以忍受重複,則可以使用更高效的UNION ALL.

我不確定哪些條件是選擇性條件,但您可以從兩個索引ON mytable (starttime)ON mytable (endtime).

引用自:https://dba.stackexchange.com/questions/287536