PostgreSQL

Slow SELECT on a view

  • April 3, 2015

A simple SELECT statement on a view is very slow for me in Postgres 9.3.1. The receipts table has 20 million rows; the other table has about 17k.

Even so, I would expect it to take less than ~16 seconds?

receipts

    Column      |            Type           | Modifiers | Storage     | Stats target  | Description
-----------------+---------------------------+-----------+-------------+---------------+--------------
profitcenter    | character varying(5)      | not null  | extended    |               |
receiptnumber   | bigint                    | not null  | plain       |               |
receiptposition | character varying(6)      | not null  | extended    |               |
date            | date                      | not null  | plain       |               |
customerid      | integer                   |           | plain       |               |
vkorg           | character varying(4)      |           | extended    |               |
artikelnummer   | integer                   |           | plain       |               |
price           | numeric(18,2)             |           | main        |               |
Indexes: B-tree on receiptnumber, date, and customerid.

cc_add_entry2

   Column     |          Type          | Modifiers | Storage     | Stats target  | Description
---------------+------------------------+-----------+-------------+---------------+--------------
receiptnumber | bigint                 | not null  | plain       |               |
customerid    | bigint                 |           | plain       |               |
user          | character varying(255) |           | extended    |               |
note          | text                   |           | extended    |               |
date          | date                   |           | plain       |               |
Indexes: B-tree on receiptnumber and customerid.

The view is simple:

SELECT s.receiptnumber,
   s.receiptposition,
   s.date,
   COALESCE(b.customerid, (s.customerid)::bigint) AS customerid,
   s.price
  FROM (dashboard.receipts s
    LEFT JOIN dashboard.cc_add_entry2 b ON ((s.receiptnumber = b.receiptnumber)));

EXPLAIN ANALYZE output for a simple query:

explain analyze select * from test2 where customerid = 520001215;

Hash Left Join  (cost=582.40..1086787.72 rows=100185 width=37) (actual time=2124.048..15937.744 rows=52 loops=1)
  Hash Cond: (s.receiptnumber = b.receiptnumber)
  Filter: (COALESCE(b.customerid, (s.customerid)::bigint) = 520001215)
  Rows Removed by Filter: 20036943
  ->  Seq Scan on receipts s  (cost=0.00..599969.96 rows=20036996 width=29) (actual time=0.904..7187.563 rows=20036995 loops=1)
  ->  Hash  (cost=360.51..360.51 rows=17751 width=16) (actual time=5.850..5.850 rows=17751 loops=1)
        Buckets: 2048  Batches: 1  Memory Usage: 694kB
        ->  Seq Scan on cc_add_entry2 b  (cost=0.00..360.51 rows=17751 width=16) (actual time=0.005..2.537 rows=17751 loops=1)
Total runtime: 15937.962 ms

I'm fairly sure the cause is that customerid in the view is the result of:

COALESCE(b.customerid, s.customerid::bigint)

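After the LEFT JOIN, the cast, and COALESCE, the predicate cannot be pushed down, so both tables must be read in full before the filter customerid = 520001215 can be applied. The fact that customerid can be NULL in both tables does not help either. Could at least one of them be set NOT NULL?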

That is extremely inefficient.
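
On the NOT NULL question, one could first check whether the column actually contains NULLs. This is only a sketch, and it assumes every receipt is meant to have a customer, which the question does not confirm. Note that SET NOT NULL scans the whole table and holds an exclusive lock while it does so:

```sql
-- Count rows that would violate a NOT NULL constraint on receipts.customerid.
SELECT count(*) AS null_customerids
FROM   dashboard.receipts
WHERE  customerid IS NULL;

-- Only if the count above is 0 (assumption: a receipt without a customer is invalid).
ALTER TABLE dashboard.receipts
  ALTER COLUMN customerid SET NOT NULL;
```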

Run this query and compare performance:

SELECT r.receiptnumber
   ,  r.receiptposition
   ,  r.date
   ,  c.customerid
   ,  r.price
FROM   dashboard.cc_add_entry2 c 
JOIN   dashboard.receipts r USING (receiptnumber)
WHERE  c.customerid = 520001215

UNION ALL
SELECT r.receiptnumber
   ,  r.receiptposition
   ,  r.date
   ,  r.customerid::bigint
   ,  r.price
FROM   dashboard.receipts r
LEFT   JOIN dashboard.cc_add_entry2 c USING (receiptnumber)
WHERE  r.customerid = 520001215
AND    c.customerid IS NULL;

This should be much faster. If you cannot wrap it into a view, use this query instead, or write an SQL or PL/pgSQL function that takes customerid as a parameter.
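
A function along those lines might look like the sketch below. The function name f_receipts_for_customer and the parameter name _customerid are hypothetical; the body is the UNION ALL query with the literal replaced by the parameter. The parameter is declared bigint to match the type the view exposes.

```sql
-- Hypothetical wrapper; the name and signature are illustrative, not from the original answer.
CREATE OR REPLACE FUNCTION dashboard.f_receipts_for_customer(_customerid bigint)
  RETURNS TABLE (receiptnumber   bigint
               , receiptposition varchar
               , "date"          date
               , customerid      bigint
               , price           numeric)
  LANGUAGE sql STABLE AS
$func$
-- Branch 1: receipts whose customer is overridden in cc_add_entry2.
SELECT r.receiptnumber, r.receiptposition, r.date, c.customerid, r.price
FROM   dashboard.cc_add_entry2 c
JOIN   dashboard.receipts r USING (receiptnumber)
WHERE  c.customerid = _customerid

UNION ALL
-- Branch 2: receipts with no override, matched on their own customerid.
SELECT r.receiptnumber, r.receiptposition, r.date, r.customerid::bigint, r.price
FROM   dashboard.receipts r
LEFT   JOIN dashboard.cc_add_entry2 c USING (receiptnumber)
WHERE  r.customerid = _customerid
AND    c.customerid IS NULL
$func$;

-- Usage:
SELECT * FROM dashboard.f_receipts_for_customer(520001215);
```

All column references in the body are table-qualified, so they cannot be confused with the output column names declared in RETURNS TABLE.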

These indexes (two multicolumn, one partial) would help further:

CREATE INDEX foo1 ON dashboard.cc_add_entry2 (customerid, receiptnumber);
CREATE INDEX foo2 ON dashboard.cc_add_entry2 (receiptnumber)
WHERE customerid IS NULL;
CREATE INDEX foo3 ON dashboard.receipts (customerid, receiptnumber);
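
To check whether the planner actually uses the new indexes, the rewritten query can be run under EXPLAIN. This is a sketch of that check, not output from the original system; the plan you get depends on your data and settings:

```sql
-- Show the actual plan, timings, and buffer usage for the rewritten query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT r.receiptnumber, r.receiptposition, r.date, c.customerid, r.price
FROM   dashboard.cc_add_entry2 c
JOIN   dashboard.receipts r USING (receiptnumber)
WHERE  c.customerid = 520001215

UNION ALL
SELECT r.receiptnumber, r.receiptposition, r.date, r.customerid::bigint, r.price
FROM   dashboard.receipts r
LEFT   JOIN dashboard.cc_add_entry2 c USING (receiptnumber)
WHERE  r.customerid = 520001215
AND    c.customerid IS NULL;
```

In the plan, look for index scans on foo1 and foo3 in place of the sequential scan on receipts that dominated the original plan.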

Source: https://dba.stackexchange.com/questions/96909