Oracle

Oracle 查詢優化

  • April 6, 2015

我們有一個在視圖上執行的查詢,其執行時間約為 6 分鐘。

SELECT COUNT(*) 
FROM someschema.item_total_violations 
WHERE channel_id = 122507833
AND item_date >= TIMESTAMP'2015-02-01 00:00:00';

它不是物化視圖,因此item_date.

視圖的基本查詢如下:

SELECT i.id
   ,i.title
   ,i.body
   ,i.description
   ,i.item_date
   ,i.created_date
   ,i.channel_id
   ,i.community_id
   ,i.person_id
   ,i.message_type
   ,ch.client_id
   ,upper(p.NAME) AS person_name_upper
   ,upper(ch.NAME) AS channel_name_upper
   ,p.NAME AS person_name
   ,ch.NAME AS channel_name
   ,ch.connector_type AS channel_type
   ,cm.NAME AS community_name
   ,cm.network_id
   ,nvl(ct2.cnt, 0) + nvl(ct4.cnt, 0) AS total_violations_count
   ,nvl(ct1.cnt, 0) + nvl(ct3.cnt, 0) AS quarantined_violations_count
   ,nvl(ct5.cnt, 0) AS quarantined_attributes_count
   ,nvl(lh_info.cnt_lh, 0) AS legal_holds_count
   ,nvl(lh_info.cnt_lh_w_no_date, 0) AS legal_holds_count_w_no_date
   ,nvl(lh_info.max_hold_until, to_date('1970-01-01', 'yyyy-mm-dd')) AS legal_hold_max_hold_until
FROM item i
   ,channel ch
   ,person p
   ,community cm
   ,(
       SELECT /*+ NO_MERGE */ pv.item_id
           ,count(*) AS cnt
       FROM policy_violation pv
       WHERE pv.item_id IS NOT NULL
           AND pv.quarantine_status = 'QUARANTINED'
       GROUP BY pv.item_id
       ) ct1
   ,(
       SELECT /*+ NO_MERGE */ pv.item_id
           ,count(*) AS cnt
       FROM policy_violation pv
       WHERE pv.item_id IS NOT NULL
       GROUP BY pv.item_id
       ) ct2
   ,(
       SELECT /*+ NO_MERGE */ aa.item_id AS item_id
           ,COUNT(*) AS cnt
       FROM item_attachment aa
           ,policy_violation pv
       WHERE aa.id = pv.item_attachment_id
           AND pv.quarantine_status = 'QUARANTINED'
       GROUP BY aa.item_id
       ) ct3
   ,(
       SELECT /*+ NO_MERGE */ aa.item_id AS item_id
           ,COUNT(*) AS cnt
       FROM item_attachment aa
           ,policy_violation pv
       WHERE aa.id = pv.item_attachment_id
           AND pv.item_id IS NULL
       GROUP BY aa.item_id
       ) ct4
   ,(
       SELECT /*+ NO_MERGE */ pv.item_id
           ,count(DISTINCT (pv.item_attribute_id)) AS cnt
       FROM policy_violation pv
       WHERE pv.item_id IS NOT NULL
           AND pv.item_attribute_id IS NOT NULL
           AND pv.quarantine_status = 'QUARANTINED'
       GROUP BY pv.item_id
       ) ct5
   ,(
       SELECT lh_rel.item_id AS item_id
           ,count(DISTINCT (lh.id)) AS cnt_lh
           ,max(lh.HOLD_UNTIL) AS max_hold_until
           ,sum(CASE 
                   WHEN lh.hold_until IS NULL
                       THEN 1
                   ELSE 0
                   END) AS cnt_lh_w_no_date
       FROM LEGAL_HOLD_RELATION lh_rel
           ,LEGAL_HOLD lh
       WHERE lh.id = lh_rel.legal_hold_id
           AND lh.STATUS = 'A'
       GROUP BY lh_rel.item_id
       ) lh_info
WHERE ch.STATUS = 'A'
   AND i.channel_id = ch.id
   AND i.person_id = p.id
   AND i.community_id = cm.id(+)
   AND i.id = ct1.item_id(+)
   AND i.id = ct2.item_id(+)
   AND i.id = ct3.item_id(+)
   AND i.id = ct4.item_id(+)
   AND i.id = ct5.item_id(+)
   AND i.id = lh_info.item_id(+)

item表是最大的,我們已經嘗試對錶進行分區,chaannel_id但這對我們的查詢沒有幫助。我知道原始查詢中沒有索引item_date,它對性能有影響。除了嘗試物化視圖之外,我還可以對視圖的基本查詢進行改進嗎?

你為什麼要對這個查詢使用這個視圖?

該視圖收集違規和其他內容,但您的查詢根本不關心這些,只關心項目的數量。由於外部連接,視圖列出了所有項目,而不管這些項目,因此您基本上執行了許多不必要的額外工作來收集違規和其他內容(NO_MERGE 提示使情況變得更糟),然後您將其丟棄並根據項目計數完全不同的標準。

查詢應該是這樣的:

select
 count(*) 
from
 item i
 join channel ch on (i.channel_id = ch.id)
 join person p on (i.person_id = p.id)
where
 ch.status = 'A'
 and i.channel_id = 122507833
 and i.item_date >= timestamp'2015-02-01 00:00:00';

此外,您不需要物化視圖來索引列。只需在基表的列上創建索引。

除了主鍵和外鍵的必備索引之外,另一個可能的候選者是 item_date 列,具體取決於您的數據分佈。

您可以通過(例如)替換來提高查詢效率:

   ,(
       SELECT /*+ NO_MERGE */ pv.item_id
           ,count(*) AS cnt
       FROM policy_violation pv
       WHERE pv.item_id IS NOT NULL
           AND pv.quarantine_status = 'QUARANTINED'
       GROUP BY pv.item_id
       ) ct1
   ,(
       SELECT /*+ NO_MERGE */ pv.item_id
           ,count(*) AS cnt
       FROM policy_violation pv
       WHERE pv.item_id IS NOT NULL
       GROUP BY pv.item_id
       ) ct2
   ,

和:

   ,(
       SELECT /*+ NO_MERGE */ pv.item_id
           ,count(*) AS cnt
           ,count(case when pv.quarantine_status = 'QUARANTINED' then 1 end) as cnt_quarantined
       FROM policy_violation pv
       WHERE pv.item_id IS NOT NULL
       GROUP BY pv.item_id
       ) ct2
   ,

這種重寫也可以應用於查詢中的其他子選擇。這應該給我們:

[...]
,nvl(ct2.cnt_quarantined, 0) + nvl(ct4.cnt_quarantined, 0) AS quarantined_violations_count
[...]
,(
   SELECT /*+ NO_MERGE */ pv.item_id
       ,count(*) AS cnt
       ,count(case when pv.quarantine_status = 'QUARANTINED' then 1 end) as cnt_quarantined 
   FROM policy_violation pv
   WHERE pv.item_id IS NOT NULL
   GROUP BY pv.item_id
   ) ct2
,(
   SELECT /*+ NO_MERGE */ aa.item_id AS item_id
       ,COUNT(*) AS cnt
       ,count(case when pv.quarantine_status = 'QUARANTINED' then 1 end) as cnt_quarantined   
   FROM item_attachment aa
       ,policy_violation pv
   WHERE aa.id = pv.item_attachment_id
       AND pv.item_id IS NULL
   GROUP BY aa.item_id
   ) ct4

您可能可以使用左連接進一步合併這兩個重寫,但如果不了解您的模型的更多細節,很難說。無論如何,這種重寫應該會減少您的查詢必須做的工作量。

作為一個建議,我建議您對此類大型查詢使用 ansi 連接,因為它可以更輕鬆地閱讀和理解查詢

引用自:https://dba.stackexchange.com/questions/97096