SQL Server

How to rewrite a slow CTE construct to match the speed of a temp table

  • December 7, 2020

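I think the general advice in this community is to avoid temp tables in favor of CTEs. However, I sometimes run into cases where a CTE construct is very slow while its temp table equivalent is very fast.

For example, the following query spins for hours and never seems to produce a result. Its query plan is full of nested loops.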

CREATE TABLE #TRIANGLES
(
   NODE_A VARCHAR(22),
   NODE_B VARCHAR(22),
   NODE_C VARCHAR(22)
)
;


INSERT INTO #TRIANGLES VALUES
/*  150,000 ROWS  */
;


CREATE NONCLUSTERED INDEX IDX_A ON #TRIANGLES (NODE_A);

CREATE NONCLUSTERED INDEX IDX_B ON #TRIANGLES (NODE_B);

CREATE NONCLUSTERED INDEX IDX_C ON #TRIANGLES (NODE_C);



WITH
TRIANGLES_FILTERED AS
(
   -- **** FILTERING OF THE TRIANGLE TABLE OCCURS IN A CTE ****
   SELECT   *
   FROM     #TRIANGLES AS T
   WHERE    LEN(T.NODE_A) = 2  AND
            LEN(T.NODE_B) = 2  AND
            LEN(T.NODE_C) = 2
),
CONNECTABLE_NODES AS
(
   SELECT   DISTINCT T1.NODE_C AS [NODE]
   FROM     TRIANGLES_FILTERED AS T1

            INNER JOIN
            TRIANGLES_FILTERED AS T2
            ON T1.NODE_B = T2.NODE_A  AND
               T1.NODE_C = T2.NODE_B

            INNER JOIN
            TRIANGLES_FILTERED AS T3
            ON T2.NODE_B = T3.NODE_A  AND
               T2.NODE_C = T3.NODE_B

   WHERE    T1.NODE_A <> T2.NODE_C  AND
            T1.NODE_A <> T3.NODE_C  AND
            T2.NODE_A <> T3.NODE_C
)
SELECT   *
FROM     #TRIANGLES AS T1
WHERE    T1.NODE_A IN (SELECT * FROM CONNECTABLE_NODES)  AND
         T1.NODE_B IN (SELECT * FROM CONNECTABLE_NODES)  AND
         T1.NODE_C IN (SELECT * FROM CONNECTABLE_NODES)
;

Query plan: https://www.brentozar.com/pastetheplan/?id=rk_5TaiiP

Whereas the query plan for the following version uses hash matches, and it executes in an instant:

CREATE TABLE #TRIANGLES
(
   NODE_A VARCHAR(22),
   NODE_B VARCHAR(22),
   NODE_C VARCHAR(22)
)
;


INSERT INTO #TRIANGLES VALUES
/*  150,000 ROWS  */
;


CREATE NONCLUSTERED INDEX IDX_A ON #TRIANGLES (NODE_A);

CREATE NONCLUSTERED INDEX IDX_B ON #TRIANGLES (NODE_B);

CREATE NONCLUSTERED INDEX IDX_C ON #TRIANGLES (NODE_C);



-- **** FILTERING OF THE TRIANGLE TABLE SAVED INTO A TEMP TABLE ****
SELECT   *
INTO     #TRIANGLES_FILTERED
FROM     #TRIANGLES AS T
WHERE    LEN(T.NODE_A) = 2  AND
         LEN(T.NODE_B) = 2  AND
         LEN(T.NODE_C) = 2
;

CREATE NONCLUSTERED INDEX IDX_A ON #TRIANGLES_FILTERED (NODE_A);

CREATE NONCLUSTERED INDEX IDX_B ON #TRIANGLES_FILTERED (NODE_B);

CREATE NONCLUSTERED INDEX IDX_C ON #TRIANGLES_FILTERED (NODE_C);



WITH
CONNECTABLE_NODES AS
(
   SELECT   DISTINCT T1.NODE_C AS [NODE]
   FROM     #TRIANGLES_FILTERED AS T1

            INNER JOIN
            #TRIANGLES_FILTERED AS T2
            ON T1.NODE_B = T2.NODE_A  AND
               T1.NODE_C = T2.NODE_B

            INNER JOIN
            #TRIANGLES_FILTERED AS T3
            ON T2.NODE_B = T3.NODE_A  AND
               T2.NODE_C = T3.NODE_B

   WHERE    T1.NODE_A <> T2.NODE_C  AND
            T1.NODE_A <> T3.NODE_C  AND
            T2.NODE_A <> T3.NODE_C
)
SELECT   *
FROM     #TRIANGLES AS T1
WHERE    T1.NODE_A IN (SELECT * FROM CONNECTABLE_NODES)  AND
         T1.NODE_B IN (SELECT * FROM CONNECTABLE_NODES)  AND
         T1.NODE_C IN (SELECT * FROM CONNECTABLE_NODES)
;

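Query plan: https://www.brentozar.com/pastetheplan/?id=B1cZC6isD

How would I rewrite the first one to be as fast as the second?

By the way, if you are wondering what all the geometry/topology is about, I needed to know how all the triangles connect to each other when creating this puzzle: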

https://puzzling.stackexchange.com/questions/105275

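Sometimes the row estimates for a CTE are wrong. Temp tables do better on that front: a CTE is not materialized, so the optimizer has to estimate how many rows survive the LEN(...) = 2 filters, whereas a temp table already holds the filtered rows and has its own statistics.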

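So the CTE version uses those indexes because the optimizer thinks there are fewer rows than there really are. The reason the first query is slow is the RID Lookup. If you drop the indexes, or add the output columns to the indexes as included columns, it will be faster.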
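The included-columns suggestion could look roughly like the sketch below. It assumes the #TRIANGLES table and the IDX_A, IDX_B and IDX_C indexes from the question already exist in the current session. The choice of included columns (the other two node columns in each index) is my assumption, based on the query selecting every column. Whether this matches the temp table version's speed would still need to be measured.

```sql
-- Assumption: #TRIANGLES and its three single-column indexes already exist.
-- Each index is dropped and recreated so that it also carries the other
-- two columns. With every column available in the index itself, a seek
-- no longer needs a RID Lookup into the heap for the remaining columns.

DROP INDEX IDX_A ON #TRIANGLES;
CREATE NONCLUSTERED INDEX IDX_A ON #TRIANGLES (NODE_A)
   INCLUDE (NODE_B, NODE_C);

DROP INDEX IDX_B ON #TRIANGLES;
CREATE NONCLUSTERED INDEX IDX_B ON #TRIANGLES (NODE_B)
   INCLUDE (NODE_A, NODE_C);

DROP INDEX IDX_C ON #TRIANGLES;
CREATE NONCLUSTERED INDEX IDX_C ON #TRIANGLES (NODE_C)
   INCLUDE (NODE_A, NODE_B);
```

The other option mentioned, removing the indexes, would be just the three DROP INDEX statements without the CREATE INDEX statements.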

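There is a great blog post on this here.

I don't think either one is the outright winner. Use them case by case, and try both on the same workload so that you can compare the cost.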
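One way to try both in the same situation is to turn on SET STATISTICS IO and SET STATISTICS TIME and run the two variants back to back. The sketch below uses a reduced two-way join rather than the full query from the question. #TF_DEMO is a hypothetical table name, and the sketch assumes #TRIANGLES exists in the session and that the server is SQL Server 2016 or later (for DROP TABLE IF EXISTS).

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant 1: the filter lives in a CTE, which is inlined wherever it is
-- referenced, so the optimizer must estimate the filtered row count.
WITH
TRIANGLES_FILTERED AS
(
   SELECT   T.NODE_A, T.NODE_B, T.NODE_C
   FROM     #TRIANGLES AS T
   WHERE    LEN(T.NODE_A) = 2  AND
            LEN(T.NODE_B) = 2  AND
            LEN(T.NODE_C) = 2
)
SELECT   COUNT(*) AS PAIRS
FROM     TRIANGLES_FILTERED AS T1

         INNER JOIN
         TRIANGLES_FILTERED AS T2
         ON T1.NODE_B = T2.NODE_A  AND
            T1.NODE_C = T2.NODE_B
;

-- Variant 2: the same filter is materialized into a temp table first.
-- #TF_DEMO is a hypothetical name used only for this comparison.
DROP TABLE IF EXISTS #TF_DEMO;

SELECT   T.NODE_A, T.NODE_B, T.NODE_C
INTO     #TF_DEMO
FROM     #TRIANGLES AS T
WHERE    LEN(T.NODE_A) = 2  AND
         LEN(T.NODE_B) = 2  AND
         LEN(T.NODE_C) = 2
;

SELECT   COUNT(*) AS PAIRS
FROM     #TF_DEMO AS T1

         INNER JOIN
         #TF_DEMO AS T2
         ON T1.NODE_B = T2.NODE_A  AND
            T1.NODE_C = T2.NODE_B
;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Both variants should return the same PAIRS count. The messages output then lists logical reads and CPU/elapsed time per statement, which can be compared alongside the actual execution plans.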

Source: https://dba.stackexchange.com/questions/281050