格式錯誤的直方圖導致對嵌套循環的錯誤估計
我們在 SQL Server 2016 SP2 CU12 Enterprise 上有一個查詢,其中查詢優化器估計只有 1 行會來自嵌套循環連接運算符,實際上有 108501 行返回。這導致
Sort
操作員溢出到 TempDB。對 Nested Loops Join 的內部(索引搜尋)和外部輸入(索引搜尋)的估計是正確的。
我添加了跟踪標誌 2363(選擇性計算)和 3604(將輸出重定向到消息視窗),在這裡我發現有一個格式錯誤的直方圖:
Plan for computation: CSelCalcExpressionComparedToExpression( QCOL: [Object1].Column1 x_cmpEq QCOL: [Object3].Column18 ) Loaded histogram for column QCOL: [Object1].Column1 from stats with id 1 *** WARNING: badly-formed histogram *** Loaded histogram for column QCOL: [Object3].Column18 from stats with id 9 Selectivity: 1.07973e-009 Stats collection generated: CStCollJoin(ID=4, CARD=1 x_jtLeftSemi) CStCollBaseTable(ID=1, CARD=5.01133e+007 TBL: Schema1.Table2 AS TBL: AA) CStCollFilter(ID=3, CARD=108210) CStCollBaseTable(ID=2, CARD=2.00511e+006 TBL: Schema1.Table1 AS TBL: A) End selectivity computation
以上只是部分輸出,全文可以看這裡
當我用全掃描更新格式錯誤的直方圖時,估計是正確的(沒有全掃描,這個問題沒有解決)。
但是,只要在表中插入一條記錄,直方圖就會再次形成錯誤。
查詢計劃(帶有錯誤直方圖)可以在這裡找到,在這裡您可以找到更新統計資訊後的查詢計劃。
未啟用查詢優化器修復。當我為此查詢啟用原始基數估計器時,使用跟踪標誌 9481,我得到與更新統計資訊後相同的查詢計劃。
什麼會導致直方圖格式不正確?
有沒有辦法解決這個問題?
我嘗試了該
PERSIST_SAMPLE_PERCENT
選項,但沒有任何區別,直方圖的格式也很糟糕。
什麼會導致直方圖格式不正確?
這當然是一個錯誤。當 SQL Server 縮放不精確的直方圖並遇到每個鍵顯然具有多個值的唯一列時,就會發生這種情況。這在SQL Server 2016 中得到了部分解決,但在我看來,完整的修復只存在於 SQL Server 2017 上(我在 CU21 上測試過)。
有沒有辦法解決這個問題?
使用
FULLSCAN
統計資訊,原始 CE,升級到 SQL Server 2017,或等待完整修復被反向移植。展示
使用我的標準 Numbers 表創建腳本:
DROP TABLE IF EXISTS dbo.Numbers; GO -- Itzik Ben-Gan's row generator WITH L0 AS (SELECT 1 AS c UNION ALL SELECT 1), L1 AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5) SELECT -- Destination column type integer NOT NULL ISNULL(CONVERT(integer, N.n), 0) AS n INTO dbo.Numbers FROM Nums AS N WHERE N.n >= 1 AND N.n <= 10 * 1000 * 1000 OPTION (MAXDOP 1); GO ALTER TABLE dbo.Numbers ADD CONSTRAINT PK_Numbers_n PRIMARY KEY CLUSTERED (n) WITH (SORT_IN_TEMPDB = ON, MAXDOP = 1, FILLFACTOR = 100);
更新要採樣的統計資訊:
UPDATE STATISTICS dbo.Numbers;
瑣碎查詢(僅請求估計計劃):
SELECT * FROM dbo.Numbers AS N1 WHERE EXISTS ( SELECT * FROM dbo.Numbers AS N2 WHERE N2.n = N1.n ) OPTION (QUERYTRACEON 3604, QUERYTRACEON 2363);
跟踪輸出包括:
Begin selectivity computation Input tree: LogOp_LeftSemiJoin CStCollBaseTable(ID=1, CARD=1e+007 TBL: dbo.Numbers AS TBL: N1) CStCollBaseTable(ID=2, CARD=1e+007 TBL: dbo.Numbers AS TBL: N2) ScaOp_Comp x_cmpEq ScaOp_Identifier QCOL: [N2].n ScaOp_Identifier QCOL: [N1].n Plan for computation: CSelCalcExpressionComparedToExpression( QCOL: [N1].n x_cmpEq QCOL: [N2].n ) Loaded histogram for column QCOL: [N1].n from stats with id 1 *** WARNING: badly-formed histogram *** Selectivity: 1
在 Microsoft SQL Server 2016 (SP2-CU14) 上驗證。
通過以下方式獲得的統計直方圖:
DBCC SHOW_STATISTICS (Numbers, [PK_Numbers_n]) WITH HISTOGRAM;
有如下條目:
在 SQL Server 2017 上,直方圖為:
請注意,這些
EQ_ROWS
步驟是預先調整的,而不是依靠 CE 來完成。