Query-Performance

格式錯誤的直方圖導致對嵌套循環的錯誤估計

  • August 27, 2020

我們在 SQL Server 2016 SP2 CU12 Enterprise 上有一個查詢,其中查詢優化器估計只有 1 行會來自嵌套循環連接運算符,實際上有 108501 行返回。這導致Sort操作員溢出到 TempDB。

對 Nested Loops Join 的內部(索引搜尋)和外部輸入(索引搜尋)的估計是正確的。

我添加了跟踪標誌 2363(選擇性計算)和 3604(將輸出重定向到消息視窗),在這裡我發現有一個格式錯誤的直方圖:

Plan for computation:

  CSelCalcExpressionComparedToExpression( QCOL: [Object1].Column1 x_cmpEq QCOL: [Object3].Column18 )

Loaded histogram for column QCOL: [Object1].Column1 from stats with id 1  *** WARNING: badly-formed histogram ***

Loaded histogram for column QCOL: [Object3].Column18 from stats with id 9

Selectivity: 1.07973e-009

Stats collection generated: 

  CStCollJoin(ID=4, CARD=1 x_jtLeftSemi)

      CStCollBaseTable(ID=1, CARD=5.01133e+007 TBL: Schema1.Table2 AS TBL: AA)

      CStCollFilter(ID=3, CARD=108210)

          CStCollBaseTable(ID=2, CARD=2.00511e+006 TBL: Schema1.Table1 AS TBL: A)

End selectivity computation

以上只是部分輸出,全文可以看這裡

當我用全掃描更新格式錯誤的直方圖時,估計是正確的(沒有全掃描,這個問題沒有解決)。

但是,只要在表中插入一條記錄,直方圖就會再次形成錯誤。

查詢計劃(帶有錯誤直方圖)可以在這裡找到,在這裡您可以找到更新統計資訊後的查詢計劃。

未啟用查詢優化器修復。當我為此查詢啟用原始基數估計器時,使用跟踪標誌 9481,我得到與更新統計資訊後相同的查詢計劃。

什麼會導致直方圖格式不正確?

有沒有辦法解決這個問題?

我嘗試了該PERSIST_SAMPLE_PERCENT選項,但沒有任何區別,直方圖的格式也很糟糕。

什麼會導致直方圖格式不正確?

這當然是一個錯誤。當 SQL Server 縮放不精確的直方圖並遇到每個鍵顯然具有多個值的唯一列時,就會發生這種情況。這在SQL Server 2016 中得到了部分解決,但在我看來,完整的修復只存在於 SQL Server 2017 上(我在 CU21 上測試過)。

有沒有辦法解決這個問題?

使用FULLSCAN統計資訊,原始 CE,升級到 SQL Server 2017,或等待完整修復被反向移植。

展示

使用我的標準 Numbers 表創建腳本:

DROP TABLE IF EXISTS dbo.Numbers;
GO
-- Itzik Ben-Gan's row generator
WITH
 L0   AS (SELECT 1 AS c UNION ALL SELECT 1),
 L1   AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
 L2   AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
 L3   AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
 L4   AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
 L5   AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
 Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
SELECT
   -- Destination column type integer NOT NULL
   ISNULL(CONVERT(integer, N.n), 0) AS n
INTO dbo.Numbers
FROM Nums AS N
WHERE N.n >= 1
AND N.n <= 10 * 1000 * 1000
OPTION (MAXDOP 1);
GO
ALTER TABLE dbo.Numbers
ADD CONSTRAINT PK_Numbers_n
PRIMARY KEY CLUSTERED (n)
WITH (SORT_IN_TEMPDB = ON, MAXDOP = 1, FILLFACTOR = 100);

更新要採樣的統計資訊:

UPDATE STATISTICS dbo.Numbers;

瑣碎查詢(僅請求估計計劃):

SELECT * 
FROM dbo.Numbers AS N1
WHERE EXISTS
(
   SELECT * 
   FROM dbo.Numbers AS N2
   WHERE N2.n = N1.n
)
OPTION (QUERYTRACEON 3604, QUERYTRACEON 2363);

跟踪輸出包括:

Begin selectivity computation

Input tree:

 LogOp_LeftSemiJoin
     CStCollBaseTable(ID=1, CARD=1e+007 TBL: dbo.Numbers AS TBL: N1)
     CStCollBaseTable(ID=2, CARD=1e+007 TBL: dbo.Numbers AS TBL: N2)
     ScaOp_Comp x_cmpEq
         ScaOp_Identifier QCOL: [N2].n
         ScaOp_Identifier QCOL: [N1].n

Plan for computation:
 CSelCalcExpressionComparedToExpression( QCOL: [N1].n x_cmpEq QCOL: [N2].n )

Loaded histogram for column QCOL: [N1].n from stats with id 1
   *** WARNING: badly-formed histogram ***

Selectivity: 1

在 Microsoft SQL Server 2016 (SP2-CU14) 上驗證。

通過以下方式獲得的統計直方圖:

DBCC SHOW_STATISTICS 
   (Numbers, [PK_Numbers_n])
   WITH HISTOGRAM;

有如下條目:

2016年直方圖

在 SQL Server 2017 上,直方圖為:

2017年直方圖

請注意,這些EQ_ROWS步驟是預先調整的,而不是依靠 CE 來完成。

引用自:https://dba.stackexchange.com/questions/270270