Sql-Server

Does reindexing update statistics?

  • March 17, 2021

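I spent the past week on the MS10775A course, and one question came up that the trainer couldn't answer reliably:

Does reindexing update statistics?

We found discussions online arguing both that it does and that it doesn't.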

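When it comes to updating statistics, you can keep the following points in mind (copied from Rebuilding Indexes vs. Updating Statistics (Benjamin Nevarez)):

  1. By default, the UPDATE STATISTICS statement uses only a sample of the table's records. Using UPDATE STATISTICS WITH FULLSCAN will scan the entire table.
  2. By default, the UPDATE STATISTICS statement updates both index and column statistics. Using the COLUMNS option updates column statistics only. Using the INDEX option updates index statistics only.
  3. Rebuilding an index, for example with ALTER INDEX … REBUILD, also updates index statistics with the equivalent of WITH FULLSCAN, unless the table is partitioned, in which case the statistics are only sampled (applies to SQL Server 2012 and later).
  4. Statistics created manually with CREATE STATISTICS are not updated by any ALTER INDEX ... REBUILD operation, nor by ALTER TABLE ... REBUILD. ALTER TABLE ... REBUILD does update the statistics for the clustered index, if one is defined on the table being rebuilt.
  5. Reorganizing an index, for example with ALTER INDEX … REORGANIZE, does not update any statistics.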
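As a rough illustration of rules 1 and 2, the statements below sketch the main UPDATE STATISTICS variants. The table name dbo.YourTable is a placeholder that does not appear in the original post; substitute a table of your own.

```sql
-- dbo.YourTable is a hypothetical table name used only for illustration.

-- Rule 1: default behaviour, all statistics on the table, sampled.
UPDATE STATISTICS dbo.YourTable;

-- Rule 1: all statistics on the table, reading every row.
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;

-- Rule 2: column statistics only.
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN, COLUMNS;

-- Rule 2: index statistics only.
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN, INDEX;
```

FULLSCAN is used here only to make the scope options easy to compare; on a large table a sampled update may be the more practical choice.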

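The short answer is that you need UPDATE STATISTICS to update column statistics, and that an index rebuild updates only index statistics. You can force an update of all statistics on a table, including index statistics and manually created statistics, with UPDATE STATISTICS tablename WITH FULLSCAN;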

The following code illustrates the rules summarized above:

First, we'll create a table with a few columns and a clustered index:

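-- Run the demo in tempdb.
USE tempdb;

-- Drop any copy of the table left over from a previous run.
IF OBJECT_ID(N'dbo.SomeTable', N'U') IS NOT NULL
   DROP TABLE dbo.SomeTable;

-- pk and i each create an index, and therefore an index statistics object.
CREATE TABLE dbo.SomeTable
(
   rn int NOT NULL IDENTITY(1,1)
       CONSTRAINT pk
       PRIMARY KEY NONCLUSTERED
   , i int NOT NULL INDEX i
   , d sysname NOT NULL
) ON [PRIMARY] WITH (DATA_COMPRESSION = NONE);

-- The clustered index, with its own statistics object named cx.
CREATE UNIQUE CLUSTERED INDEX cx ON dbo.SomeTable (i, d);

-- Manually created column statistics on d, not tied to any index.
CREATE STATISTICS d ON dbo.SomeTable (d) WITH FULLSCAN;

-- Populate the table from the sys.syscolumns compatibility view.
INSERT INTO dbo.SomeTable (d, i)
SELECT c1.name, c1.id
FROM sys.syscolumns c1;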

This query shows the date each stats object was last updated:

SELECT ObjectName = sc.name + N'.' + o.name
   , StatsName = s.name
   , StatsDate = STATS_DATE(s.object_id, s.stats_id)
FROM sys.stats s
   INNER JOIN sys.objects o ON s.object_id = o.object_id
   INNER JOIN sys.schemas sc ON o.schema_id = sc.schema_id
WHERE sc.name = N'dbo'
   AND o.name = N'SomeTable';

The results show that no updates have happened yet, which is correct since we've only just created the table:

╔═══════════════╦═══════════╦═══════════╗
║ ObjectName    ║ StatsName ║ StatsDate ║
╠═══════════════╬═══════════╬═══════════╣
║ dbo.SomeTable ║ cx        ║ NULL      ║
║ dbo.SomeTable ║ i         ║ NULL      ║
║ dbo.SomeTable ║ pk        ║ NULL      ║
║ dbo.SomeTable ║ d         ║ NULL      ║
╚═══════════════╩═══════════╩═══════════╝

Let's rebuild the entire table and see whether the statistics get updated:

ALTER TABLE dbo.SomeTable REBUILD;

SELECT ObjectName = sc.name + N'.' + o.name
   , StatsName = s.name
   , StatsDate = STATS_DATE(s.object_id, s.stats_id)
FROM sys.stats s
   INNER JOIN sys.objects o ON s.object_id = o.object_id
   INNER JOIN sys.schemas sc ON o.schema_id = sc.schema_id
WHERE sc.name = N'dbo'
   AND o.name = N'SomeTable';
╔═══════════════╦═══════════╦═════════════════════════╗
║ ObjectName    ║ StatsName ║ StatsDate               ║
╠═══════════════╬═══════════╬═════════════════════════╣
║ dbo.SomeTable ║ cx        ║ 2018-09-17 14:09:13.590 ║
║ dbo.SomeTable ║ i         ║ NULL                    ║
║ dbo.SomeTable ║ pk        ║ NULL                    ║
║ dbo.SomeTable ║ d         ║ NULL                    ║
╚═══════════════╩═══════════╩═════════════════════════╝

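The results show that only the clustered index statistics were updated.

Next, we run a discrete UPDATE STATISTICS operation against the d statistics object: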

UPDATE STATISTICS dbo.SomeTable(d) WITH FULLSCAN;

SELECT ObjectName = sc.name + N'.' + o.name
   , StatsName = s.name
   , StatsDate = STATS_DATE(s.object_id, s.stats_id)
FROM sys.stats s
   INNER JOIN sys.objects o ON s.object_id = o.object_id
   INNER JOIN sys.schemas sc ON o.schema_id = sc.schema_id
WHERE sc.name = N'dbo'
   AND o.name = N'SomeTable';

As you can see, only the statistics for the d column have just been updated:

╔═══════════════╦═══════════╦═════════════════════════╗
║ ObjectName    ║ StatsName ║ StatsDate               ║
╠═══════════════╬═══════════╬═════════════════════════╣
║ dbo.SomeTable ║ cx        ║ 2018-09-17 14:09:13.590 ║
║ dbo.SomeTable ║ i         ║ NULL                    ║
║ dbo.SomeTable ║ pk        ║ NULL                    ║
║ dbo.SomeTable ║ d         ║ 2018-09-17 14:09:13.597 ║
╚═══════════════╩═══════════╩═════════════════════════╝
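STATS_DATE only reports when a stats object was last updated, not how it was built. As a hedged sketch, the query below also shows whether each stats object was created manually and how many rows were sampled. It assumes a build where sys.dm_db_stats_properties is available (SQL Server 2008 R2 SP2, 2012 SP1, or later); the column aliases are my own. OUTER APPLY is used so that stats objects the function returns nothing for still appear in the result.

```sql
-- Sketch: more detail per stats object on dbo.SomeTable.
SELECT StatsName = s.name
   , IsUserCreated = s.user_created      -- 1 for stats made with CREATE STATISTICS
   , IsAutoCreated = s.auto_created      -- 1 for stats the optimizer created itself
   , LastUpdated = sp.last_updated
   , TableRows = sp.rows                 -- rows in the table at the last update
   , RowsSampled = sp.rows_sampled       -- rows actually read for the statistics
   , ModsSinceUpdate = sp.modification_counter
FROM sys.stats s
   OUTER APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
WHERE s.object_id = OBJECT_ID(N'dbo.SomeTable', N'U');
```

For a stats object built with the equivalent of FULLSCAN, one would expect RowsSampled to equal TableRows, which gives a way to check rules 1 and 3 on your own tables.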

Now, we'll update the statistics for the entire table:

UPDATE STATISTICS dbo.SomeTable WITH FULLSCAN;

SELECT ObjectName = sc.name + N'.' + o.name
   , StatsName = s.name
   , StatsDate = STATS_DATE(s.object_id, s.stats_id)
FROM sys.stats s
   INNER JOIN sys.objects o ON s.object_id = o.object_id
   INNER JOIN sys.schemas sc ON o.schema_id = sc.schema_id
WHERE sc.name = N'dbo'
   AND o.name = N'SomeTable';
╔═══════════════╦═══════════╦═════════════════════════╗
║ ObjectName    ║ StatsName ║ StatsDate               ║
╠═══════════════╬═══════════╬═════════════════════════╣
║ dbo.SomeTable ║ cx        ║ 2018-09-17 14:09:13.600 ║
║ dbo.SomeTable ║ i         ║ 2018-09-17 14:09:13.600 ║
║ dbo.SomeTable ║ pk        ║ 2018-09-17 14:09:13.603 ║
║ dbo.SomeTable ║ d         ║ 2018-09-17 14:09:13.607 ║
╚═══════════════╩═══════════╩═════════════════════════╝

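As you can see, the only way to be sure that all statistics are updated is either to update each one manually or to run UPDATE STATISTICS table_name; against the whole table.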
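The demo rebuilds the table but never rebuilds the individual indexes, so rules 3 and 4 are not exercised directly. The following is a minimal sketch of how one might check them against the same dbo.SomeTable; the short delay is my own addition so that any new timestamps can be told apart from the earlier ones.

```sql
-- Pause so new timestamps are distinguishable from the previous update.
WAITFOR DELAY '00:00:02';

-- Rebuild every index on the table (cx, i and pk).
ALTER INDEX ALL ON dbo.SomeTable REBUILD;

-- Re-check when each stats object was last updated.
SELECT StatsName = s.name
   , StatsDate = STATS_DATE(s.object_id, s.stats_id)
FROM sys.stats s
WHERE s.object_id = OBJECT_ID(N'dbo.SomeTable', N'U');
```

If rules 3 and 4 hold, cx, i and pk should show a newer StatsDate, while the manually created d statistics should keep the date from the earlier UPDATE STATISTICS run.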

Source: https://dba.stackexchange.com/questions/48991