查詢性能問題
使用 Demo 從這裡重現我的問題,表結構更改如下,並將展示分區函式修改為 datetime
CREATE TABLE [dbo].[DemoPartitionedTable]( [DemoID] [int] IDENTITY(1,1) NOT NULL, [SomeData] [sysname] NOT NULL, [CaptureDate] [datetime] NULL, CONSTRAINT [PK_DemoPartitionedTable] UNIQUE NONCLUSTERED ( [DemoID] ASC, [CaptureDate] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ) GO
使用者執行如下查詢(查詢 1)
SELECT [DemoID], [SomeData], [CaptureDate] FROM [dbo].[DemoPartitionedTable] WHERE (1=1) and (1=1) and CONVERT(varchar(10),CaptureDate,112) between 20190912 and 20190912
計劃是https://www.brentozar.com/pastetheplan/?id=r1veZhovH
返回 500 萬行大約需要 4 小時
如果我使用如下分區鍵來改進上述程式碼(查詢 2)
SELECT [DemoID], [SomeData], [CaptureDate] FROM [dbo].[DemoPartitionedTable] WHERE (1=1) and (1=1) and CaptureDate>= cast('2019-09-12' as date) and CaptureDate< cast('2019-09-13' as date)
這在大約 40 分鐘內完成
計劃這個https://www.brentozar.com/pastetheplan/?id=BkDYl3svB
現在,如果我在下面執行,它將在 10 分鐘內完成並根據需要直接掃描當天的特定分區,而前 2 個查詢掃描所有分區(查詢 3)
SELECT [DemoID], [SomeData], [CaptureDate] FROM [dbo].[DemoPartitionedTable] WHERE (1=1) AND (1=1) AND $partition.DemoPartitionFunction(CaptureDate)>=$partition.DemoPartitionFunction('09/12/2019') AND $partition.DemoPartitionFunction(CaptureDate)<$partition.DemoPartitionFunction('09/13/2019')
計劃https://www.brentozar.com/pastetheplan/?id=S1j9CjsPB
上面的問題是使用者必須在查詢之前找到PF名稱,如我的查詢3中使用的那樣。為什麼查詢2不能執行類似於查詢並使用分區消除?有沒有一種方法可以對查詢進行編碼以查找 PF 名稱而不是在查詢中進行硬編碼?
請指教
問題中的表沒有分區。我假設預期的定義是:
CREATE PARTITION FUNCTION DemoPartitionFunction (datetime) AS RANGE RIGHT FOR VALUES (DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -7), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -6), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -5), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -4), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -3), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -2), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -1), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 0), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 1), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 2), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 3), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 4), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 5), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 6), DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 7)); CREATE PARTITION SCHEME DemoPartitionScheme AS PARTITION DemoPartitionFunction ALL TO ([DEFAULT]); CREATE TABLE [dbo].[DemoPartitionedTable]( [DemoID] [int] IDENTITY(1,1) NOT NULL, [SomeData] [sysname] NOT NULL, [CaptureDate] [datetime] NULL, CONSTRAINT [PK_DemoPartitionedTable] UNIQUE NONCLUSTERED ( [DemoID] ASC, [CaptureDate] ASC ) ON DemoPartitionScheme(CaptureDate) ) ON DemoPartitionScheme(CaptureDate);
查詢 1
SELECT [DemoID], [SomeData], [CaptureDate] FROM [dbo].[DemoPartitionedTable] WHERE (1=1) and (1=1) and CONVERT(varchar(10),CaptureDate,112) between 20190912 and 20190912
除了多餘的
1=1
謂詞之外,日期比較是一堆不匹配的類型。列datetime
CaptureDate被轉換為varchar(10)
沒有明確的樣式,然後與integers
.堆掃描上的殘差謂詞反映了這種混亂:
CONVERT_IMPLICIT(int,CONVERT(varchar(10),[dbo].[DemoPartitionedTable].[CaptureDate],112),0)>=(20190912) AND CONVERT_IMPLICIT(int,CONVERT(varchar(10),[dbo].[DemoPartitionedTable].[CaptureDate],112),0)<=(20190912)
這些謂詞不能使用索引,也不能幫助 SQL Server 消除分區。
查詢 2
SELECT [DemoID], [SomeData], [CaptureDate] FROM [dbo].[DemoPartitionedTable] WHERE (1=1) and (1=1) and CaptureDate>= cast('2019-09-12' as date) and CaptureDate< cast('2019-09-13' as date)
這個好一點。它避免轉換列,但出於某種原因選擇將
datetime
CaptureDate列與date
數據類型進行比較。這些是不同的數據類型。計劃中沒有關於類型轉換的警告,因為 SQL Server 可以在編譯期間將date
類型化文字轉換為。datetime
然而,類型不一致足以阻止分區消除。正確編寫查詢
SELECT DPT.DemoID, DPT.SomeData, DPT.CaptureDate FROM dbo.DemoPartitionedTable AS DPT WHERE 1 = 1 AND DPT.CaptureDate >= CONVERT(datetime, '20190912', 112) AND DPT.CaptureDate < CONVERT(datetime, '20190913', 112);
上面的查詢將
datetime
CaptureDate列與datetime
文字進行比較。注意顯式樣式的使用,因此 SQL Server 知道字元串的格式。在處理日期和時間時,您通常應該更喜歡CONVERT
使用正確的樣式。CAST
1 = 1
謂詞是為了避免簡單的參數化。此查詢允許 SQL Server 執行分區消除,如堆表上的Seek Predicate所示:
Seek Keys[1]: Prefix: PtnId1001 = Scalar Operator((1))
仍然有一個殘差謂詞應用於分區中的每一行:
[dbo].[DemoPartitionedTable].[CaptureDate] as [DPT].[CaptureDate]>='2019-09-12 00:00:00.000' AND [dbo].[DemoPartitionedTable].[CaptureDate] as [DPT].[CaptureDate]<'2019-09-13 00:00:00.000'
這是因為該表缺少以CaptureDate作為前導鍵的覆蓋索引。如果您希望查詢盡可能快,您應該創建一個這樣的索引。
有關更多詳細資訊,請參閱我的文章為什麼分區消除不起作用?
編寫查詢時請注意數據類型。