對於最近的行,在 DateTime 上使用 RANK 查詢優化問題
我的視圖有問題,該視圖旨在儲存同步數據,用於 iOS 應用程序,將數據來回同步到 Web 服務。XML 欄位用於儲存用於轉置為 PDF 文件的結構化數據和 iOS 對象。對該表的頻繁寫入和讀取。
該表有大約 60 萬條記錄,由於兩個大型 XML 欄位,數據使用率非常高。數據庫表如下所示:
CREATE TABLE [dbo].[SyncData]( [Id] [int] IDENTITY(1,1) NOT NULL, [CustomerId] [int] NOT NULL, [CategoryId] [int] NOT NULL, [SyncXml] [xml] NOT NULL, [SyncDump] [xml] NULL, [DateCreated] [datetime] NOT NULL CONSTRAINT [DF_SyncData_DateCreated] DEFAULT (getdate()), [DateUpdated] [datetime] NOT NULL CONSTRAINT [DF_SyncData_DateUpdated] DEFAULT (getdate()), [DELETED_FLAG] [bit] NOT NULL CONSTRAINT [DF_SyncData_DELETED_FLAG] DEFAULT ((0)), CONSTRAINT [PK_SyncData] PRIMARY KEY CLUSTERED ( [Id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] GO
CustomerId 和 CategoryId 列具有外鍵。
Web 服務返回按類別組織的特定客戶的最新同步數據:
CREATE VIEW [dbo].[vw_SyncDataRecent] AS WITH Ranked AS ( SELECT Id, CustomerId, CategoryId, ROW_NUMBER() OVER ( PARTITION BY Id, CustomerId ORDER BY DateCreated DESC) AS rn FROM SyncData AS cb ) SELECT Id, CustomerId, CategoryId FROM Ranked WHERE rn = 1 GO
該表已經有幾個索引,它們可能是剩餘的(該數據庫自 2004 年以來一直在生產,並且已經添加了許多功能),所以我不確定它們是否還需要:
CREATE NONCLUSTERED INDEX [IX_SyncData_CustomerId] ON [dbo].[SyncData] ( [CustomerId] ASC ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO CREATE NONCLUSTERED INDEX [IX_SyncData_CustomerId_CategoryId] ON [dbo].[SyncData] ( [CustomerId] ASC, [CategoryId] ASC ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO CREATE NONCLUSTERED INDEX [IX_SyncData_CustomerId_CategoryId_DELETED_FLAG] ON [dbo].[SyncData] ( [CustomerId] ASC, [CategoryId] ASC, [DELETED_FLAG] ASC ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO
通常使用 CustomerId 上的 WHERE 子句簡單地呼叫視圖:
SELECT * FROM vw_SyncDataRecent WHERE CustomerId = 600528
結果如下所示:
| Id | CustomerId | CategoryId | |----- |------------ |------------ | | 189 | 600528 | 45 | | 188 | 600528 | 46 | | 191 | 600528 | 48 | | 192 | 600528 | 49 | | 190 | 600528 | 53 |
這些操作的性能很糟糕,最糟糕的是它對數據庫伺服器產生了嚴重影響,始終以 100% 的速度執行(這個視圖查詢似乎是最糟糕的罪魁禍首之一)。有沒有更好更有效的方法來做到這一點。我也剛剛意識到我不需要任何被標記為邏輯刪除的行(DELETED_FLAG),這似乎被忽略了,但我想這是另一回事。
我最初的想法是視圖本身是主要問題,因為它對整個數據集進行排名。此外,沒有包含 DateCreated 列的索引,但我不確定這對排序有多大用處。
由於該表很大,並且在生產中積極使用,我不想對錶本身進行大的更改,但我很樂意更改視圖或將其切換為使用儲存過程。
我的直覺就是這樣做,並創建一個儲存過程:
CREATE PROCEDURE [dbo].[_SyncData_Recent] ( @CustomerId int = null ) AS BEGIN WITH Ranked AS ( SELECT Id, CustomerId, CategoryId, ROW_NUMBER() OVER (PARTITION BY CustomerId, CategoryId ORDER BY DateCreated DESC) AS rn FROM SyncData AS cb WHERE CustomerId = @CustomerId AND DELETED_FLAG = 0 ) SELECT Id, CustomerId, CategoryId FROM Ranked WHERE rn = 1 AND CustomerId = @CustomerId END GO
想法?
我最初的想法是視圖本身是主要問題,因為它對整個數據集進行排名。
就是這樣,視窗函式應用於整個數據集,然後由
CustomerId
和過濾rn=1
。有關更多資訊,請參閱我之前的答案。僅當
CustomerId
數據到達過濾器運算符時才應用過濾器謂詞:您可以
OPTION(RECOMPILE)
使用過濾器添加到選擇查詢中以CustomerId
在執行時查看並在通過視窗函式傳遞數據之前進行過濾SELECT * FROM vw_SyncDataRecent WHERE CustomerId = 600528 OPTION(RECOMPILE);
我的其他答案中提到的其他方法不使用視圖,而是使用 tvf,不使用視窗函式進行過濾,儲存過程,……
TVF 範例
CREATE FUNCTION dbo.[Fnc_SyncDataRecent] ( @P1 INT ) RETURNS TABLE WITH SCHEMABINDING AS RETURN WITH Ranked AS ( SELECT Id, CustomerId, CategoryId, ROW_NUMBER() OVER ( PARTITION BY Id, CustomerId ORDER BY DateCreated DESC) AS rn FROM dbo.SyncData AS cb ) SELECT Id, CustomerId, CategoryId FROM Ranked WHERE rn = 1 and CustomerId = @P1 SELECT * FROM dbo.[Fnc_SyncDataRecent](600528);
我還要添加這個索引:
CREATE INDEX IX_CustomerID3 ON dbo.SyncData([CustomerId]) INCLUDE(CategoryId,ID,DateCreated);
編輯 您的儲存過程
CREATE PROCEDURE [dbo].[_SyncData_Recent] ( @CustomerId int = null ) AS BEGIN WITH Ranked AS ( SELECT Id, CustomerId, CategoryId, ROW_NUMBER() OVER (PARTITION BY CustomerId, CategoryId ORDER BY DateCreated DESC) AS rn FROM SyncData AS cb WHERE CustomerId = @CustomerId AND DELETED_FLAG = 0 ) SELECT Id, CustomerId, CategoryId FROM Ranked WHERE rn = 1 AND CustomerId = @CustomerId END GO EXEC [dbo].[_SyncData_Recent] 600528;
也可以應用更早的過濾,但我會添加這個索引:
CREATE INDEX IX_CustomerID_CategoryId_DateCreated_ ON dbo.SyncData([CustomerId],CategoryId,DateCreated DESC) INCLUDE(DELETED_FLAG,ID) WHERE DELETED_FLAG = 0;
刪除排序運算符。
最後一個過濾器
WHERE rn = 1 AND CustomerId = @CustomerId
可能只是WHERE rn = 1
.