聚集索引掃描對性能不利嗎?
我有一個複雜的查詢(見下文),它在大型數據庫上表現不佳。我現在通過刪除連接來分析查詢,查看實際計劃並逐步添加連接。我看到聚集索引掃描具有高 I/O 成本和 100% 的 CPU 成本進行排序。這個查詢可以更優化嗎?
WITH currentPeriod AS (SELECT ROW_NUMBER() OVER (PARTITION BY tp.[StartDate] ORDER BY m.[Created] DESC) AS rowIndex , m.[Created], m.[MessageId] , tp.[StartDate], tp.[EndDate], tp.[IsCorrection] FROM [TimePeriods] tp JOIN [Messages] m on m.Id = tp.[MessageId] ) SELECT [Created], [StartDate], [EndDate], [IsCorrection] FROM CurrentPeriod cur WHERE rowIndex = 1 OPTION (RECOMPILE)
我已經將 Message 表上的 PK 索引從 Clustered 更改為 Non-clustered,並在 Created 上添加了一個 Clustered 索引。
編輯
這是上述查詢計劃的連結:https
://www.brentozar.com/pastetheplan/?id=Hkw-4TxPB 這是完整查詢的計劃:https ://www.brentozar.com/pastetheplan/ ?id=SyYrDplPH背景(可以跳過)
我不是 DBA,我只是最了解數據庫的人,並不意味著我對數據庫了解很多。
使用自定義上傳應用程序填充表格。上傳的文件是 XML 文件。每個文件都是一個消息,一個消息可以有多個 TimePeriod。消息還可以更正早期消息的 TimePeriods。我只對上次更新的時期感興趣。這就是我使用 ROW_NUMBER OVER 和 PARTITION BY 的原因。查詢的結果數據是正確的。這已經得到驗證。
完整的查詢:
WITH currentPeriod AS (SELECT ROW_NUMBER() OVER (PARTITION BY ce.[ReferenceCode], tp.[StartDate], e.[Id], ep.[ContractNumber] ORDER BY m.[Created] DESC, ep.[ContractNumber] DESC, IIF(fp.[LbTab] = N'010', 1, 0), fp.[DatAanv] DESC) AS rowIndex , ce.[ReferenceCode], ce.[Name] as [EntityName] , e.[SocialSecurityNumber], e.[EmployeeNumber], e.[Initials], e.[Firstname], e.[Prefix], e.[Surname], e.[BirthDate] , CAST((DATEDIFF(DAY, e.[BirthDate], ep.[HireDate]) / 365.24) as FLOAT) [AgeHired], e.[Gender], e.[PhoneNumber], e.[PhoneNumber2], e.[Email] , TRIM(CONCAT_WS(' ', a.[Street], COALESCE(a.[Number], COALESCE(a.[NumberString], '')), COALESCE(a.[NumberExtension], ''))) [StreetLine] , a.[ZipCode], a.[City], a.[CountryCode] , ep.[HireDate], ep.[DepartureDate] , f.[AantVerlU], f.[LnSv], f.[AantSV] , fp.[IndAvrLkvOudrWn], fp.[IndAvrLkvAgWn], fp.[IndAvrLkvDgBafSb], fp.[IndAvrLkvHpAgWn], fp.[DatAanv] , fp.[LbTab], fp.[IndWAO], fp.[IndWW], fp.[IndZW] , tp.[StartDate], tp.[EndDate], tp.[IsCorrection] FROM [TimePeriods] tp JOIN [Messages] m on m.Id = tp.[MessageId] JOIN [CorporateEntities] ce on m.[CorporateEntityId] = ce.Id JOIN [Financials] f on f.[TimePeriodId] = tp.Id JOIN [FinancialPeriods] fp on fp.[FinancialId] = f.Id JOIN [EmployementPeriods] ep on ep.Id = f.[EmployementPeriodId] JOIN [Employees] e on e.Id = ep.[EmployeeId] JOIN [Addresses] a on a.Id = e.Id WHERE NOT EXISTS (SELECT 1 FROM [dbo].[Withdrawals] w JOIN [dbo].[TimePeriods] tp2 ON tp2.Id = w.[TimePeriodId] JOIN [dbo].[Messages] m2 ON m2.id = tp2.MessageId AND m2.[CorporateEntityId] = m.[CorporateEntityId] WHERE w.SofiNr is not null AND w.SofiNr = e.[SocialSecurityNumber] AND tp2.StartDate = tp.[StartDate] AND w.[NumIv] = ep.[ContractNumber] AND m2.[Created] > m.[Created] AND CONVERT(date, m.[Created]) <= CONVERT(date,@messageDate)) AND CONVERT(date, tp.StartDate) >= CONVERT(date, @date26Param) AND CONVERT(date, tp.EndDate) <= CONVERT(date, @endDate) AND CONVERT(date, m.[Created]) <= CONVERT(date, @messageDate)) SELECT * FROM CurrentPeriod cur WHERE rowIndex = 1 AND [AantVerlU] > 0 AND [LbTab] != N'010' AND (SELECT count(1) FROM CurrentPeriod cur2 WHERE cur2.[StartDate] > DATEADD(week, -26, cur.[StartDate]) AND cur2.[StartDate] < cur.[StartDate] AND COALESCE(cur2.[SocialSecurityNumber], LEFT(cur2.[ReferenceCode], 9) + '-' + cur2.[EmployeeNumber]) = COALESCE(cur.[SocialSecurityNumber], LEFT(cur.[ReferenceCode], 9) + '-' + cur.Employeenumber) AND cur2.[AantVerlU] > 0 AND cur2.[LbTab] != N'010') = 0 AND StartDate >= @startDate ORDER BY [SocialSecurityNumber], [StartDate];
優化器可能很難優化;將 CTE 部分假離線到 #Table 並查詢它。
問題:聚集索引掃描對性能不利?嗯,不一定。意味著您沒有查詢可以使用的更好(更具選擇性)的索引。對於較大的表也表現更差……但它可能更糟糕 - 堆(根本沒有數據順序)。
AND (SELECT count(1) FROM CurrentPeriod cur2 WHERE cur2.[StartDate] > DATEADD(week, -26, cur.[StartDate]) AND cur2.[StartDate] < cur.[StartDate] AND COALESCE(cur2.[SocialSecurityNumber], LEFT(cur2.[ReferenceCode], 9) + '-' + cur2.[EmployeeNumber]) = COALESCE(cur.[SocialSecurityNumber], LEFT(cur.[ReferenceCode], 9) + '-' + cur.Employeenumber) AND cur2.[AantVerlU] > 0 AND cur2.[LbTab] != N'010') = 0
您在此處的邏輯規定了必須計算日期是否適用的有效性的每一行……您可能還希望將連接條件的結果儲存在 #table 中
AND COALESCE(cur2.[SocialSecurityNumber], LEFT(cur2.[ReferenceCode], 9) + '-' + cur2.[EmployeeNumber]) = COALESCE(cur.[SocialSecurityNumber], LEFT(cur.[ReferenceCode], 9) + '-' + cur.Employeenumber)
SQL 是面向集合的,在大量資訊中非常高效;但是在大量表格中以行級別為基礎的詳細資訊…… RBAR 地獄的咒語 - 痛苦的行!