Sql-Server

Optimize my query

  • July 21, 2020

Here is my query:

DECLARE @monthStartDate date;
DECLARE @monthEndDate date;
   
SET @monthStartDate = DATEFROMPARTS(2020,6,1);
SET @monthEndDate = EOMONTH(@monthStartDate);
    
SELECT @monthStartDate, MIN([Date]), @monthEndDate, GroupId, TypeId,
   COUNT(UserID) as 'CountTotal',
   SUM(CASE WHEN [IsA] = 1 THEN 1 ELSE 0 END) as 'CountA',
   SUM(CASE WHEN [IsB] = 1 THEN 1 ELSE 0 END) as 'CountB',
   SUM(CASE WHEN [IsC] = 1 THEN 1 ELSE 0 END) as 'CountC',
   SUM(CASE WHEN [IsEven] = 1 THEN 1 ELSE 0 END) as 'CountEven',
   SUM(CASE WHEN [IsOdd] = 1 THEN 1 ELSE 0 END) as 'CountOdd',
   SUM(CASE WHEN [IsEven] = 1 AND [IsA] = 1 THEN 1 ELSE 0 END) as 'CountAEven',
   SUM(CASE WHEN [IsOdd] = 1 AND [IsA] = 1 THEN 1 ELSE 0 END) as 'CountAOdd',
   SUM(CASE WHEN [IsEven] = 1 AND [IsC] = 1 THEN 1 ELSE 0 END) as 'CountCEven',
   SUM(CASE WHEN [IsOdd] = 1 AND [IsC] = 1 THEN 1 ELSE 0 END) as 'CountCOdd',
   SUM(CASE WHEN [IsEven] = 1 AND [IsB] = 1 THEN 1 ELSE 0 END) as 'CountBEven',
   SUM(CASE WHEN [IsOdd] = 1 AND [IsB] = 1 THEN 1 ELSE 0 END) as 'CountBOdd'
FROM MyTable
WHERE [Date] >= @monthStartDate AND [Date] <= @monthEndDate
GROUP BY [Date], GroupId, TypeId
ORDER BY Date, GroupId, TypeId

It selects from this table:

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [MyTable](
   [UserID] [varchar](32) NOT NULL,
   [Date] [date] NOT NULL,
   [GroupId] [varchar](16) NOT NULL,
   [IsEven] [bit] NOT NULL,
   [IsOdd] [bit] NOT NULL,
   [TypeId] [int] NOT NULL,
   [IsA] [bit] NOT NULL,
   [IsB] [bit] NOT NULL,
   [IsC] [bit] NOT NULL,
CONSTRAINT [PK_Http_Uniques] PRIMARY KEY CLUSTERED 
(
   [Date] DESC,
   [GroupId] ASC,
   [UserID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

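-- Assumption: the constraint as posted referenced [IsIn] and [IsOut], which are not columns of this table; [IsEven] and [IsOdd] are assumed to be the intended columns.
ALTER TABLE [MyTable]  WITH CHECK ADD  CONSTRAINT [CK_MyTable_Period] CHECK  (([IsEven]<>(0) OR [IsOdd]<>(0)))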
GO

ALTER TABLE [MyTable] CHECK CONSTRAINT [CK_MyTable_Period]
GO

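It works, but the table will eventually hold millions of rows, and I am understandably wary about performance.

Do you see a way to optimize this query? Thanks.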

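If you are trying to optimize aggregate queries in SQL Server, the best tool available to you is the columnstore index. Columnstore indexes are designed specifically for aggregating this kind of data, and that is especially true once you are talking about millions of rows or more.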

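The original question does not say which version of SQL Server you are running, but if you are on 2016 or later you can get the best combination of storage and indexing. If most of the queries against your table are analytical in nature, you can store the table as a clustered columnstore and then add nonclustered indexes to support point lookups. If, on the other hand, you still see mostly OLTP queries with only the occasional analytical query, you can add a nonclustered columnstore index to the table and keep it stored as a clustered (rowstore) index. You can read more about choosing the right columnstore index here.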
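As a rough illustration of the two layouts described above, here is a minimal sketch against the MyTable definition from the question. The index names (NCCI_MyTable_Agg, CCI_MyTable) are made up for this example, and the second option assumes SQL Server 2016 or later, where a clustered columnstore table can also carry a nonclustered primary key. Treat it as a starting point to test, not a definitive design.

```sql
-- Option 1 (mostly OLTP, occasional analytics):
-- keep the existing clustered primary key and add a nonclustered
-- columnstore index covering the columns the aggregate query reads.
-- NCCI_MyTable_Agg is a hypothetical name.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_MyTable_Agg
ON MyTable ([Date], GroupId, TypeId, UserID, IsA, IsB, IsC, IsEven, IsOdd);
GO

-- Option 2 (mostly analytics), shown commented out because it replaces
-- the table's storage. Assumes SQL Server 2016+.
-- CCI_MyTable is a hypothetical name.
--
-- ALTER TABLE MyTable DROP CONSTRAINT PK_Http_Uniques;
-- CREATE CLUSTERED COLUMNSTORE INDEX CCI_MyTable ON MyTable;
-- ALTER TABLE MyTable ADD CONSTRAINT PK_Http_Uniques
--     PRIMARY KEY NONCLUSTERED ([Date] DESC, GroupId ASC, UserID ASC);
-- GO
```

With either option in place, the query from the question can run unchanged; comparing the execution plans before and after shows whether the optimizer actually chooses the columnstore index.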

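Because the initial implementation of columnstore indexes was less than ideal (at first they were read-only), many people have dropped them from their toolbox. However, the query you are looking at is exactly the kind they were built to solve. Test it. You should see a substantial improvement in query performance, even with the local variables. (As has already been pointed out to you, local variables can sometimes cause performance problems because of how they affect row estimates. In many of those cases, parameter values or hard-coded values may perform better, because better row estimates lead to a better plan.)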
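To make the remark about local variables concrete, here is one possible way to run the same date filter with true parameters instead of local variables, using sys.sp_executesql. The select list is shortened for readability, and the variable name @sql is just a placeholder for this sketch. Whether it beats the local-variable version depends on your data, so compare the estimated row counts in both plans.

```sql
-- The statement text uses parameters, so the optimizer can base its
-- row estimates on the actual values passed in, rather than on the
-- generic estimate it uses for local variables.
DECLARE @sql nvarchar(max) = N'
SELECT [Date], GroupId, TypeId,
       COUNT(UserID) AS CountTotal,
       SUM(CASE WHEN [IsA] = 1 THEN 1 ELSE 0 END) AS CountA
FROM MyTable
WHERE [Date] >= @monthStartDate AND [Date] <= @monthEndDate
GROUP BY [Date], GroupId, TypeId
ORDER BY [Date], GroupId, TypeId;';

EXEC sys.sp_executesql
    @sql,
    N'@monthStartDate date, @monthEndDate date',
    @monthStartDate = '20200601',
    @monthEndDate   = '20200630';
```

Wrapping the query in a stored procedure that takes the two dates as parameters would have a similar effect.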

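Indexing these columns will be key:

[Date], GroupId, TypeId

However, whatever the "typical" date range of your queries is, I would size the partitions to match it. That is, if you usually query one month of data and you partition by month, then no matter how much data you have, you will only ever read one partition (that month's rows). As long as the table is partitioned on the same column the query filters on, the optimizer will pick just the relevant partition. Whether the table holds 10 rows or 100 million, you will see the same or similar performance.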
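A minimal sketch of what monthly partitioning plus the suggested index could look like for MyTable follows. The names pfMyTableMonthly, psMyTableMonthly and IX_MyTable_Date_Group_Type are invented for this example, the boundary dates are placeholders, and everything is placed on the PRIMARY filegroup for simplicity. Note that table partitioning requires Enterprise edition on versions before SQL Server 2016 SP1.

```sql
-- One partition per calendar month; RANGE RIGHT puts each boundary
-- date into the partition that starts on that date.
CREATE PARTITION FUNCTION pfMyTableMonthly (date)
AS RANGE RIGHT FOR VALUES ('20200501', '20200601', '20200701', '20200801');
GO

-- Map every partition to the PRIMARY filegroup (an assumption made
-- to keep the sketch short).
CREATE PARTITION SCHEME psMyTableMonthly
AS PARTITION pfMyTableMonthly ALL TO ([PRIMARY]);
GO

-- Rebuild the clustered primary key on the partition scheme.
-- [Date] is already part of the key, as a partitioned unique index requires.
ALTER TABLE MyTable DROP CONSTRAINT PK_Http_Uniques;
ALTER TABLE MyTable ADD CONSTRAINT PK_Http_Uniques
    PRIMARY KEY CLUSTERED ([Date] DESC, GroupId ASC, UserID ASC)
    ON psMyTableMonthly ([Date]);
GO

-- Index on the grouping columns, with the flag columns included so the
-- aggregate query can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_MyTable_Date_Group_Type
ON MyTable ([Date], GroupId, TypeId)
INCLUDE (IsA, IsB, IsC, IsEven, IsOdd)
ON psMyTableMonthly ([Date]);
GO
```

With this layout, a query filtered to June 2020 on [Date] should only read the partition that begins at '20200601'. New boundaries need to be added over time (for example with ALTER PARTITION FUNCTION ... SPLIT RANGE) as later months arrive.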

Source: https://dba.stackexchange.com/questions/271324