Sql-Server

Grouping rows within a range

  • May 25, 2019

Suppose I have the following (contrived) table of monitoring data:

RowID       DateStamp            Success         Error
-----       -----------------    -------         -----------
1001        5/24/2019 11:23am    1               None 
1004        5/24/2019 11:24am    1               None 
1005        5/24/2019 11:25am    1               None 
1009        5/24/2019 11:26am    0               SQL Timeout
1018        5/24/2019 11:27am    0               SQL Timeout 
1019        5/24/2019 11:28am    1               None 
1026        5/24/2019 11:29am    1               None 
1035        5/24/2019 11:30am    0               Planned Maintenance
1100        5/24/2019 11:31am    0               Planned Maintenance
1111        5/24/2019 11:32am    1               None 

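I want to group rows that have the same status and error, but only when they are adjacent to each other when ordered by date or ID (the IDs sort correctly but are not guaranteed to be consecutive), like this: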

Starting             Ending               Polls  Success   Error
-----------------    -----------------    -----  -------   -----------
5/24/2019 11:23am    5/24/2019 11:25am    3      1         None
5/24/2019 11:26am    5/24/2019 11:27am    2      0         SQL Timeout
5/24/2019 11:28am    5/24/2019 11:29am    2      1         None
5/24/2019 11:30am    5/24/2019 11:31am    2      0         Planned Maintenance
5/24/2019 11:32am    5/24/2019 11:32am    1      1         None

A simple GROUP BY doesn't work:

SELECT MIN(DateStamp) as Starting, MAX(DateStamp) as Ending, 
     Count(*) as Polls, Success, Error
FROM myTable
GROUP BY Success, Error
ORDER BY Starting

That query combines all rows with the same status/error into one group, regardless of when they occurred:

Starting             Ending               Polls  Success   Error
-----------------    -----------------    -----  -------   -----------
5/24/2019 11:23am    5/24/2019 11:32am    6      1         None
5/24/2019 11:26am    5/24/2019 11:27am    2      0         SQL Timeout
5/24/2019 11:30am    5/24/2019 11:31am    2      0         Planned Maintenance

Is there a simple way to do this in SQL Server 2012?
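
For anyone who wants to run the queries in this thread, the sample data can be recreated with something like the script below. This is only a sketch: the question does not give the column types, so the types, lengths and the primary key here are assumptions.

```sql
-- Assumed schema: the question does not state the column types.
CREATE TABLE dbo.myTable
(
    RowID     int          NOT NULL PRIMARY KEY,
    DateStamp datetime2(0) NOT NULL,
    Success   bit          NOT NULL,
    Error     varchar(50)  NOT NULL
);

-- The ten sample rows from the question.
INSERT INTO dbo.myTable (RowID, DateStamp, Success, Error)
VALUES
    (1001, '2019-05-24T11:23:00', 1, 'None'),
    (1004, '2019-05-24T11:24:00', 1, 'None'),
    (1005, '2019-05-24T11:25:00', 1, 'None'),
    (1009, '2019-05-24T11:26:00', 0, 'SQL Timeout'),
    (1018, '2019-05-24T11:27:00', 0, 'SQL Timeout'),
    (1019, '2019-05-24T11:28:00', 1, 'None'),
    (1026, '2019-05-24T11:29:00', 1, 'None'),
    (1035, '2019-05-24T11:30:00', 0, 'Planned Maintenance'),
    (1100, '2019-05-24T11:31:00', 0, 'Planned Maintenance'),
    (1111, '2019-05-24T11:32:00', 1, 'None');
```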

Another way to identify the islands is to use the LAG analytic function together with the SUM() window aggregate function:

WITH
 sentinels AS
 (
   SELECT
     DateStamp,
     Success,
     Error,
     Grp = CASE
             WHEN Success = LAG(Success) OVER (ORDER BY DateStamp ASC)
              AND Error   = LAG(Error  ) OVER (ORDER BY DateStamp ASC)
             THEN 0
             ELSE 1
           END
   FROM
     dbo.myTable
 ),
 fullyMarked AS
 (
   SELECT
     DateStamp,
     Success,
     Error,
     Grp = SUM(Grp) OVER (ORDER BY DateStamp ASC ROWS UNBOUNDED PRECEDING)
   FROM
     sentinels
 )
SELECT
 Starting  = MIN(DateStamp),
 Ending    = MAX(DateStamp),
 Polls     = COUNT(*),
 Success,
 Error
FROM
 fullyMarked
GROUP BY
 Success,
 Error,
 Grp
ORDER BY
 Starting ASC
;

The sentinels CTE marks the first row of each island with 1 and every other row with 0:

RowID  DateStamp          Success  Error                Grp
-----  -----------------  -------  -------------------  ---
1001   5/24/2019 11:23am  1        None                 1
1004   5/24/2019 11:24am  1        None                 0
1005   5/24/2019 11:25am  1        None                 0
1009   5/24/2019 11:26am  0        SQL Timeout          1
1018   5/24/2019 11:27am  0        SQL Timeout          0
1019   5/24/2019 11:28am  1        None                 1
1026   5/24/2019 11:29am  1        None                 0
1035   5/24/2019 11:30am  0        Planned Maintenance  1
1100   5/24/2019 11:31am  0        Planned Maintenance  0
1111   5/24/2019 11:32am  1        None                 1

The fullyMarked CTE takes the output of sentinels and computes a running total of the Grp column, which effectively assigns a unique ID to each island:

RowID  DateStamp          Success  Error                Grp
-----  -----------------  -------  -------------------  ---
1001   5/24/2019 11:23am  1        None                 1    -- 1
1004   5/24/2019 11:24am  1        None                 1    -- 1+0
1005   5/24/2019 11:25am  1        None                 1    -- 1+0+0
1009   5/24/2019 11:26am  0        SQL Timeout          2    -- 1+0+0+1
1018   5/24/2019 11:27am  0        SQL Timeout          2    -- 1+0+0+1+0
1019   5/24/2019 11:28am  1        None                 3    -- 1+0+0+1+0+1
1026   5/24/2019 11:29am  1        None                 3    -- 1+0+0+1+0+1+0
1035   5/24/2019 11:30am  0        Planned Maintenance  4    -- 1+0+0+1+0+1+0+1
1100   5/24/2019 11:31am  0        Planned Maintenance  4    -- 1+0+0+1+0+1+0+1+0
1111   5/24/2019 11:32am  1        None                 5    -- 1+0+0+1+0+1+0+1+0+1

The final step is to group by Success, Error, Grp and aggregate the other columns as needed, which gives you the expected result.
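
One caveat worth keeping in mind: the sample data stores the string 'None' for successful polls, so the equality test in sentinels works as written. If Error were a nullable column holding NULL instead (an assumption, not something the question states), Error = LAG(Error) would evaluate to UNKNOWN for those rows and every one of them would start its own island. A minimal sketch of a NULL-tolerant marker step follows; the names prev, PrevSuccess and PrevError are hypothetical and introduced only for this illustration.

```sql
WITH
 prev AS
 (
   SELECT
     DateStamp,
     Success,
     Error,
     -- Values from the previous row in DateStamp order (NULL for the first row)
     PrevSuccess = LAG(Success) OVER (ORDER BY DateStamp ASC),
     PrevError   = LAG(Error  ) OVER (ORDER BY DateStamp ASC)
   FROM
     dbo.myTable
 )
SELECT
 DateStamp,
 Success,
 Error,
 -- 0 = continues the previous island, 1 = starts a new one
 Grp = CASE
         WHEN Success = PrevSuccess
          AND (Error = PrevError OR (Error IS NULL AND PrevError IS NULL))
         THEN 0
         ELSE 1
       END
FROM
 prev
;
```

The first row still gets 1, because PrevSuccess is NULL there and the comparison does not evaluate to true.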

You can experiment with this solution using this live demo at db<>fiddle.uk.

Here is an example that builds the groups using ROW_NUMBER():

;WITH CTE_GROUP AS (
 SELECT 
 Success, 
 Error,
 DateStamp,
 -- Rows in the same island share the same difference between the two row numbers
 ROW_NUMBER() OVER (PARTITION BY Success, Error ORDER BY DateStamp)
     - ROW_NUMBER() OVER (ORDER BY DateStamp) as SuccesErrorDateStamp
FROM dbo.myTable
)
SELECT MIN(DateStamp) as Starting, MAX(DateStamp) as Ending, 
     Count(*) as Polls, Success, Error
FROM CTE_GROUP
GROUP BY Success, Error,SuccesErrorDateStamp
ORDER BY Starting;

This results in:

Starting                    Ending                      Polls   Success Error
2019-05-24 11:23:00.0000000 2019-05-24 11:25:00.0000000 3       1       None
2019-05-24 11:26:00.0000000 2019-05-24 11:27:00.0000000 2       0       SQL Timeout
2019-05-24 11:28:00.0000000 2019-05-24 11:29:00.0000000 2       1       None
2019-05-24 11:30:00.0000000 2019-05-24 11:31:00.0000000 2       0       Planned Maintenance
2019-05-24 11:32:00.0000000 2019-05-24 11:32:00.0000000 1       1       None
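
To see why the subtraction identifies the islands, it may help to look at the two row numbers side by side. The query below is only an illustrative sketch; the aliases rnInGroup, rnOverall and diff are hypothetical names that do not appear in the answer.

```sql
SELECT
    DateStamp,
    Success,
    Error,
    -- Position of the row among rows with the same Success/Error pair
    rnInGroup = ROW_NUMBER() OVER (PARTITION BY Success, Error ORDER BY DateStamp),
    -- Position of the row among all rows
    rnOverall = ROW_NUMBER() OVER (ORDER BY DateStamp),
    -- Constant within a run of adjacent rows that share Success/Error
    diff      = ROW_NUMBER() OVER (PARTITION BY Success, Error ORDER BY DateStamp)
              - ROW_NUMBER() OVER (ORDER BY DateStamp)
FROM dbo.myTable
ORDER BY DateStamp;
```

Within a run of adjacent rows that share the same Success and Error, both row numbers increase by one per row, so their difference stays constant. As soon as rows with a different status intervene, only rnOverall keeps advancing and the difference changes. The difference is only meaningful within one Success/Error combination, which is why the GROUP BY keeps Success and Error alongside it.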

DB<>Fiddle

Thanks to @AndriyM for pointing out that I only needed one ROW_NUMBER()

Quoted from: https://dba.stackexchange.com/questions/239002