Sql-Server
Get rows grouped by a foreign key, with a count of consecutive values
I have a SQL Server database with a `transactions` table that has `client_id`, `date`, and `is_cancelled` columns.
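I'm trying to get the `client_id`s that have 3 or more transactions in a row marked `is_cancelled`, together with the in_a_row count. So far I have the following, which sets `is_same` to 1 when consecutive `is_cancelled` values match and gives a running total of cancelled transactions (which is not what I need):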
    SELECT
        client_id,
        date,
        is_same,
        SUM(is_same) OVER (PARTITION BY client_id ORDER BY date) AS sum_same,
        is_cancelled
    FROM (
        SELECT
            client_id,
            LAG(is_cancelled) OVER (PARTITION BY client_id ORDER BY date) AS previous_cancelled,
            CASE
                WHEN is_cancelled = LAG(is_cancelled) OVER (PARTITION BY client_id ORDER BY date)
                THEN 1 ELSE 0
            END AS is_same,
            date,
            is_cancelled
        FROM transactions
        WHERE deleted_at IS NULL -- Ignore soft-deleted rows
    ) AS t_01
    WHERE previous_cancelled = 1
    ORDER BY date
Fiddle with sample data: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=a0c9b12203ab2d0c83f73604ccc9d0a0
Expected data (client_id, count):
1, 3
3, 6
This is a type of gaps-and-islands problem.
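The key to most solutions of this type is to count the changes, so you need an `is_different` column rather than `is_same`. You then conditionally count that column (which is easier if you use `NULL` instead of `0`, because `COUNT` skips `NULL`s) to create an ID for each group of rows. It's not clear exactly what final result you want, but by grouping on that ID you can get the maximum and minimum number of consecutive rows, as well as the number of actual row groups, per `client_id`:

    WITH PrevValues AS (
        SELECT
            client_id,
            LAG(is_cancelled) OVER (PARTITION BY client_id ORDER BY date) AS previous_cancelled,
            -- 1 where the value changes (or on the first row), NULL where it repeats
            CASE
                WHEN is_cancelled = LAG(is_cancelled) OVER (PARTITION BY client_id ORDER BY date)
                THEN NULL ELSE 1
            END AS is_different,
            date,
            is_cancelled
        FROM transactions
    ),
    Grouped AS (
        SELECT
            client_id,
            date,
            is_different,
            -- Running count of changes gives each run of equal values its own ID
            COUNT(is_different) OVER (PARTITION BY client_id ORDER BY date ROWS UNBOUNDED PRECEDING) AS group_id
        FROM PrevValues
    ),
    ByGroups AS (
        SELECT
            client_id,
            COUNT(*) AS in_a_row
        FROM Grouped
        GROUP BY client_id, group_id
        HAVING COUNT(*) >= 3
    )
    SELECT
        client_id,
        MAX(in_a_row) AS max_in_a_row,
        MIN(in_a_row) AS min_in_a_row,
        COUNT(*) AS num_groups
    FROM ByGroups
    GROUP BY client_id;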
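The query above counts runs of any repeated `is_cancelled` value, cancelled or not. If only runs of cancelled rows are wanted, one possible variation is to carry `is_cancelled` through to the grouping step and filter on it there. The following is a minimal, self-contained sketch of that idea. The `sample` CTE and its rows are made up for illustration (they are not the fiddle data), and reporting `MAX(in_a_row)` as the count is an assumption about what the question expects:

```sql
-- Hypothetical sample rows standing in for the transactions table
WITH sample AS (
    SELECT *
    FROM (VALUES
        (1, CAST('2022-01-01' AS date), 0),
        (1, CAST('2022-01-02' AS date), 1),
        (1, CAST('2022-01-03' AS date), 1),
        (1, CAST('2022-01-04' AS date), 1),
        (1, CAST('2022-01-05' AS date), 0),
        (2, CAST('2022-01-01' AS date), 1),
        (2, CAST('2022-01-02' AS date), 1),
        (2, CAST('2022-01-03' AS date), 0),
        (2, CAST('2022-01-04' AS date), 1)
    ) AS v (client_id, date, is_cancelled)
),
PrevValues AS (
    SELECT
        client_id,
        date,
        is_cancelled,
        -- 1 on the first row and on every change, NULL when the value repeats
        CASE
            WHEN is_cancelled = LAG(is_cancelled) OVER (PARTITION BY client_id ORDER BY date)
            THEN NULL ELSE 1
        END AS is_different
    FROM sample
),
Grouped AS (
    SELECT
        client_id,
        is_cancelled,
        COUNT(is_different) OVER (PARTITION BY client_id ORDER BY date ROWS UNBOUNDED PRECEDING) AS group_id
    FROM PrevValues
),
ByGroups AS (
    SELECT
        client_id,
        COUNT(*) AS in_a_row
    FROM Grouped
    WHERE is_cancelled = 1          -- keep only runs of cancelled rows
    GROUP BY client_id, group_id
    HAVING COUNT(*) >= 3
)
SELECT
    client_id,
    MAX(in_a_row) AS in_a_row       -- longest cancelled run per client
FROM ByGroups
GROUP BY client_id;
```

With these rows, client 1 has one run of three cancelled transactions and is returned as (1, 3). Client 2's cancelled runs have lengths 2 and 1, so it is filtered out. Against the real table, `FROM sample` would become `FROM transactions`, presumably with the `deleted_at IS NULL` filter from the question added back.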
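Note that your sample data has rows with the same date, which is one reason you should always use `ROWS UNBOUNDED PRECEDING` (the default for ordered window functions, `RANGE UNBOUNDED PRECEDING`, is subtly different: it treats rows that tie on the `ORDER BY` value as peers and includes all of them in the frame). In any case, you should always try to use a deterministic ordering.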
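To make the difference concrete, here is a small sketch on made-up rows where two rows share a date. The `id` column is hypothetical and stands in for whatever unique column the real table has that could serve as a tie-breaker:

```sql
SELECT
    id,
    client_id,
    date,
    -- Default frame for an ordered window: tied dates are peers and are counted together
    COUNT(flag) OVER (PARTITION BY client_id ORDER BY date
                      RANGE UNBOUNDED PRECEDING) AS range_count,
    -- Row-based frame: counts one row at a time, but the order within a tie is arbitrary
    COUNT(flag) OVER (PARTITION BY client_id ORDER BY date
                      ROWS UNBOUNDED PRECEDING) AS rows_count,
    -- Row-based frame with a tie-breaker: deterministic
    COUNT(flag) OVER (PARTITION BY client_id ORDER BY date, id
                      ROWS UNBOUNDED PRECEDING) AS rows_count_tiebreak
FROM (VALUES
    (10, 1, CAST('2022-01-01' AS date), 1),
    (11, 1, CAST('2022-01-02' AS date), 1),
    (12, 1, CAST('2022-01-02' AS date), 1),
    (13, 1, CAST('2022-01-03' AS date), 1)
) AS v (id, client_id, date, flag)
ORDER BY date, id;
```

Here `range_count` comes out as 1, 3, 3, 4, because both rows dated 2022-01-02 see each other in the frame. `rows_count` takes the values 1 to 4, but which of the two tied rows gets 2 and which gets 3 is not guaranteed. `rows_count_tiebreak` is 1, 2, 3, 4 in `id` order on every run.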