Sql-Server

Get rows grouped by a foreign key with a count of consecutive values

  • February 14, 2022

I have a SQL Server database with a transactions table that has client_id, date, and is_cancelled columns.

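I am trying to get the client_ids that have 3 or more transactions in a row marked is_cancelled, along with the in_a_row count. So far I have the following, which sets is_same to 1 when consecutive rows have the same is_cancelled value and produces a running total of cancelled transactions (which is not what I need).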

SELECT 
client_id,
date,
is_same,
SUM(is_same) OVER (PARTITION BY client_id ORDER BY date) AS sum_same,
is_cancelled
FROM
(
   SELECT
   client_id,
   LAG(is_cancelled) OVER (PARTITION BY client_id ORDER BY date) AS previous_cancelled,
   CASE 
       WHEN is_cancelled = LAG(is_cancelled) OVER (PARTITION BY client_id ORDER BY date)
       THEN 1
       ELSE 0
   END as is_same,
   date,
   is_cancelled
   FROM transactions
   WHERE deleted_at IS NULL -- Ignore soft-deleted rows
) AS t_01
WHERE previous_cancelled = 1
ORDER BY date

Fiddle with sample data: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=a0c9b12203ab2d0c83f73604ccc9d0a0

Expected data (client_id, count):

1, 3
3, 6

This is a type of gaps-and-islands problem.

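The key to most solutions of this type is to count changes, so you need an is_different column rather than is_same. You then do a conditional running count of that column (it is easier if you use NULL rather than 0, because COUNT skips NULLs) to create an ID for each group of rows.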
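As a minimal sketch of that idea, the query below runs the two steps over a small inline data set. The sample rows and the CTE names sample_rows and marked are made up for illustration and are not taken from the fiddle.

```sql
-- Hypothetical sample data: one client, six transactions in date order
WITH sample_rows AS (
    SELECT * FROM (VALUES
        (1, CAST('2022-01-01' AS date), 1),
        (1, CAST('2022-01-02' AS date), 1),
        (1, CAST('2022-01-03' AS date), 0),
        (1, CAST('2022-01-04' AS date), 1),
        (1, CAST('2022-01-05' AS date), 1),
        (1, CAST('2022-01-06' AS date), 1)
    ) AS v (client_id, date, is_cancelled)
),
marked AS (
    SELECT
        client_id,
        date,
        is_cancelled,
        -- 1 when the value changes (or on the first row, where LAG returns NULL), otherwise NULL
        CASE WHEN is_cancelled = LAG(is_cancelled) OVER (PARTITION BY client_id ORDER BY date)
             THEN NULL ELSE 1 END AS is_different
    FROM sample_rows
)
SELECT
    client_id,
    date,
    is_cancelled,
    is_different,
    -- COUNT skips NULLs, so the running count only increases where a new run starts
    COUNT(is_different) OVER (PARTITION BY client_id ORDER BY date ROWS UNBOUNDED PRECEDING) AS group_id
FROM marked
ORDER BY client_id, date;
-- group_id for the six rows, in date order: 1, 1, 2, 3, 3, 3
```

Rows that share a group_id within a client belong to the same run of identical is_cancelled values, so grouping by client_id and group_id gives one row per run.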

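It is not clear exactly what final result you want, but by grouping on that result you can get the maximum and minimum number of consecutive rows, as well as the count of actual row groups, for each client_id.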

WITH PrevValues AS (
   SELECT
     client_id,
     LAG(is_cancelled) OVER (PARTITION BY client_id ORDER BY date) AS previous_cancelled,
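     -- NULL when the value matches the previous row; 1 when it changes (or on the first row, where LAG returns NULL)
     CASE WHEN is_cancelled = LAG(is_cancelled) OVER (PARTITION BY client_id ORDER BY date)
       THEN NULL ELSE 1 END
       as is_different,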
     date,
     is_cancelled
   FROM transactions
),
Grouped AS (
   SELECT 
     client_id,
     date,
     is_different,
     -- COUNT ignores NULLs, so this running count increases only where a new run starts
     COUNT(is_different) OVER (PARTITION BY client_id ORDER BY date ROWS UNBOUNDED PRECEDING) AS group_id
   FROM PrevValues
),
ByGroups AS (
   SELECT
     client_id,
     COUNT(*) as in_a_row
   FROM Grouped
   GROUP BY
     client_id,
     group_id
   HAVING COUNT(*) >= 3
)
SELECT
 client_id,
 MAX(in_a_row) as max_in_a_row,
 MIN(in_a_row) as min_in_a_row,
 COUNT(*) as num_groups
FROM ByGroups
GROUP BY
 client_id;

db<>fiddle
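The query above counts runs of either is_cancelled value. If only runs of cancelled rows are wanted, one per row in the (client_id, count) shape the question asks for, a possible variant is sketched below. It assumes is_cancelled = 1 marks a cancelled row, and the inline sample rows and lowercase CTE names are hypothetical.

```sql
-- Hypothetical sample data: client 1 has three cancelled rows in a row,
-- client 2 has three NON-cancelled rows in a row
WITH sample_rows AS (
    SELECT * FROM (VALUES
        (1, CAST('2022-01-01' AS date), 1),
        (1, CAST('2022-01-02' AS date), 1),
        (1, CAST('2022-01-03' AS date), 1),
        (1, CAST('2022-01-04' AS date), 0),
        (2, CAST('2022-01-01' AS date), 1),
        (2, CAST('2022-01-02' AS date), 0),
        (2, CAST('2022-01-03' AS date), 0),
        (2, CAST('2022-01-04' AS date), 0)
    ) AS v (client_id, date, is_cancelled)
),
marked AS (
    SELECT
        client_id,
        date,
        is_cancelled,
        CASE WHEN is_cancelled = LAG(is_cancelled) OVER (PARTITION BY client_id ORDER BY date)
             THEN NULL ELSE 1 END AS is_different
    FROM sample_rows
),
grouped AS (
    SELECT
        client_id,
        is_cancelled,  -- carried through so runs can be filtered by value
        COUNT(is_different) OVER (PARTITION BY client_id ORDER BY date ROWS UNBOUNDED PRECEDING) AS group_id
    FROM marked
)
SELECT
    client_id,
    COUNT(*) AS in_a_row
FROM grouped
WHERE is_cancelled = 1           -- keep only runs of cancelled rows
GROUP BY client_id, group_id
HAVING COUNT(*) >= 3
ORDER BY client_id;
-- Returns one row for this sample: client_id = 1, in_a_row = 3
```

Client 2 is excluded here even though it has a run of three rows, because that run is not cancelled.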

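Note that your sample data has rows with identical dates, which is one of the reasons you should always use ROWS UNBOUNDED PRECEDING (the default for an ordered window function is RANGE UNBOUNDED PRECEDING, which is subtly different: it treats rows with equal ORDER BY values as peers and includes all of them in the frame). In any case, you should always try to use a deterministic ordering.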
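The difference can be seen in a small sketch with tied dates. The id column here is a hypothetical unique tiebreaker; the question does not say whether the transactions table has one.

```sql
-- Hypothetical data: ids 2 and 3 share the same date
SELECT
    id,
    date,
    -- Default frame (RANGE UNBOUNDED PRECEDING): tied rows are peers, so both get the same count
    COUNT(*) OVER (ORDER BY date) AS default_range_count,
    -- ROWS frame: counts physical rows, but the order of the tied rows is not guaranteed
    COUNT(*) OVER (ORDER BY date ROWS UNBOUNDED PRECEDING) AS rows_count,
    -- ROWS frame plus a unique tiebreaker: deterministic
    COUNT(*) OVER (ORDER BY date, id ROWS UNBOUNDED PRECEDING) AS deterministic_count
FROM (VALUES
    (1, CAST('2022-01-01' AS date)),
    (2, CAST('2022-01-02' AS date)),
    (3, CAST('2022-01-02' AS date)),
    (4, CAST('2022-01-03' AS date))
) AS v (id, date)
ORDER BY date, id;
-- default_range_count: 1, 3, 3, 4
-- deterministic_count: 1, 2, 3, 4
```

With only date in the ORDER BY, rows_count can assign 2 and 3 to the tied rows in either order, which is why adding a unique column to the ordering is worth doing wherever one is available.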

Quoted from: https://dba.stackexchange.com/questions/307440