Sql-Server

在計數表中同時計數 MAX 事件(基於 datatime2)

  • March 25, 2021

你好,

我有一項任務,旨在將進入給定埠(幹線)的幾個連接匯總到時隙中併計算結果。問題是我想實現兩種類型的計數:

  • 一次計數應計算給定插槽中與 a 的連接總數Trunk(每個插槽 30 分鐘)
  • 其次,我想找出同時發生的最大連接數:例如總共有 10 個連接,但只有 3 個同時連接

表:RAW_DATA


GatewayName     StartDateTime                   DisconnectDateTime              ConnectionDuration  Trunk
GW1             2021-02-24 20:01:00.0000000     2021-02-24 20:05:30.0000000     270000              T1  -- 1 *
GW1             2021-02-24 20:05:20.6000000     2021-02-24 20:07:50.1000000     149500              T1  -- 2 *
GW1             2021-02-24 20:04:50.0000000     2021-02-24 20:08:24.0000000     214000              T1  -- 3 *
GW1             2021-02-24 20:15:50.0000000     2021-02-24 20:17:00.0000000     70000               T1  -- 0
GW1             2021-02-24 20:20:50.0000000     2021-02-24 20:21:00.0000000     10000               T1  -- 1
GW1             2021-02-24 20:20:59.0000000     2021-02-24 20:24:00.0000000     181000              T1  -- 2
GW1             2021-02-24 20:25:00.0000000     2021-02-24 20:28:30.0000000     210000              T1  -- 0
GW2             2021-02-24 20:41:49.0000000     2021-02-24 20:43:24.0000000     95000               T2
GW3             2021-02-24 22:46:54.2000000     2021-02-24 22:48:25.2000000     91000               T1
GW99            2021-02-24 22:47:25.1000000     2021-02-24 22:47:54.4000000     29300               T2

源數據表如上圖所示,我在前幾行添加了一些註釋,說明我將如何計數。所以你可以看到前三行同時出現(至少部分),然後在第四行我重新開始(這裡用 0 表示)。所以我想要保存三個表示*為“贏家”的數Simultaneous = 3

Simultaneous我試圖將它添加到儲存過程中,我已經能夠獲取計數表併計算 ‘ALL’ ,但是在計算連接時我一無所知。

CREATE OR ALTER PROCEDURE [schemma].[TrunkSummery]
@date datetime2(7),
@period int
AS
BEGIN
   -- Creat timeslotable
       with numbers(val) as 
           (select 1 union all select val + 1 from numbers where val < 48) -- 48 hence 30 * 2 = 60 min -> 24h
       select @date, nbr.val, 
          dateadd(minute, (nbr.val - 1) * 30, @date) as period_start, 
          dateadd(minute, (nbr.val    ) * 30, @date) as period_end 
       from numbers as nbr 
       order by nbr.val;
       
   -- Enummurate over timeslot table and RAW_DAtA
       with numbers(val) as 
           (select 1 union all select val + 1 from numbers where val < 48), -- 48 hence 30 * 2 = 60 min -> 24h
       periods as (
           select @date as [date], nbr.val, 
          dateadd(minute, (nbr.val - 1) * 30, @date) as period_start, 
          dateadd(minute, (nbr.val    ) * 30, @date) as period_end 
          from numbers as nbr)
       select pers.period_start, @period as Period, src.trunk, count(src.trunk) as 'all' 
       /*, count(Simultaneous) as 'sim' */ 
       --from periods as pers left  join @raw_data as src 
       from periods as pers inner join @raw_data as src 
       on src.StartDateTime >= pers.period_start and src.StartDateTime < pers.period_end
       group by src.trunk, pers.period_start, pers.period_end
       order by src.trunk;
END
GO

EXECUTE [schemma].[TrunkSummery] @date = '20210224', @period = 30;
GO

輸出表

-- Current
period_start                    Period  trunk   all
2021-02-24 20:00:00.0000000     30      T1      7
2021-02-24 22:30:00.0000000     30      T1      1
2021-02-24 20:30:00.0000000     30      T2      1
2021-02-24 22:30:00.0000000     30      T2      1

-- Desiered
period_start                    Period  trunk   all  sim
2021-02-24 20:00:00.0000000     30      T1      7    3   
2021-02-24 22:30:00.0000000     30      T1      1    1
2021-02-24 20:30:00.0000000     30      T2      1    1
2021-02-24 22:30:00.0000000     30      T2      1    1

這有可能嗎?

小提琴範例在這裡

提前感謝大家:)

更新 1

在考慮閱讀更多內容之後,也許我可以以某種方式使用“LEAD”函式來查看目前行上的“disconnectDateTime”是否小於其下方行上的“StartDateTime”,並儲存跳數直到“DisconnectDateTime”更少比“開始日期時間”。並繼續像這樣對插槽內的所有行進行操作。如果一個子句超過最後一個 MAX 計數,則使用新記錄更新它。

更新 2

我添加了一個在執行時間段排序之前執行的臨時表,該表的目的是查看下一行StartDateTime以查看它是否在目前行之前DisconnectDateTime

GatewayName StartDateTime           DisconnectDateTime      ConDur  Trunk   nrDDT                   sim
GW1         2021-02-24 20:01:00.0   2021-02-24 20:05:30.0   270000  T1      2021-02-24 20:04:50.0   2 -- OK 
GW1         2021-02-24 20:04:50.0   2021-02-24 20:08:24.0   214000  T1      2021-02-24 20:05:20.6   2 -- Bad: 3
GW1         2021-02-24 20:05:20.6   2021-02-24 20:07:50.1   149500  T1      2021-02-24 20:15:50.0   0
GW1         2021-02-24 20:15:50.0   2021-02-24 20:17:00.0   70000   T1      2021-02-24 20:20:50.0   0
GW1         2021-02-24 20:20:50.0   2021-02-24 20:21:00.0   10000   T1      2021-02-24 20:20:59.0   2 -- OK
GW1         2021-02-24 20:20:59.0   2021-02-24 20:24:00.0   181000  T1      2021-02-24 20:25:00.0   0
GW1         2021-02-24 20:25:00.0   2021-02-24 20:28:30.0   210000  T1      2021-02-24 22:46:54.2   0
GW3         2021-02-24 22:46:54.2   2021-02-24 22:48:25.2   91000   T1      NULL                    0
GW2         2021-02-24 20:41:49.0   2021-02-24 20:43:24.0   95000   T2      2021-02-24 22:47:25.1   0
GW99        2021-02-24 22:47:25.1   2021-02-24 22:47:54.4   29300   T2      NULL                    0

問題是我想增加一個計數,如果我case遇到了(sim 列),我嘗試了一個全域變數和一個局部變數,但是它正在為每一行重置,我不能強制它回到0if 我的else子句被輸入。

CREATE OR ALTER PROCEDURE GenerateTrunkSum
@date datetime2(7),
@period int
AS
BEGIN

DECLARE @raw_data table
(
GatewayName varchar(23),
StartDateTime datetime2(7),
DisconnectDateTime datetime2(7),
ConnectionDuration int ,
Trunk varchar(10)
);

INSERT INTO @raw_data values('GW1', '2021-02-24 20:01:00.0', '2021-02-24 20:05:30.0', DATEDIFF(millisecond, '2021-02-24 20:01:00.0', '2021-02-24 20:05:30.0'), 'T1')
INSERT INTO @raw_data values('GW1', '2021-02-24 20:05:20.6', '2021-02-24 20:07:50.1', DATEDIFF(millisecond, '2021-02-24 20:05:20.6', '2021-02-24 20:07:50.1'), 'T1')
INSERT INTO @raw_data values('GW1', '2021-02-24 20:04:50.0', '2021-02-24 20:08:24.0', DATEDIFF(millisecond, '2021-02-24 20:04:50.0', '2021-02-24 20:08:24.0'), 'T1')
INSERT INTO @raw_data values('GW1', '2021-02-24 20:15:50.0', '2021-02-24 20:17:00.0', DATEDIFF(millisecond, '2021-02-24 20:15:50.0', '2021-02-24 20:17:00.0'), 'T1')
INSERT INTO @raw_data values('GW1', '2021-02-24 20:20:50.0', '2021-02-24 20:21:00.0', DATEDIFF(millisecond, '2021-02-24 20:20:50.0', '2021-02-24 20:21:00.0'), 'T1')
INSERT INTO @raw_data values('GW1', '2021-02-24 20:20:59.0', '2021-02-24 20:24:00.0', DATEDIFF(millisecond, '2021-02-24 20:20:59.0', '2021-02-24 20:24:00.0'), 'T1')
INSERT INTO @raw_data values('GW1', '2021-02-24 20:25:00.0', '2021-02-24 20:28:30.0', DATEDIFF(millisecond, '2021-02-24 20:25:00.0', '2021-02-24 20:28:30.0'), 'T1')
INSERT INTO @raw_data values('GW2', '2021-02-24 20:41:49.0 ', '2021-02-24 20:43:24.0', DATEDIFF(millisecond, '2021-02-24 20:41:49.0 ', '2021-02-24 20:43:24.0'), 'T2')
INSERT INTO @raw_data values('GW3', '2021-02-24 22:46:54.2', '2021-02-24 22:48:25.2', DATEDIFF(millisecond, '2021-02-24 22:46:54.2', '2021-02-24 22:48:25.2'), 'T1')
INSERT INTO @raw_data values('GW99', '2021-02-24 22:47:25.1', '2021-02-24 22:47:54.4', DATEDIFF(millisecond, '2021-02-24 22:47:25.1', '2021-02-24 22:47:54.4'), 'T2')


declare @localvar int -- Alt 2
set @localvar = 1
  SELECT GatewayName, StartDateTime, DisconnectDateTime, ConnectionDuration, Trunk, LEAD(StartDateTime, 1, NULL) OVER ( PARTITION BY Trunk ORDER BY  StartDateTime ) as nrDDT,
      CASE 
          WHEN DATEDIFF(MILLISECOND, LEAD(StartDateTime, 1, NULL) OVER ( PARTITION BY Trunk ORDER BY  StartDateTime ), DisconnectDateTime) >= 0 THEN @localvar + 1 -- Add if Match 1 = 1 M3 = 3 etc
          ELSE 0 -- Reset @localvar = 0
      END AS sim
  INTO #Temp
  FROM @raw_data;

      select * from #Temp;
  
  -- Creat timeslotable
      with numbers(val) as 
          (select 1 union all select val + 1 from numbers where val < 48)
      select @date, nbr.val, 
          dateadd(minute, (nbr.val - 1) * 30, @date) as period_start, 
          dateadd(minute, (nbr.val    ) * 30, @date) as period_end 
      from numbers as nbr 
      order by nbr.val;

      --Enummerate
      with numbers(val) as 
          (select 1 union all select val + 1 from numbers where val < 48),
      periods as (
          select @date as [date], nbr.val, 
          dateadd(minute, (nbr.val - 1) * 30, @date) as period_start, 
          dateadd(minute, (nbr.val    ) * 30, @date) as period_end 
          from numbers as nbr)
      select pers.period_start, @period as Period, src.trunk, count(src.GatewayName) as 'all', 
          -- Case Added in update 2
          CASE
              WHEN MAX(src.sim) < 1 THEN 1 -- if max is 0 set 1, defaults to at least one active
              ELSE MAX(src.sim)
          END AS simultaneous
      --from periods as pers left  join HDO.CDR_RAW as src 
      from periods as pers inner join #Temp as src 
      on src.StartDateTime >= pers.period_start and src.StartDateTime < pers.period_end
      group by src.trunk, pers.period_start
      order by src.trunk 
END
GO

EXECUTE GenerateTrunkSum @date = '20210224', @period = 1800; 

所以我的問題是:有人知道如何讓這個計數機制起作用嗎?我想要這樣的原因是能夠MAX在我的最後一個 select 語句中執行 a (請參閱CASE

更新 3

一旦我通過@bbaird檢查了下面的目前答案,使用了更大更真實的數據集;StartDateTime&DisconnectDateTime可能會溢出到不同的時期。

更新小提琴

電流輸出

period_start            period_end              Period  Trunk   All Sim
2021-02-24 20:00:00.0   2021-02-24 20:30:00.0   30      E1      2   4
2021-02-24 20:30:00.0   2021-02-24 21:00:00.0   30      E1      23  6
2021-02-24 20:30:00.0   2021-02-24 21:00:00.0   30      E2      1   1
2021-02-24 20:00:00.0   2021-02-24 20:30:00.0   30      T1      2   4
2021-02-24 20:30:00.0   2021-02-24 21:00:00.0   30      T1      23  6
2021-02-24 20:30:00.0   2021-02-24 21:00:00.0   30      T2      1   1

如您所見,期間2021-02-24 20:00:00.0Trunk = 'E1'總計數為 2,但 Sim 為 4。我不知道為什麼會這樣。

如果我們查看 for 周圍的2021-02-24 20:00:00.0E1

-- @raw_data
1 ('GW1', '2021-02-24 20:23:43.1', '2021-02-24 20:32:32.9', ..., 'T1', 'E1' ), 
2 ('GW1', '2021-02-24 20:24:34.6', '2021-02-24 20:33:05.1', ..., 'T1', 'E1' ), 
3 ('GW1', '2021-02-24 20:31:09.5', '2021-02-24 20:32:51.4', ..., 'T1', 'E1' ),
4 ('GW1', '2021-02-24 20:32:42.3', '2021-02-24 20:33:09.3', ..., 'T1', 'E1' ), 
5 ('GW1', '2021-02-24 20:34:50.6', '2021-02-24 20:37:30.9', ..., 'T1', 'E1' ), 
6 ('GW1', '2021-02-24 20:35:34.2', '2021-02-24 20:36:25.4', ..., 'T1', 'E1' ),

查看數據,我會說第 1 行和第 2 行應該是唯一被計算的行(count(*) as 'ALL')因為sim T:20:00et 似乎基於上面顯示的輸出,但是我不確定額外的兩個Sim計數來自哪裡。基於我對輸出進行了一些分析,它可能是Connectionsummary.ConnectionCount因為,MAX(COALESCE(ConnectionSummary.ConnectionCount,0)) AS Sim 接縫ConnectionSummary即使在 30 分鐘標記後也有連續的行;我認為這需要在列舉之前分組到時隙中。

其次,當涉及到一個可靠的查詢時,它變得更加棘手20:30我假設第 1 行和第 2 行可能應該算作 sim 的一部分,因為第 3 行同時發生。如果超過 30 min-mark(00/30),是否有可能計算兩次條目?

我們可以獲得每個中繼/週期/間隔組合的活動連接數,如下所示:

WITH Numbers(val) AS
(
 SELECT
   1 
   
   UNION ALL
   
 SELECT
   val + 1 
 FROM
   numbers 
 WHERE
   val < 48
)
,ConnectionPeriod AS
(
 SELECT
   period.period_start
  ,period.period_end
  ,@period as Period
  ,src.Trunk
  ,CASE
     WHEN src.StartDateTime < period.period_start THEN period.period_start
     ELSE src.StartDateTime
   END AS StartDateTime
  ,CASE
     WHEN src.DisconnectDateTime > period.period_end THEN period.period_end
     ELSE src.DisconnectDateTime
   END AS DisconnectDateTime
 FROM
   (
     SELECT
       dateadd(minute, (val - 1) * 30, @date) as period_start
      ,dateadd(minute, (val    ) * 30, @date) as period_end 
     FROM
       numbers
   ) period 
 INNER JOIN
   (
     SELECT 
       TrunkIngress as Trunk
      ,StartDateTime
      ,DisconnectDateTime
     FROM 
       @raw_data

   UNION ALL

     SELECT 
       TrunkEgress
      ,StartDateTime
      ,DisconnectDateTime
     FROM 
       @raw_data
   ) src
     ON src.StartDateTime >= period.period_start
         AND src.StartDateTime < period.period_end
)
SELECT
 cp1.period_start
,cp1.Trunk
,cp1.StartDateTime
,cp1.DisconnectDateTime
,COUNT(*) AS ConnectionCount
FROM
 ConnectionPeriod cp1
LEFT JOIN
 ConnectionPeriod cp2
   ON cp2.Trunk = cp1.Trunk
       AND cp2.StartDateTime < cp1.DisconnectDateTime
       AND cp2.DisconnectDateTime > cp1.StartDateTime
GROUP BY
 cp1.period_start
,cp1.Trunk
,cp1.StartDateTime
,cp1.DisconnectDateTime

由於您只想計算在給定時間段內/之後開始的重複連接,因此我們首先將每個連接加入ConnectionPeriod創建這些組合。然後我們將StartDateTimeand調整DisconnectDateTime為在 each 內Period

然後將其加入倒數第二個聚合併取最大值ConnectionCount

WITH Numbers(val) AS
(
 SELECT
   1 
   
   UNION ALL
   
 SELECT
   val + 1 
 FROM
   numbers 
 WHERE
   val < 48
)
,ConnectionPeriod AS
(
 SELECT
   period.period_start
  ,period.period_end
  ,@period as Period
  ,src.Trunk
  ,CASE
     WHEN src.StartDateTime < period.period_start THEN period.period_start
     ELSE src.StartDateTime
   END AS StartDateTime
  ,CASE
     WHEN src.DisconnectDateTime > period.period_end THEN period.period_end
     ELSE src.DisconnectDateTime
   END AS DisconnectDateTime
 FROM
   (
     SELECT
       dateadd(minute, (val - 1) * 30, @date) as period_start
      ,dateadd(minute, (val    ) * 30, @date) as period_end 
     FROM
       numbers
   ) period 
 INNER JOIN
   (
     SELECT 
       TrunkIngress as Trunk
      ,StartDateTime
      ,DisconnectDateTime
     FROM 
       @raw_data

   UNION ALL

     SELECT 
       TrunkEgress
      ,StartDateTime
      ,DisconnectDateTime
     FROM 
       @raw_data
   ) src
     ON src.StartDateTime >= period.period_start
         AND src.StartDateTime < period.period_end
)
SELECT
 PeriodSummary.period_start
,PeriodSummary.Period
,PeriodSummary.Trunk
,PeriodSummary.[all]
,MAX(COALESCE(ConnectionSummary.ConnectionCount,0)) AS sim
FROM
 (
   SELECT
     period_start
    ,period_end
    ,@period as Period
    ,Trunk
    ,COUNT(*) as [all]
   FROM
     ConnectionPeriod
   GROUP BY
    Trunk
   ,period_start
   ,period_end
 ) PeriodSummary
INNER JOIN
 (
   SELECT
     cp1.period_start
    ,cp1.Trunk
    ,cp1.StartDateTime
    ,cp1.DisconnectDateTime
    ,COUNT(*) AS ConnectionCount
   FROM
     ConnectionPeriod cp1
   LEFT JOIN
     ConnectionPeriod cp2
       ON cp2.Trunk = cp1.Trunk
           AND cp2.StartDateTime < cp1.DisconnectDateTime
           AND cp2.DisconnectDateTime > cp1.StartDateTime
   GROUP BY
     cp1.period_start
    ,cp1.Trunk
    ,cp1.StartDateTime
    ,cp1.DisconnectDateTime
 ) ConnectionSummary
   ON ConnectionSummary.Trunk = PeriodSummary.Trunk
        AND ConnectionSummary.period_start =  PeriodSummary.Period_Start
GROUP BY
 PeriodSummary.period_start
,PeriodSummary.Period
,PeriodSummary.Trunk
,PeriodSummary.[all]

在這裡更新小提琴:https ://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=317652b78da1165c60b3f0c1f3055107

要考慮的另一件事是,我不確定您是否要計算在特定時間間隔內開始的連接,或者在該時間間隔內**處於活動狀態的連接。如果是後者,您需要將連接邏輯更改為:

src.StartDateTime < pers.period_end
 AND src.DisconnectDateTime > pers.period_start

ConnectionSummary.Trunk = PeriodSummary.Trunk
 AND ConnectionSummary.StartDateTime < PeriodSummary.Period_End  
 AND ConnectionSummary.DisconnectDateTime > PeriodSummary.Period_Start 

分別。

此查詢可能會在​​少量行的情況下執行良好,但由於缺乏良好的連接條件,考慮到它必須將源表連接到週期 3 次,它可能會有點麻煩。ConnectionPeriod在這種情況下,將CTE 替換為臨時表可能是有意義的。

引用自:https://dba.stackexchange.com/questions/287364