Sql-Server

有什麼方法可以在沒有游標的情況下計算值的頻率?

  • July 20, 2017

鑑於我有一個具有增量值的列,並且我知道該值將攀升至 X(X 是任意的),我想計算最長的重複條紋是什麼而沒有最後一次計數(因為它可能會上升到他結束,我們還沒有做到)

它們由日期時間鍵控,因此排列它們以獲得正確的順序是微不足道的。排除最後出現的零之前的所有內容

例子:

0,1,2,3, 0,1,2, 0, 0,1,2,3,4, 0,1,2,3,4, 0,1,2,3,4,5, 0,1

將是“最長的重複條紋是 4 並且它連續發生了兩次。”

這最終將成為一個循環,我從集合中的最高值開始,產生 1@>=5。2@>=4 3@>=3, 3@>=2

我可以更詳細地解釋,現在只是一個概念,如果含糊不清,請原諒我,請問我需要在哪裡更詳細地說明。

因為我知道我必須使用程序循環來獲得最終值,Id est Cursor…而且,因為我真的不想嵌套它們,所以我可以忍受 5 或 10 次將最大值迭代為 0,但是不希望其中的數千個潛在迭代重複很多次……我想讓 SQL 做盡可能多的工作,但不要給它餵一些我知道會削弱它的東西結束。

所以……關於如何在基於集合的查詢中在 MSSQL 2008R2+ 中實現這一點的任何猜測,而不使用游標?

或者我應該在我領先的時候退出,把它拉進 C# 並在那裡做。

編輯:盡我所能在我的手機上輸入這個,如果我是在psudocode中用SQL來做,這就是預期的目標。

Define datetime for start and end point
Select data that is between the first zero after start, and the last zero prior to end.
Define current, last, streak, count
Loop through rows returned.
If (current != 0) { 
last = current
}
Else
{ //we have reset, how high did we get before, and is it the same as the last time we reset?
   If (last == streak) {
       //we have made another streak of same value as last
       Count +=1
(update the table keeping count that the sequence of (last) has now occurred (count) times at this point) 
   }
   Else
   {
       //we set a different max value, therefore a *different* streak
       //so we record it as the new streak to see if *it* repeats further
       Streak = last
Count = 1
   }
Last = 0
}

我想我今晚會回去用 C# 編寫一個功能範例和模式圖形,看看這是否有助於說“我將如何或什至可以在 SQL 中做同樣的事情?”

澄清:如果值是:

0,1,2,3,4, 0,1,2, 0,1,2, 0,1,2, 0,1,2,3,4, 0,1 

結果應該是兩個 4 還是三個 2?

在該範例中,將只有一個重複條紋,即 0、1、2 的連續序列,如其後返回零所示。重複是關鍵,雖然相同的序列會在整個數據中發生,但我想知道它在直接重複中發生了多少次,而不是在整個集合中發生了30,1,2****次,它連續執行了 3 次。從邏輯上講,相同的模式會在我的數據中經常重複,我關心它在某個模式中的哪個位置,然後我會尋找一致的模式,例如 0-4 3 次,然後隨機,然後返回 0-4 3 次。

如果它有幫助,數據就是振動強度的量度。通過檢測重複模式,它會暴露不直接屬於源的諧波。我想知道什麼是本質上的噪音,以及當你駕駛一輛不平衡的汽車時的回饋是什麼,它只會在特定速度之間搖晃。

編輯#2(更好的解釋和功能游標範例)鑑於此處提供的數據(實際範例)並消除了似乎是我可以處理的簡單過濾問題的輔助內容,行號與日期時間等…

樣本數據:

樣本數據

將該數據載入到範例表中

CREATE TABLE [dbo].[Table_1](
   [row] [int] NOT NULL,
   [streak] [int] NOT NULL,
   [rStreak] [varchar](100) NULL
) ON [PRIMARY]

實現計算的功能範例…

DECLARE @row INT
DECLARE @current INT
DECLARE @last INT
DECLARE @streak INT
DECLARE @streakCount INT
DECLARE @parser CURSOR

SET @streak = 0
--SET @streakCount = 1
SET @parser = CURSOR
FOR SELECT [row],
          [streak]
   FROM   [Table_1] --For the sake of brievity I am not filtering the *between* zeros and date range here, but I have that part covered.

OPEN @parser

FETCH next FROM @parser INTO @row, @current

WHILE @@FETCH_STATUS = 0
 BEGIN
     PRINT 'Row:' + Cast(@row AS VARCHAR) + ' Value:'
           + Cast(@current AS VARCHAR)

     IF @current = 0
       BEGIN
           PRINT 'Resetting streak of:'
                 + Cast(@streak AS VARCHAR)

           IF @streak = @last
              AND NOT @streak = 0
             BEGIN
                 SET @streakCount = @streakCount + 1

                 UPDATE [Table_1]
                 SET    [rStreak] = 'Concurrent Streak:'
                                    + Cast(@streakCount AS VARCHAR) + '@'
                                    + Cast(@streak AS VARCHAR)
                 WHERE  [row] = @row
             END
           ELSE
             SET @streakCount = 1

           SET @last = @streak
           SET @streak = 0
       END
     ELSE
       BEGIN
           SET @streak = @current

           PRINT 'Calculating streak length:'
                 + Cast(@streak AS VARCHAR)

           UPDATE [Table_1]
           SET    [rStreak] = 'Streak:' + Cast(@streak AS VARCHAR)
           WHERE  [row] = @row
       END

     FETCH next FROM @parser INTO @row, @current
 END

CLOSE @parser

DEALLOCATE @parser 

輸出:

Row:1 Value:0
Resetting streak of:0
Row:2 Value:1
Calculating streak length:1

(1 row(s) affected)
Row:3 Value:2
Calculating streak length:2

(1 row(s) affected)
Row:4 Value:3
Calculating streak length:3

(1 row(s) affected)
Row:5 Value:0
Resetting streak of:3
Row:6 Value:1
Calculating streak length:1

(1 row(s) affected)
Row:7 Value:2
Calculating streak length:2

(1 row(s) affected)
Row:8 Value:3
Calculating streak length:3

(1 row(s) affected)
Row:9 Value:0
Resetting streak of:3

(1 row(s) affected)
Row:10 Value:1
Calculating streak length:1

(1 row(s) affected)
Row:11 Value:2
Calculating streak length:2

(1 row(s) affected)
Row:12 Value:0
Resetting streak of:2
Row:13 Value:0
Resetting streak of:0
Row:14 Value:0
Resetting streak of:0
Row:15 Value:0
Resetting streak of:0
Row:16 Value:1
Calculating streak length:1

(1 row(s) affected)
Row:17 Value:2
Calculating streak length:2

(1 row(s) affected)
Row:18 Value:3
Calculating streak length:3

(1 row(s) affected)
Row:19 Value:0
Resetting streak of:3
Row:20 Value:1
Calculating streak length:1

(1 row(s) affected)
Row:21 Value:2
Calculating streak length:2

(1 row(s) affected)
Row:22 Value:3
Calculating streak length:3

(1 row(s) affected)
Row:23 Value:0
Resetting streak of:3

(1 row(s) affected)
Row:24 Value:1
Calculating streak length:1

(1 row(s) affected)
Row:25 Value:2
Calculating streak length:2

(1 row(s) affected)
Row:26 Value:0
Resetting streak of:2
Row:27 Value:1
Calculating streak length:1

(1 row(s) affected)
Row:28 Value:2
Calculating streak length:2

(1 row(s) affected)
Row:29 Value:0
Resetting streak of:2

(1 row(s) affected)

並產生以下數據(正確)

row streak  rStreak
1   0   NULL
2   1   Streak:1
3   2   Streak:2
4   3   Streak:3
5   0   NULL
6   1   Streak:1
7   2   Streak:2
8   3   Streak:3
9   0   Concurrent Streak:2@3
10  1   Streak:1
11  2   Streak:2
12  0   NULL
13  0   NULL
14  0   NULL
15  0   NULL
16  1   Streak:1
17  2   Streak:2
18  3   Streak:3
19  0   NULL
20  1   Streak:1
21  2   Streak:2
22  3   Streak:3
23  0   Concurrent Streak:2@3
24  1   Streak:1
25  2   Streak:2
26  0   NULL
27  1   Streak:1
28  2   Streak:2
29  0   Concurrent Streak:2@2

所以希望這可以從整體上澄清問題,我想知道這是否可以在沒有游標的情況下實現,需要我修改原始表,或者使用中間臨時表/表變數?

本質上這個模型是有效的,但它不能很好地擴展,而這在小數據集上很簡單,當我使用 RBAR 時,當我開始處理數十萬或行時,它會變得非常低效。

感謝大家一直以來的投入和耐心。:-)

假設您的所有序列都由從 0 到 X 單調地增長而沒有間隙的數字組成,您可以像這樣使用 smth:

declare @t table (id int identity, col int);
insert into @t(col) values
(0),(1),(2),(3),(0),(1),(2),(0),(0),(1),(2),(3),(4),(0),(1),(2),(3),(4),(0),(1),(2),(3),(4),(5),(0),(1);

with cte as
(
select id, row_number() over(order by id) as rn
from @t
where col = 0
)

,cte1 as
(
select c1.id as id1,
      c2.id as id2,
      c2.id - c1.id as len_
from cte as c1 
    join cte as c2
       on c1.rn + 1 = c2.rn
)

,cte2 as
(
select len_, count(*) as cnt
from cte1
group by len_
having  count(*) > 1
)

select top 1 *
from cte2
order by len_ desc;

這裡我用身份欄來模仿你的

它們由日期時間鍵控,因此排列它們以獲得正確的順序是微不足道的

所以我只計算包含在 0 之間的成員數,並假設如果計數相同,則數字也相同,然後僅過濾出現次數超過 1 的那些。

結果, len_ 是條紋的長度(也許您需要從中減去 1),而 cnt 是相應的計數。

如果您的表足夠大,當然您不應該使用 cte,而是將第一個結果集(cte)保存到 #tmp 表中

重申問題,以確保我這樣做

$$ not $$正確理解:給定一個序列序列,每個序列從零開始,每次增加一個,你想知道哪個序列出現最多的連續出現,不包括最終序列,可能不完整? 我想我有:

CREATE TABLE #Values
(
DateRecorded    DATETIMEOFFSET(0) NOT NULL PRIMARY KEY,
Value   INT NOT NULL
)
INSERT INTO #Values (DateRecorded, Value) VALUES
   ('2017-01-01', 0), ('2017-01-02', 1), ('2017-01-03', 2), ('2017-01-04', 3), ('2017-01-05', 4),
   ('2017-01-06', 0), ('2017-01-07', 1), ('2017-01-08', 2),
   ('2017-01-09', 0), ('2017-01-10', 1), ('2017-01-11', 2),
   ('2017-01-12', 0), ('2017-01-13', 1), ('2017-01-14', 2),
   ('2017-01-15', 0), ('2017-01-16', 1), ('2017-01-17', 2), ('2017-01-18', 3), ('2017-01-19', 4),
   ('2017-01-20', 0), ('2017-01-21', 1), ('2017-01-22', 2), ('2017-01-23', 3),
   ('2017-01-24', 0), ('2017-01-25', 1), ('2017-01-26', 2), ('2017-01-27', 3)

;WITH StartDates AS
   (
   SELECT DateRecorded FROM #Values WHERE Value = 0
   ),
Grps1 AS
   (
   SELECT
       St1.DateRecorded AS DateStarted, MIN(St2.DateRecorded) AS NextGroupStarted
   FROM
       StartDates AS St1
       INNER JOIN StartDates AS St2 ON St2.DateRecorded > St1.DateRecorded
   GROUP BY
       St1.DateRecorded
   ),
Grps2 AS
   (
   SELECT
       ROW_NUMBER() OVER (ORDER BY Grps1.DateStarted) AS Seq, MAX(V.Value) AS MaxValue
   FROM
       #Values AS V
       INNER JOIN Grps1 ON V.DateRecorded >= Grps1.DateStarted AND V.DateRecorded < Grps1.NextGroupStarted
   GROUP BY
       Grps1.DateStarted
   ),
Streaks AS
   (
   SELECT
       Seq, MaxValue, ROW_NUMBER() OVER (PARTITION BY MaxValue ORDER BY Seq) AS PositionInThisStreak
   FROM
       Grps2
   )
SELECT
   MaxValue, PositionInThisStreak AS NumSequencesInThisStreak
FROM
   Streaks
WHERE
   PositionInThisStreak = (SELECT MAX(PositionInThisStreak) FROM Streaks)

輸出是MaxValue "2", NumSequencesInThisStreak "3"

  • 這個邏輯期望所有序列都從零開始。
  • 此邏輯不會處理有間隙的序列。{ 0, 1, 3, 4 } 的序列和 { 0, 1, 2, 4 } 的序列將被認為是等效的:兩個組以 4 結尾。

引用自:https://dba.stackexchange.com/questions/180171