有什麼方法可以在沒有游標的情況下計算值的頻率?
鑑於我有一個具有增量值的列,並且我知道該值將攀升至 X(X 是任意的),我想計算最長的重複條紋是什麼而沒有最後一次計數(因為它可能會上升到他結束,我們還沒有做到)
它們由日期時間鍵控,因此排列它們以獲得正確的順序是微不足道的。排除最後出現的零之前的所有內容
例子:
0,1,2,3, 0,1,2, 0, 0,1,2,3,4, 0,1,2,3,4, 0,1,2,3,4,5, 0,1
將是“最長的重複條紋是 4 並且它連續發生了兩次。”
這最終將成為一個循環,我從集合中的最高值開始,產生 1@>=5。2@>=4 3@>=3, 3@>=2
我可以更詳細地解釋,現在只是一個概念,如果含糊不清,請原諒我,請問我需要在哪裡更詳細地說明。
因為我知道我必須使用程序循環來獲得最終值,Id est Cursor…而且,因為我真的不想嵌套它們,所以我可以忍受 5 或 10 次將最大值迭代為 0,但是不希望其中的數千個潛在迭代重複很多次……我想讓 SQL 做盡可能多的工作,但不要給它餵一些我知道會削弱它的東西結束。
所以……關於如何在基於集合的查詢中在 MSSQL 2008R2+ 中實現這一點的任何猜測,而不使用游標?
或者我應該在我領先的時候退出,把它拉進 C# 並在那裡做。
編輯:盡我所能在我的手機上輸入這個,如果我是在psudocode中用SQL來做的,這就是預期的目標。
Define datetime for start and end point Select data that is between the first zero after start, and the last zero prior to end. Define current, last, streak, count Loop through rows returned. If (current != 0) { last = current } Else { //we have reset, how high did we get before, and is it the same as the last time we reset? If (last == streak) { //we have made another streak of same value as last Count +=1 (update the table keeping count that the sequence of (last) has now occurred (count) times at this point) } Else { //we set a different max value, therefore a *different* streak //so we record it as the new streak to see if *it* repeats further Streak = last Count = 1 } Last = 0 }
我想我今晚會回去用 C# 編寫一個功能範例和模式圖形,看看這是否有助於說“我將如何或什至可以在 SQL 中做同樣的事情?”
澄清:如果值是:
0,1,2,3,4, 0,1,2, 0,1,2, 0,1,2, 0,1,2,3,4, 0,1
結果應該是兩個 4 還是三個 2?
在該範例中,將只有一個重複條紋,即 0、1、2 的連續序列,如其後返回零所示。重複是關鍵,雖然相同的序列會在整個數據中發生,但我想知道它在直接重複中發生了多少次,而不是在整個集合中發生了3
0,1,2
****次,它連續執行了 3 次。從邏輯上講,相同的模式會在我的數據中經常重複,我關心它在某個模式中的哪個位置,然後我會尋找一致的模式,例如 0-4 3 次,然後隨機,然後返回 0-4 3 次。如果它有幫助,數據就是振動強度的量度。通過檢測重複模式,它會暴露不直接屬於源的諧波。我想知道什麼是本質上的噪音,以及當你駕駛一輛不平衡的汽車時的回饋是什麼,它只會在特定速度之間搖晃。
編輯#2(更好的解釋和功能游標範例)鑑於此處提供的數據(實際範例)並消除了似乎是我可以處理的簡單過濾問題的輔助內容,行號與日期時間等…
樣本數據:
將該數據載入到範例表中
CREATE TABLE [dbo].[Table_1]( [row] [int] NOT NULL, [streak] [int] NOT NULL, [rStreak] [varchar](100) NULL ) ON [PRIMARY]
實現計算的功能範例…
DECLARE @row INT DECLARE @current INT DECLARE @last INT DECLARE @streak INT DECLARE @streakCount INT DECLARE @parser CURSOR SET @streak = 0 --SET @streakCount = 1 SET @parser = CURSOR FOR SELECT [row], [streak] FROM [Table_1] --For the sake of brievity I am not filtering the *between* zeros and date range here, but I have that part covered. OPEN @parser FETCH next FROM @parser INTO @row, @current WHILE @@FETCH_STATUS = 0 BEGIN PRINT 'Row:' + Cast(@row AS VARCHAR) + ' Value:' + Cast(@current AS VARCHAR) IF @current = 0 BEGIN PRINT 'Resetting streak of:' + Cast(@streak AS VARCHAR) IF @streak = @last AND NOT @streak = 0 BEGIN SET @streakCount = @streakCount + 1 UPDATE [Table_1] SET [rStreak] = 'Concurrent Streak:' + Cast(@streakCount AS VARCHAR) + '@' + Cast(@streak AS VARCHAR) WHERE [row] = @row END ELSE SET @streakCount = 1 SET @last = @streak SET @streak = 0 END ELSE BEGIN SET @streak = @current PRINT 'Calculating streak length:' + Cast(@streak AS VARCHAR) UPDATE [Table_1] SET [rStreak] = 'Streak:' + Cast(@streak AS VARCHAR) WHERE [row] = @row END FETCH next FROM @parser INTO @row, @current END CLOSE @parser DEALLOCATE @parser
輸出:
Row:1 Value:0 Resetting streak of:0 Row:2 Value:1 Calculating streak length:1 (1 row(s) affected) Row:3 Value:2 Calculating streak length:2 (1 row(s) affected) Row:4 Value:3 Calculating streak length:3 (1 row(s) affected) Row:5 Value:0 Resetting streak of:3 Row:6 Value:1 Calculating streak length:1 (1 row(s) affected) Row:7 Value:2 Calculating streak length:2 (1 row(s) affected) Row:8 Value:3 Calculating streak length:3 (1 row(s) affected) Row:9 Value:0 Resetting streak of:3 (1 row(s) affected) Row:10 Value:1 Calculating streak length:1 (1 row(s) affected) Row:11 Value:2 Calculating streak length:2 (1 row(s) affected) Row:12 Value:0 Resetting streak of:2 Row:13 Value:0 Resetting streak of:0 Row:14 Value:0 Resetting streak of:0 Row:15 Value:0 Resetting streak of:0 Row:16 Value:1 Calculating streak length:1 (1 row(s) affected) Row:17 Value:2 Calculating streak length:2 (1 row(s) affected) Row:18 Value:3 Calculating streak length:3 (1 row(s) affected) Row:19 Value:0 Resetting streak of:3 Row:20 Value:1 Calculating streak length:1 (1 row(s) affected) Row:21 Value:2 Calculating streak length:2 (1 row(s) affected) Row:22 Value:3 Calculating streak length:3 (1 row(s) affected) Row:23 Value:0 Resetting streak of:3 (1 row(s) affected) Row:24 Value:1 Calculating streak length:1 (1 row(s) affected) Row:25 Value:2 Calculating streak length:2 (1 row(s) affected) Row:26 Value:0 Resetting streak of:2 Row:27 Value:1 Calculating streak length:1 (1 row(s) affected) Row:28 Value:2 Calculating streak length:2 (1 row(s) affected) Row:29 Value:0 Resetting streak of:2 (1 row(s) affected)
並產生以下數據(正確)
row streak rStreak 1 0 NULL 2 1 Streak:1 3 2 Streak:2 4 3 Streak:3 5 0 NULL 6 1 Streak:1 7 2 Streak:2 8 3 Streak:3 9 0 Concurrent Streak:2@3 10 1 Streak:1 11 2 Streak:2 12 0 NULL 13 0 NULL 14 0 NULL 15 0 NULL 16 1 Streak:1 17 2 Streak:2 18 3 Streak:3 19 0 NULL 20 1 Streak:1 21 2 Streak:2 22 3 Streak:3 23 0 Concurrent Streak:2@3 24 1 Streak:1 25 2 Streak:2 26 0 NULL 27 1 Streak:1 28 2 Streak:2 29 0 Concurrent Streak:2@2
所以希望這可以從整體上澄清問題,我想知道這是否可以在沒有游標的情況下實現,需要我修改原始表,或者使用中間臨時表/表變數?
本質上這個模型是有效的,但它不能很好地擴展,而這在小數據集上很簡單,當我使用 RBAR 時,當我開始處理數十萬或行時,它會變得非常低效。
感謝大家一直以來的投入和耐心。:-)
假設您的所有序列都由從 0 到 X 單調地增長而沒有間隙的數字組成,您可以像這樣使用 smth:
declare @t table (id int identity, col int); insert into @t(col) values (0),(1),(2),(3),(0),(1),(2),(0),(0),(1),(2),(3),(4),(0),(1),(2),(3),(4),(0),(1),(2),(3),(4),(5),(0),(1); with cte as ( select id, row_number() over(order by id) as rn from @t where col = 0 ) ,cte1 as ( select c1.id as id1, c2.id as id2, c2.id - c1.id as len_ from cte as c1 join cte as c2 on c1.rn + 1 = c2.rn ) ,cte2 as ( select len_, count(*) as cnt from cte1 group by len_ having count(*) > 1 ) select top 1 * from cte2 order by len_ desc;
這裡我用身份欄來模仿你的
它們由日期時間鍵控,因此排列它們以獲得正確的順序是微不足道的
所以我只計算包含在 0 之間的成員數,並假設如果計數相同,則數字也相同,然後僅過濾出現次數超過 1 的那些。
結果, len_ 是條紋的長度(也許您需要從中減去 1),而 cnt 是相應的計數。
如果您的表足夠大,當然您不應該使用 cte,而是將第一個結果集(cte)保存到 #tmp 表中
重申問題,以確保我這樣做
$$ not $$正確理解:給定一個序列序列,每個序列從零開始,每次增加一個,你想知道哪個序列出現最多的連續出現,不包括最終序列,可能不完整? 我想我有:
CREATE TABLE #Values ( DateRecorded DATETIMEOFFSET(0) NOT NULL PRIMARY KEY, Value INT NOT NULL ) INSERT INTO #Values (DateRecorded, Value) VALUES ('2017-01-01', 0), ('2017-01-02', 1), ('2017-01-03', 2), ('2017-01-04', 3), ('2017-01-05', 4), ('2017-01-06', 0), ('2017-01-07', 1), ('2017-01-08', 2), ('2017-01-09', 0), ('2017-01-10', 1), ('2017-01-11', 2), ('2017-01-12', 0), ('2017-01-13', 1), ('2017-01-14', 2), ('2017-01-15', 0), ('2017-01-16', 1), ('2017-01-17', 2), ('2017-01-18', 3), ('2017-01-19', 4), ('2017-01-20', 0), ('2017-01-21', 1), ('2017-01-22', 2), ('2017-01-23', 3), ('2017-01-24', 0), ('2017-01-25', 1), ('2017-01-26', 2), ('2017-01-27', 3) ;WITH StartDates AS ( SELECT DateRecorded FROM #Values WHERE Value = 0 ), Grps1 AS ( SELECT St1.DateRecorded AS DateStarted, MIN(St2.DateRecorded) AS NextGroupStarted FROM StartDates AS St1 INNER JOIN StartDates AS St2 ON St2.DateRecorded > St1.DateRecorded GROUP BY St1.DateRecorded ), Grps2 AS ( SELECT ROW_NUMBER() OVER (ORDER BY Grps1.DateStarted) AS Seq, MAX(V.Value) AS MaxValue FROM #Values AS V INNER JOIN Grps1 ON V.DateRecorded >= Grps1.DateStarted AND V.DateRecorded < Grps1.NextGroupStarted GROUP BY Grps1.DateStarted ), Streaks AS ( SELECT Seq, MaxValue, ROW_NUMBER() OVER (PARTITION BY MaxValue ORDER BY Seq) AS PositionInThisStreak FROM Grps2 ) SELECT MaxValue, PositionInThisStreak AS NumSequencesInThisStreak FROM Streaks WHERE PositionInThisStreak = (SELECT MAX(PositionInThisStreak) FROM Streaks)
輸出是
MaxValue "2", NumSequencesInThisStreak "3"
。
- 這個邏輯期望所有序列都從零開始。
- 此邏輯不會處理有間隙的序列。{ 0, 1, 3, 4 } 的序列和 { 0, 1, 2, 4 } 的序列將被認為是等效的:兩個組以 4 結尾。