MariaDB
How to aggregate related rows without a group key?
I am using MariaDB 10.6 and have a table:
CREATE TABLE velocities (
    `state` int(8) NOT NULL,
    `timestamp` datetime NOT NULL,
    `velocity` decimal(5,2) NOT NULL,
    `name` varchar(15) NOT NULL
) DEFAULT CHARSET=utf8;
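Each row describes the velocity of an object at a particular timestamp. Multiple rows together describe a period. The state column can be -1, -2, -3, or any number greater than 0. If state is -1, the row is the start of a period; if it is -2, we are in the middle of a period; and if it is -3, it is the last timestamp of a period. If state is a number greater than 0, the period has been assigned an id. The table contains multiple periods. They may overlap between different names, but never within the same name. Now I want to write a query that returns one row per period with start_time, end_time, max_velocity_time, velocity_at_start, velocity_at_end, and max_velocity.
Sample data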
INSERT INTO velocities (state, timestamp, velocity, name) VALUES
    (-2, '2021-01-01 00:00:01', 2, 'FOO'),
    (-2, '2021-01-01 00:00:02', 3, 'FOO'),
    (-2, '2021-01-01 00:00:03', 2, 'FOO'),
    (-1, '2021-01-01 00:00:00', 3, 'BAZ'),
    (-2, '2021-01-01 00:00:01', 4, 'BAZ'),
    (-2, '2021-01-01 00:00:02', 5, 'BAZ'),
    (-2, '2021-01-01 00:00:03', 6, 'BAZ'),
    (-3, '2021-01-01 00:00:04', 2, 'BAZ'),
    (-1, '2021-01-01 00:00:02', 4, 'BAR'),
    (-2, '2021-01-01 00:00:03', 7, 'BAR'),
    (-2, '2021-01-01 00:00:04', 8, 'BAR'),
    (-2, '2021-01-01 00:00:05', 10, 'BAR'),
    (-3, '2021-01-01 00:00:06', 2, 'BAR'),
    (42, '2021-01-01 00:00:07', 5, 'BAR'),
    (42, '2021-01-01 00:00:08', 7, 'BAR'),
    (42, '2021-01-01 00:00:09', 10, 'BAR'),
    (42, '2021-01-01 00:00:10', 17, 'BAR'),
    (42, '2021-01-01 00:00:11', 2, 'BAR');
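Given this data, I would expect the result to look like this:
First attempt
My first attempt took inspiration from https://stackoverflow.com/questions/1136597/group-by-for-continuous-rows-in-sql. I got it working in MySQL 8.0, but it does not work in MariaDB 10.6, which is where I need it. The query I ended up with is: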
WITH cte AS (
    SELECT
        @r := @r + (
            CASE
                WHEN @state > 0 THEN @state != v.state
                WHEN @state - v.state < 0 THEN 1
                ELSE 0
            END
        ) AS id,
        @state := state AS _,
        v.name,
        v.timestamp,
        v.velocity,
        (CASE WHEN v.state > 0 THEN v.state ELSE NULL END) AS booked_id
    FROM (SELECT @r := 0, @state := 0) vars, velocities v
    ORDER BY v.name, v.timestamp
),
inner_max_velocity_cte AS (
    SELECT id, MAX(velocity) AS velocity FROM cte GROUP BY id
),
max_velocity_cte AS (
    SELECT cte.id, cte.timestamp, cte.velocity
    FROM cte
    INNER JOIN inner_max_velocity_cte x
        ON cte.id = x.id AND cte.velocity = x.velocity
),
inner_end_time_cte AS (
    SELECT id, MAX(timestamp) AS timestamp FROM cte GROUP BY id
),
end_time_cte AS (
    SELECT cte.id, cte.timestamp, cte.velocity
    FROM cte
    INNER JOIN inner_end_time_cte x
        ON cte.id = x.id AND cte.timestamp = x.timestamp
)
SELECT
    cte.booked_id,
    cte.name,
    cte.timestamp AS start_time,
    end_time_cte.timestamp AS end_time,
    max_velocity_cte.timestamp AS max_velocity_time,
    cte.velocity AS start_time_velocity,
    end_time_cte.velocity AS end_time_velocity,
    max_velocity_cte.velocity AS max_velocity
FROM cte
LEFT OUTER JOIN max_velocity_cte ON max_velocity_cte.id = cte.id
LEFT OUTER JOIN end_time_cte ON end_time_cte.id = cte.id
GROUP BY cte.id
ORDER BY start_time
It has been suggested that I could do this with window functions, but I can't figure out how to make that work.
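This is a classic gaps-and-islands problem.
There are a number of solutions. The standard approach is to define the start or end point of each section, then use a windowed conditional COUNT to number the islands. After that, you simply group by that island number. There are some added complications here:
- We need to check whether the previous row was negative while the current one is not.
- You have an island that does not begin with -1, so we also need to check for a NULL previous value.
- We need more window functions to get the final values per group. To avoid extra sorts, I have avoided multiple row numbers, so we use LEAD and LAG for this.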
WITH PrevValues AS (
    -- Previous state within each name, ordered by time
    SELECT *,
        LAG(state) OVER (PARTITION BY name ORDER BY timestamp) AS prevValue
    FROM velocities
),
Groupings AS (
    -- A running count of island starts numbers the islands per name
    SELECT *,
        COUNT(CASE WHEN state = -1
                     OR prevValue IS NULL
                     OR (prevValue < 0 AND state >= 0)
                   THEN 1 END)
            OVER (PARTITION BY name ORDER BY timestamp ROWS UNBOUNDED PRECEDING) AS GroupId
    FROM PrevValues
),
PerGroup AS (
    -- Neighbouring group ids mark the first and last row of each island
    SELECT *,
        IFNULL(LAG(GroupId) OVER (PARTITION BY name ORDER BY timestamp), -1) AS prevGroup,
        IFNULL(LEAD(GroupId) OVER (PARTITION BY name ORDER BY timestamp), -1) AS nextGroup,
        ROW_NUMBER() OVER (PARTITION BY name, GroupId ORDER BY velocity DESC) AS rnVelocity
    FROM Groupings
)
SELECT
    -- MAX() makes booked_id a proper aggregate (needed if ONLY_FULL_GROUP_BY is enabled);
    -- it assumes a single id per island
    MAX(CASE WHEN state > 0 THEN state END) AS booked_id,
    name,
    MIN(timestamp) AS start_time,
    MAX(timestamp) AS end_time,
    MIN(CASE WHEN rnVelocity = 1 THEN timestamp END) AS max_velocity_time,
    MIN(CASE WHEN prevGroup <> GroupId THEN velocity END) AS start_time_velocity,
    MIN(CASE WHEN nextGroup <> GroupId THEN velocity END) AS end_time_velocity,
    MAX(velocity) AS max_velocity
FROM PerGroup
GROUP BY name, GroupId;
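To see how the island numbering behaves before any aggregation, it can help to run just the first two steps on their own. The following is a minimal sketch, assuming the velocities table and sample rows from the question. The Flagged CTE and the is_start column are hypothetical names introduced here for illustration, and SUM over a 0/1 flag is used in place of the conditional COUNT (the two should give the same numbering).

```sql
-- Sketch: inspect the island numbering only, with no aggregation.
-- Assumes the velocities table and sample rows from the question.
-- Flagged and is_start are illustrative names, not part of the original query.
WITH PrevValues AS (
    SELECT
        name, `timestamp`, state, velocity,
        LAG(state) OVER (PARTITION BY name ORDER BY `timestamp`) AS prevValue
    FROM velocities
),
Flagged AS (
    SELECT
        *,
        CASE
            WHEN state = -1                      -- explicit start marker
              OR prevValue IS NULL               -- first row for this name
              OR (prevValue < 0 AND state >= 0)  -- unbooked rows followed by a booked period
            THEN 1 ELSE 0
        END AS is_start
    FROM PrevValues
)
SELECT
    name, `timestamp`, state, is_start,
    -- Running total of start flags: every row of an island gets the same number
    SUM(is_start) OVER (PARTITION BY name ORDER BY `timestamp`
                        ROWS UNBOUNDED PRECEDING) AS GroupId
FROM Flagged
ORDER BY name, `timestamp`;
```

On the sample data, all FOO and BAZ rows should end up with GroupId 1, while the BAR rows get 1 for the first five rows and 2 for the five rows with state 42. Note that these conditions do not start a new island when one booked id is directly followed by a different booked id; if that can occur in your data, an extra condition comparing state with prevValue would probably be needed.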