Mysql
時間序列數據 - 每日聚合查詢的問題
我正在嘗試使用每小時時間序列數據查詢 MySQL 數據庫,但我堅持使用(每日)聚合。
我的原始數據:
mysql> SELECT * FROM emoncms.custom_registers -> WHERE timestamp >= '2016-08-24' AND timestamp < '2016-08-25'; +-----+------+---------------------+----------------+ | id | node | timestamp | register_value | +-----+------+---------------------+----------------+ | 299 | 10 | 2016-08-24 00:00:03 | 869.458 | | 300 | 10 | 2016-08-24 01:00:00 | 869.825 | | 301 | 10 | 2016-08-24 02:00:03 | 870.202 | | 302 | 10 | 2016-08-24 03:00:01 | 870.616 | | 303 | 10 | 2016-08-24 04:00:05 | 870.977 | | 304 | 10 | 2016-08-24 05:00:03 | 871.376 | | 305 | 10 | 2016-08-24 06:00:01 | 871.759 | | 306 | 10 | 2016-08-24 07:00:00 | 872.156 | | 307 | 10 | 2016-08-24 08:00:04 | 872.629 | | 308 | 10 | 2016-08-24 09:00:02 | 873.038 | | 309 | 10 | 2016-08-24 10:00:01 | 873.512 | | 310 | 10 | 2016-08-24 11:00:01 | 874.009 | | 311 | 10 | 2016-08-24 12:00:01 | 874.429 | | 312 | 10 | 2016-08-24 13:00:00 | 874.894 | | 313 | 10 | 2016-08-24 14:00:04 | 875.331 | | 314 | 10 | 2016-08-24 15:00:04 | 876.715 | | 315 | 10 | 2016-08-24 16:00:04 | 877.249 | | 316 | 10 | 2016-08-24 17:00:04 | 878.672 | | 317 | 10 | 2016-08-24 18:00:04 | 879.19 | | 318 | 10 | 2016-08-24 19:00:04 | 879.603 | | 319 | 10 | 2016-08-24 20:00:04 | 880.027 | | 320 | 10 | 2016-08-24 21:00:04 | 880.386 | | 321 | 10 | 2016-08-24 22:00:05 | 880.787 | | 322 | 10 | 2016-08-24 23:00:00 | 881.218 | +-----+------+---------------------+----------------+ 24 rows in set (0.00 sec)
查詢區間差異時:
mysql> SELECT t2.node, -> ROUND(MIN(t2.register_value - t1.register_value),3) AS register_diff, -> t1.timestamp AS start_interval, t2.timestamp AS stop_interval -> FROM custom_registers AS t1 INNER JOIN custom_registers AS t2 -> on t1.timestamp < t2.timestamp WHERE t1.node = t2.node -> AND t1.timestamp >= '2016-08-24 00:00:00' -> AND t1.timestamp < '2016-08-25 00:00:00' -> GROUP BY t1.id ORDER BY t1.timestamp; +------+---------------+---------------------+---------------------+ | node | register_diff | start_interval | stop_interval | +------+---------------+---------------------+---------------------+ | 10 | 0.367 | 2016-08-24 00:00:03 | 2016-08-24 01:00:00 | | 10 | 0.377 | 2016-08-24 01:00:00 | 2016-08-24 02:00:03 | | 10 | 0.414 | 2016-08-24 02:00:03 | 2016-08-24 03:00:01 | | 10 | 0.361 | 2016-08-24 03:00:01 | 2016-08-24 04:00:05 | | 10 | 0.399 | 2016-08-24 04:00:05 | 2016-08-24 05:00:03 | | 10 | 0.383 | 2016-08-24 05:00:03 | 2016-08-24 06:00:01 | | 10 | 0.397 | 2016-08-24 06:00:01 | 2016-08-24 07:00:00 | | 10 | 0.473 | 2016-08-24 07:00:00 | 2016-08-24 08:00:04 | | 10 | 0.409 | 2016-08-24 08:00:04 | 2016-08-24 09:00:02 | | 10 | 0.474 | 2016-08-24 09:00:02 | 2016-08-24 10:00:01 | | 10 | 0.497 | 2016-08-24 10:00:01 | 2016-08-24 11:00:01 | | 10 | 0.420 | 2016-08-24 11:00:01 | 2016-08-24 12:00:01 | | 10 | 0.465 | 2016-08-24 12:00:01 | 2016-08-24 13:00:00 | | 10 | 0.437 | 2016-08-24 13:00:00 | 2016-08-24 14:00:04 | | 10 | 1.384 | 2016-08-24 14:00:04 | 2016-08-24 15:00:04 | | 10 | 0.534 | 2016-08-24 15:00:04 | 2016-08-24 16:00:04 | | 10 | 1.423 | 2016-08-24 16:00:04 | 2016-08-24 17:00:04 | | 10 | 0.518 | 2016-08-24 17:00:04 | 2016-08-24 18:00:04 | | 10 | 0.413 | 2016-08-24 18:00:04 | 2016-08-24 19:00:04 | | 10 | 0.424 | 2016-08-24 19:00:04 | 2016-08-24 20:00:04 | | 10 | 0.359 | 2016-08-24 20:00:04 | 2016-08-24 21:00:04 | | 10 | 0.401 | 2016-08-24 21:00:04 | 2016-08-24 22:00:05 | | 10 | 0.431 | 2016-08-24 22:00:05 | 2016-08-24 23:00:00 | | 10 | 0.371 | 2016-08-24 23:00:00 | 2016-08-25 00:00:01 | +------+---------------+---------------------+---------------------+ 24 rows in set (0.00 sec)
總計為 12.131。不幸的是,我的聚合查詢僅對 23 個間隔求和,而不是 24 個。它錯過了 0.371 的值。我的查詢有什麼問題?
mysql> SELECT t1.timestamp AS stop_interval, t2.register_min, -> t2.register_max, (register_max - register_min) AS register_diff -> FROM emoncms.custom_registers AS t1, -> (SELECT timestamp, ROUND(MIN(register_value),3) AS register_min, -> ROUND(MAX(register_value),3) AS register_max -> FROM emoncms.custom_registers WHERE timestamp >= '2016-08-24' -> AND timestamp <= ('2016-08-25'+ INTERVAL 1 MINUTE) -> GROUP BY YEAR(timestamp), WEEK(timestamp), DAY(timestamp) -> ) AS t2 WHERE t1.timestamp = t2.timestamp; +---------------------+--------------+--------------+---------------+ | stop_interval | register_min | register_max | register_diff | +---------------------+--------------+--------------+---------------+ | 2016-08-24 00:00:03 | 869.458 | 881.218 | 11.760 | | 2016-08-25 00:00:01 | 881.589 | 881.589 | 0.000 | +---------------------+--------------+--------------+---------------+ 2 rows in set (0.00 sec)
也許其中之一?
SELECT * FROM emoncms.custom_registers WHERE timestamp >= '2016-08-24' AND timestamp < '2016-08-24' + INTERVAL 25 HOUR; SELECT * FROM emoncms.custom_registers WHERE timestamp >= '2016-08-24' ORDER BY timestamp LIMIT 25;
注意使用 25,而不是 24。
您的查詢是無效的 sql,它似乎在 MySQL 中工作(除非您將 ONLY_FULL_GROUP_BY 添加到 @@sql_mode),但可能會產生不確定的結果。如果我理解您的意圖,您希望獲得每個節點的目前行和上一行。據我了解,視窗函式很快就會進入 MySQL,但還沒有。使用 LEAD 或 LAG 來解決您的問題很容易,但您可能無法使用這些:
select node, timestamp ts1 , lead(register_value) over (partition by node order by timestamp) - register_value as diff , lead(t1.timestamp) over (partition by node order by timestamp) ts2 from custom_registers
如果沒有視窗函式,第一步是查找目前行和下一行:
select t1.node, t1.timestamp ts1, min(t2.timestamp) ts2 from custom_registers t1 left join custom_registers t2 on t1.node = t2.node and t1.timestamp < t2.timestamp group by t1.node, t1.timestamp; +------+---------------------+---------------------+ | node | ts1 | ts2 | +------+---------------------+---------------------+ | 10 | 2016-08-24 00:00:03 | 2016-08-24 01:00:00 | | 10 | 2016-08-24 01:00:00 | 2016-08-24 02:00:03 | | 10 | 2016-08-24 02:00:03 | 2016-08-24 03:00:01 | | 10 | 2016-08-24 03:00:01 | 2016-08-24 04:00:05 | | 10 | 2016-08-24 04:00:05 | 2016-08-24 05:00:03 | | 10 | 2016-08-24 05:00:03 | 2016-08-24 06:00:01 | | 10 | 2016-08-24 06:00:01 | 2016-08-24 07:00:00 | | 10 | 2016-08-24 07:00:00 | 2016-08-24 08:00:04 | | 10 | 2016-08-24 08:00:04 | 2016-08-24 09:00:02 | | 10 | 2016-08-24 09:00:02 | 2016-08-24 10:00:01 | | 10 | 2016-08-24 10:00:01 | 2016-08-24 11:00:01 | | 10 | 2016-08-24 11:00:01 | 2016-08-24 12:00:01 | | 10 | 2016-08-24 12:00:01 | 2016-08-24 13:00:00 | | 10 | 2016-08-24 13:00:00 | 2016-08-24 14:00:04 | | 10 | 2016-08-24 14:00:04 | 2016-08-24 15:00:04 | | 10 | 2016-08-24 15:00:04 | 2016-08-24 16:00:04 | | 10 | 2016-08-24 16:00:04 | 2016-08-24 17:00:04 | | 10 | 2016-08-24 17:00:04 | 2016-08-24 18:00:04 | | 10 | 2016-08-24 18:00:04 | 2016-08-24 19:00:04 | | 10 | 2016-08-24 19:00:04 | 2016-08-24 20:00:04 | | 10 | 2016-08-24 20:00:04 | 2016-08-24 21:00:04 | | 10 | 2016-08-24 21:00:04 | 2016-08-24 22:00:05 | | 10 | 2016-08-24 22:00:05 | 2016-08-24 23:00:00 | | 10 | 2016-08-24 23:00:00 | NULL | +------+---------------------+---------------------+
一組 24 行(0.01 秒)
注意最後一行的空值,表示沒有下一行。
現在我們可以使用這個派生表並將其與原始表連接兩次:
select x.node, abs(t3.register_value - t4.register_value) as diff , x.ts1, x.ts2 from ( select t1.node, t1.timestamp ts1, min(t2.timestamp) ts2 from custom_registers t1 left join custom_registers t2 on t1.node = t2.node and t1.timestamp < t2.timestamp group by t1.node, t1.timestamp ) as x join custom_registers t3 on x.node = t3.node and x.ts1 = t3.timestamp join custom_registers t4 on x.node = t4.node and x.ts2 = t4.timestamp order by x.node, x.ts1;
這將使用您的樣本數據生成 23 行,我不確定您想對最後一行做什麼,因為那裡沒有下一行。
+------+-------+---------------------+---------------------+ | node | diff | ts1 | ts2 | +------+-------+---------------------+---------------------+ | 10 | 0.367 | 2016-08-24 00:00:03 | 2016-08-24 01:00:00 | | 10 | 0.377 | 2016-08-24 01:00:00 | 2016-08-24 02:00:03 | | 10 | 0.414 | 2016-08-24 02:00:03 | 2016-08-24 03:00:01 | | 10 | 0.361 | 2016-08-24 03:00:01 | 2016-08-24 04:00:05 | | 10 | 0.399 | 2016-08-24 04:00:05 | 2016-08-24 05:00:03 | | 10 | 0.383 | 2016-08-24 05:00:03 | 2016-08-24 06:00:01 | | 10 | 0.397 | 2016-08-24 06:00:01 | 2016-08-24 07:00:00 | | 10 | 0.473 | 2016-08-24 07:00:00 | 2016-08-24 08:00:04 | | 10 | 0.409 | 2016-08-24 08:00:04 | 2016-08-24 09:00:02 | | 10 | 0.474 | 2016-08-24 09:00:02 | 2016-08-24 10:00:01 | | 10 | 0.497 | 2016-08-24 10:00:01 | 2016-08-24 11:00:01 | | 10 | 0.420 | 2016-08-24 11:00:01 | 2016-08-24 12:00:01 | | 10 | 0.465 | 2016-08-24 12:00:01 | 2016-08-24 13:00:00 | | 10 | 0.437 | 2016-08-24 13:00:00 | 2016-08-24 14:00:04 | | 10 | 1.384 | 2016-08-24 14:00:04 | 2016-08-24 15:00:04 | | 10 | 0.534 | 2016-08-24 15:00:04 | 2016-08-24 16:00:04 | | 10 | 1.423 | 2016-08-24 16:00:04 | 2016-08-24 17:00:04 | | 10 | 0.518 | 2016-08-24 17:00:04 | 2016-08-24 18:00:04 | | 10 | 0.413 | 2016-08-24 18:00:04 | 2016-08-24 19:00:04 | | 10 | 0.424 | 2016-08-24 19:00:04 | 2016-08-24 20:00:04 | | 10 | 0.359 | 2016-08-24 20:00:04 | 2016-08-24 21:00:04 | | 10 | 0.401 | 2016-08-24 21:00:04 | 2016-08-24 22:00:05 | | 10 | 0.431 | 2016-08-24 22:00:05 | 2016-08-24 23:00:00 | +------+-------+---------------------+---------------------+ 23 rows in set (0.01 sec)
如果您只對日期總數感興趣,例如:
select node, dt, min_val, max_val, max_val-min_val as diff from ( select node, date(timestamp) dt, min(register_value) as min_val, max(register_value) as max_val from custom_registers group by node, date(timestamp) ) as t; +------+------------+---------+---------+--------+ | node | dt | min_val | max_val | diff | +------+------------+---------+---------+--------+ | 10 | 2016-08-24 | 869.458 | 881.218 | 11.760 | +------+------------+---------+---------+--------+
編輯:我現在註意到您的樣本中實際上有一個第二天的值。補充一點:
insert into custom_registers values (323,10,'2016-08-25 00:00:01',881.588); select x.node, abs(t3.register_value - t4.register_value) as diff, x.ts1, x.ts2 from (select t1.node, t1.timestamp ts1, min(t2.timestamp) ts2 from custom_registers t1 left join custom_registers t2 on t1.node = t2.node and t1.timestamp < t2.timestamp group by t1.node, t1.timestamp ) as x join custom_registers t3 on x.node = t3.node and x.ts1 = t3.timestamp join custom_registers t4 on x.node = t4.node and x.ts2 = t4.timestamp where ts1 >= '2016-08-24 00:00:00' and ts1 < '2016-08-25 00:00:00' order by x.node, x.ts1; [...] | 10 | 0.370 | 2016-08-24 23:00:00 | 2016-08-25 00:00:01 | 24 rows in set (0.01 sec)
日聚合變為:
select node, dt, min_val, max_val, max_val-min_val as diff from (select node, date(timestamp) dt, min(register_value) as min_val , max(register_value) as max_val from custom_registers group by node, date(timestamp) ) as t; +------+------------+---------+---------+--------+ | node | dt | min_val | max_val | diff | +------+------------+---------+---------+--------+ | 10 | 2016-08-24 | 869.458 | 881.218 | 11.760 | | 10 | 2016-08-25 | 881.588 | 881.588 | 0.000 | +------+------------+---------+---------+--------+ 2 rows in set (0.71 sec)