MySQL

Summing day-by-day totals from multiple joins returns unexpected results

  • December 11, 2014

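I am trying to build a table that sums task hours and group hours day by day. I can get the result I want for tasks and for groups separately, but when I try to get both in the same query, I get unexpected results.

Here is my test:

Sample data: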

CREATE TABLE IF NOT EXISTS `groups` (
 `id` int(11) NOT NULL,
 `name` varchar(200) NOT NULL,
 `hours` float NOT NULL,
 `created` datetime NOT NULL
);


INSERT INTO `groups` (`id`, `name`, `hours`, `created`) VALUES
(1, 'Description of job 1', 11, '2014-12-02 10:09:52'),
(2, 'Description of job 2', 10, '2014-12-04 10:09:52'),
(3, 'Description of job 3', 25, '2014-12-11 10:09:52');


CREATE TABLE IF NOT EXISTS `tasks` (
 `id` int(7) NOT NULL,
 `groupid` int(11) NOT NULL,
 `hours` int(5) NOT NULL,
 `text` text NOT NULL,
 `created` datetime NOT NULL
);

INSERT INTO `tasks` (`id`, `groupid`, `hours`, `text`, `created`) VALUES
(1, 1, 1, 'Some task on job 1', '2014-12-03 10:10:00'),
(2, 1, 2, 'Some task on job 1', '2014-12-04 10:10:00'),
(3, 1, 3, 'Some task on job 1', '2014-12-10 10:10:00'),
(4, 2, 5, 'Some task on job 2', '2014-12-05 10:10:00'),
(5, 2, 5, 'Some task on job 2', '2014-12-06 10:10:00'),
(6, 2, 1, 'Some task on job 2', '2014-12-08 10:10:00');


CREATE TABLE IF NOT EXISTS `datetable` (
 `thedate` datetime NOT NULL
);

INSERT INTO `datetable` (`thedate`) VALUES
('2014-11-28 00:00:00'),
('2014-11-29 00:00:00'),
('2014-11-30 00:00:00'),
('2014-12-01 00:00:00'),
('2014-12-02 00:00:00'),
('2014-12-03 00:00:00'),
('2014-12-04 00:00:00'),
('2014-12-05 00:00:00'),
('2014-12-06 00:00:00'),
('2014-12-07 00:00:00'),
('2014-12-08 00:00:00'),
('2014-12-09 00:00:00'),
('2014-12-10 00:00:00'),
('2014-12-11 00:00:00'),
('2014-12-12 00:00:00'),
('2014-12-13 00:00:00');

Now, by running the query below, I get for each day the total hours of all tasks created on that day or on any earlier day.

SELECT
DATE_FORMAT(dt.thedate, '%Y-%m-%d') as the_date,
SUM(tt.sum_task) as sum_t

FROM datetable dt

LEFT JOIN   (
               SELECT
               DATE(tx.created) as created_date,
               SUM(tx.hours) as sum_task

               FROM tasks tx

                -- Some extra where clauses here

               GROUP BY tx.created
           )
           AS tt ON DATE(tt.created_date) <= DATE(dt.thedate)


GROUP BY dt.thedate

ORDER BY dt.thedate ASC

Fiddle: http://sqlfiddle.com/#!2/38fe4/2/0

OK. Now I want the same kind of column for groups, so I add it in the same way:

SELECT
DATE_FORMAT(dt.thedate, '%Y-%m-%d') as the_date,
SUM(tt.sum_task) as sum_t,
SUM(tg.sum_group) as sum_g

FROM datetable dt

LEFT JOIN   (
               SELECT
               DATE(tx.created) as created_date,
               SUM(tx.hours) as sum_task

               FROM tasks tx

                -- Some extra where clauses here

               GROUP BY tx.created
           )
           AS tt ON DATE(tt.created_date) <= DATE(dt.thedate)

LEFT JOIN   (
               SELECT
               DATE(gx.created) as created_date2,
               SUM(gx.hours) as sum_group

               FROM `groups` gx

                -- Some extra where clauses here

               GROUP BY gx.created
           )
           AS tg ON DATE(tg.created_date2) <= DATE(dt.thedate)

GROUP BY dt.thedate

ORDER BY dt.thedate ASC

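Fiddle: http://sqlfiddle.com/#!2/38fe4/1

But now the numbers I get seem to have been added up several times over.

When I run the query with only one LEFT JOIN, I get the result I want, but when I try to join both, I get unexpected results.

What exactly is happening here, and how can I output both group hours and task hours without the numbers being multiplied?

Expected result: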

the_date        sum_t   sum_g
2014-11-28      NULL    NULL
2014-11-29      NULL    NULL
2014-11-30      NULL    NULL
2014-12-01      NULL    NULL
2014-12-02      NULL    11
2014-12-03      1       11
2014-12-04      3       21
2014-12-05      8       21
2014-12-06      13      21
2014-12-07      13      21
2014-12-08      14      21
2014-12-09      14      21
2014-12-10      17      21
2014-12-11      17      46
2014-12-12      17      46
2014-12-13      17      46

Actual result:

the_date        sum_t   sum_g
2014-11-28      NULL    NULL
2014-11-29      NULL    NULL
2014-11-30      NULL    NULL
2014-12-01      NULL    NULL
2014-12-02      NULL    11
2014-12-03      1       11
2014-12-04      6       42
2014-12-05      16      63
2014-12-06      26      84
2014-12-07      26      84
2014-12-08      28      105
2014-12-09      28      105
2014-12-10      34      126
2014-12-11      51      276
2014-12-12      51      276
2014-12-13      51      276

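At first I thought you would also need a GROUP BY on datetable inside the derived tables to avoid the cross join, but datetable already holds distinct dates, so that is not the cause. The problem lies in the (multiple) joins on <=. For each date, every matching row of tt is paired with every matching row of tg. That is a partial cross join, so each sum is multiplied by the number of matching rows on the other side, which gives the wrong results.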
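The multiplication can be made visible by counting rows instead of summing hours. The following is a diagnostic sketch against the sample tables from the question, using the same derived tables as the failing query. The aliases task_rows, group_rows and joined_rows are illustrative names, and `groups` is backtick-quoted because GROUPS is a reserved word from MySQL 8.0 onward.

```sql
-- Diagnostic sketch: how many rows does each date fan out to?
SELECT
    DATE_FORMAT(dt.thedate, '%Y-%m-%d') AS the_date,
    COUNT(DISTINCT tt.created_date)     AS task_rows,    -- matching rows from tt
    COUNT(DISTINCT tg.created_date2)    AS group_rows,   -- matching rows from tg
    COUNT(*)                            AS joined_rows   -- rows actually fed to SUM()
FROM datetable AS dt
LEFT JOIN (
    SELECT DATE(tx.created) AS created_date, SUM(tx.hours) AS sum_task
    FROM tasks AS tx
    GROUP BY tx.created
) AS tt ON DATE(tt.created_date) <= DATE(dt.thedate)
LEFT JOIN (
    SELECT DATE(gx.created) AS created_date2, SUM(gx.hours) AS sum_group
    FROM `groups` AS gx
    GROUP BY gx.created
) AS tg ON DATE(tg.created_date2) <= DATE(dt.thedate)
GROUP BY dt.thedate
ORDER BY dt.thedate;
```

For 2014-12-04, two task rows and two group rows qualify, so the join produces 2 x 2 = 4 rows. Each task total is then counted twice (3 becomes 6) and each group total twice (21 becomes 42), which matches the actual result in the question.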

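The solution, therefore, is to do the LEFT JOIN on <= inside each derived table, and to join the derived tables at the outer level on equality (=), with no outer GROUP BY.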

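A second problem comes from not using GROUP BY DATE(DateColumn) in the original derived tables: grouping by the raw datetime yields one row per timestamp rather than one row per day. Group by the date instead, and the ON condition can be simplified as well.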
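To illustrate the GROUP BY point, the two statements below can be compared. This is a minimal sketch on the tasks table from the question. With the sample data both return the same rows, because every task falls on a different day; they would differ as soon as two tasks shared a date but not a time.

```sql
-- Groups by the full DATETIME: one row per distinct timestamp,
-- so a single day can appear more than once in the output.
SELECT DATE(tx.created) AS created_date, SUM(tx.hours) AS sum_task
FROM tasks AS tx
GROUP BY tx.created;

-- Groups by calendar day: exactly one row per date.
SELECT DATE(tx.created) AS created_date, SUM(tx.hours) AS sum_task
FROM tasks AS tx
GROUP BY DATE(tx.created);
```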

I also changed the condition DATE(tx.created) <= DATE(dt.thedate) to tx.created < (dt.thedate + INTERVAL 1 DAY), so that it can use an index on (created, hours).
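The indexes mentioned here could be created along the following lines. This is only a sketch: the index names are hypothetical, `groups` is backtick-quoted for MySQL 8.0 and later, and whether the optimizer actually picks the index depends on the data and the server version. The rewritten condition is equivalent to the original one only because every thedate value is stored at midnight.

```sql
-- Hypothetical index names; the column order (created, hours) is the one
-- suggested in the answer, so the range scan on created can also supply hours.
CREATE INDEX idx_tasks_created_hours  ON tasks    (created, hours);
CREATE INDEX idx_groups_created_hours ON `groups` (created, hours);

-- Wrapping the column in a function hides it from the index:
--     ON DATE(tx.created) <= DATE(dt.thedate)
-- A bare column compared against an expression can use a range scan:
--     ON tx.created < (dt.thedate + INTERVAL 1 DAY)
```

The rewritten query below applies this range condition inside each derived table.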

SELECT
DATE_FORMAT(dt.thedate, '%Y-%m-%d') AS the_date,
COALESCE(tt.sum_task, 0) AS sum_t,
COALESCE(tg.sum_group, 0) AS sum_g

FROM datetable AS dt

LEFT JOIN   (                                  -- this could be an INNER JOIN, no difference
               SELECT
               dt.thedate,
               SUM(tx.hours) as sum_task

               FROM datetable AS dt
                 LEFT JOIN tasks AS tx
                   ON tx.created < (dt.thedate + INTERVAL 1 DAY)

                -- Some extra where clauses here   -- move them to the ON above

               GROUP BY dt.thedate
           )
           AS tt ON tt.thedate = dt.thedate

LEFT JOIN   (                                 -- this could be an INNER JOIN, no difference
               SELECT
               dt.thedate,
               SUM(gx.hours) as sum_group

               FROM datetable AS dt
                 LEFT JOIN `groups` AS gx      -- quoted: GROUPS is reserved in MySQL 8.0+
                   ON gx.created < (dt.thedate + INTERVAL 1 DAY)

                -- Some extra where clauses here   -- move them to the ON above

               GROUP BY dt.thedate
           )
           AS tg ON tg.thedate = dt.thedate

ORDER BY dt.thedate ;

Tested at **SQLFiddle**
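On MySQL 8.0 or later, the same running totals can also be written with window functions. This is a different technique from the one in the answer above, sketched here on the assumption that window functions are available (they are not in MySQL 5.7 and earlier). The aliases t_day, g_day and day_hours are illustrative names.

```sql
-- Alternative sketch for MySQL 8.0+: aggregate per day first, join on
-- equality (at most one row per side per date, so no fan-out), then let
-- a window function accumulate the daily totals.
SELECT
    DATE_FORMAT(dt.thedate, '%Y-%m-%d')               AS the_date,
    SUM(t_day.day_hours) OVER (ORDER BY dt.thedate)   AS sum_t,
    SUM(g_day.day_hours) OVER (ORDER BY dt.thedate)   AS sum_g
FROM datetable AS dt
LEFT JOIN (
    SELECT DATE(created) AS d, SUM(hours) AS day_hours
    FROM tasks
    GROUP BY DATE(created)
) AS t_day ON t_day.d = DATE(dt.thedate)
LEFT JOIN (
    SELECT DATE(created) AS d, SUM(hours) AS day_hours
    FROM `groups`
    GROUP BY DATE(created)
) AS g_day ON g_day.d = DATE(dt.thedate)
ORDER BY dt.thedate;
```

Unlike the COALESCE version above, this leaves the totals as NULL until the first task or group appears, as in the expected result in the question.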

Source: https://dba.stackexchange.com/questions/85886