PostgreSQL

MAX over a 3-day period with dense_rank?

  • August 14, 2017

For each row, how can I get the MAX (or another aggregate function) of a column over that row's 3-day window?

SQL example with the expected output and the database schema: http://sqlfiddle.com/#!17/24686/3

CREATE TABLE public.tbl (
 date    DATE       NOT NULL,
 someNum DECIMAL    NOT NULL,
 name    VARCHAR(6) NOT NULL,
 elem    VARCHAR(9) NOT NULL,
 PRIMARY KEY ("date", someNum, "name", elem)
);

INSERT INTO public.tbl ("date", someNum, "name", elem) VALUES
 ('2017-12-05', 50.5, '0hello', 'nice elem'),
 ('2017-12-05', 05.5, '1hello', 'nice elem'),
 ('2017-12-05', 55.5, '2hello', 'nice elem'),
 ('2017-12-09', 59.5, '3hello', 'nice elem'),
 ('2017-12-09', 60.5, '4hello', 'nice elem'),
 ('2017-12-10', 90.5, '5hello', 'nice elem'),
 ('2017-12-12', 10.5, '6hello', 'nice elem'),
 ('2017-12-15', 50.3, '7hello', 'nice elem'),
 ('2017-12-30', 70.5, '8hello', 'nice elem'),
 ('2018-01-01', 50.5, '9hello', 'nice elem'),
 ('2017-12-05', 05.5, '10ello', 'mean elem'),
 ('2017-12-05', 5505, '11ello', 'mean elem'),
 ('2017-12-05', 6045, '12ello', 'mean elem'),
 ('2017-12-03', 9045, '13ello', 'mean elem'),
 ('2017-12-04', 1345, '14ello', 'mean elem'),
 ('2017-10-02', 1111, '15ello', 'mean elem'),
 ('2017-10-03', 5555, '16ello', 'mean elem'),
 ('2017-10-04', 66.6, '16ello', 'mean elem');

Desired output:

-- MAX over 3 day period
-- date         somenum     name        elem            max_over_3days
-- 2017-10-02   1111        '15ello'    'mean elem'     5555
-- 2017-10-03   5555        '16ello'    'mean elem'     5555
-- 2017-10-04   66.6        '16ello'    'mean elem'     5555
-- 2017-12-03   9045        '13ello'    'mean elem'     9045
-- 2017-12-05   6045        '12ello'    'mean elem'     9045
-- 2017-12-04   1345        '14ello'    'mean elem'     9045
-- 2017-12-05   05.5        '10ello'    'mean elem'     9045
-- 2017-12-05   5505        '11ello'    'mean elem'     9045
-- 2017-12-05   50.5        '0hello'    'nice elem'     9045
-- 2017-12-05   55.5        '2hello'    'nice elem'     9045
-- 2017-12-05   05.5        '1hello'    'nice elem'     9045
-- 2017-12-09   60.5        '4hello'    'nice elem'     90.5
-- 2017-12-09   59.5        '3hello'    'nice elem'     90.5
-- 2017-12-10   90.5        '5hello'    'nice elem'     99.5
-- 2017-12-13   99.5        '6hello'    'nice elem'     99.5
-- 2017-12-15   50.3        '7hello'    'nice elem'     99.5
-- 2017-12-30   70.5        '8hello'    'nice elem'     70.5
-- 2018-01-01   50.5        '9hello'    'nice elem'     70.5

SELECT * FROM public.tbl
GROUP BY elem, "date", name, someNum
ORDER BY elem, "date";

PS: I am also interested in how to cover 3 business days instead of calendar days.

Using CROSS JOIN LATERAL

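The output is similar to yours, but please double-check it. Here I use a lateral subquery on someNum (acting like a loop over the outer rows) to find the maximum within 3 days on either side of each row's date. I have not looked at performance.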

SELECT tbl.date, tbl.somenum, tbl.name, tbl.elem, t2.max_somenum
FROM tbl
CROSS JOIN LATERAL (
 SELECT MAX(t2.somenum) AS max_somenum
 FROM tbl t2 
 WHERE t2.date >= (tbl.date - interval '3 days') AND t2.date <= (tbl.date + interval '3 days')
) t2
ORDER BY date;

http://sqlfiddle.com/#!17/24686/15
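To make the "loop" reading concrete, the same lookup can be sketched as a correlated scalar subquery, which is evaluated once per outer row. This is only an illustrative rewrite, not part of the original answer; the aliases t and t2 are arbitrary, and it relies on PostgreSQL's date - integer arithmetic, which returns a date.

```sql
-- Sketch: per-row maximum of someNum within 3 days on either side,
-- written as a correlated subquery instead of CROSS JOIN LATERAL.
SELECT t."date", t.someNum, t."name", t.elem,
       (SELECT MAX(t2.someNum)
          FROM public.tbl AS t2
         WHERE t2."date" BETWEEN t."date" - 3 AND t."date" + 3) AS max_somenum
  FROM public.tbl AS t
 ORDER BY t."date";
```

The lateral form is the more flexible of the two when several aggregates are needed at once, since a scalar subquery can return only one column.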

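Note: it is not clear what "3 days around" means, so I simply split the date range into groups of 3 days based on the dense rank. If you want 3 days (previous date, this date, next date), you can use LEAD / LAG or, if your database supports it, look up the window frame "BETWEEN" clause in the documentation. That should do it.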
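A minimal sketch of the LEAD / LAG idea, assuming the question's public.tbl: reduce the table to one maximum per date, then compare each date with its neighbours. The names daily and max_prev_cur_next are made up for illustration. Note that "previous" and "next" here mean the neighbouring dates that exist in the table, not calendar days.

```sql
-- Sketch: maximum over (previous date, this date, next date) using LAG / LEAD.
WITH daily AS (
         SELECT "date"
              , MAX(someNum) AS maxnum
           FROM public.tbl
          GROUP BY "date"
     )
SELECT "date"
     , maxnum
       -- GREATEST skips NULLs in PostgreSQL, so the first and last dates still work.
     , GREATEST( maxnum
               , LAG(maxnum)  OVER (ORDER BY "date")
               , LEAD(maxnum) OVER (ORDER BY "date")
               ) AS max_prev_cur_next
  FROM daily
 ORDER BY "date";
```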

As for my example: I have included a few extra calculations for you to consider.

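Basically, we compute the dense rank of each row by date, then divide it by 3 (integer division, which truncates) to get a value that lets the PARTITION BY clause group the rows into sets of 3 distinct dates. The MAX window function is then applied over those partitions.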

WITH cte1 AS (
         SELECT t.*
              , ROW_NUMBER() OVER (ORDER BY date, someNum, "name") AS rn
              , ROW_NUMBER() OVER (ORDER BY date                 ) AS rn2
              , RANK()       OVER (ORDER BY date, someNum, "name") AS r
              , RANK()       OVER (ORDER BY date                 ) AS r2
              , DENSE_RANK() OVER (ORDER BY date                 ) AS r3
           FROM public.tbl AS t
    )
  , cte2 AS (
         SELECT c1.*
              , r3/3                 AS part
              , ((r3-1)/3)           AS part2
           FROM cte1 AS c1
    )
SELECT c2.*
    , SUM(someNum) OVER (PARTITION BY part2) AS sumv
    , MIN(someNum) OVER (PARTITION BY part2) AS minv
    , MAX(someNum) OVER (PARTITION BY part2) AS maxv
 FROM cte2 AS c2
ORDER BY elem, date
;
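The difference between part and part2 in the query above is easier to see on bare numbers. The following sketch stands in for the dense rank with generate_series (an assumption made purely for illustration) and applies the same two integer divisions.

```sql
-- Sketch: how r3/3 and (r3-1)/3 bucket consecutive dense-rank values.
SELECT r3
     , r3 / 3       AS part
     , (r3 - 1) / 3 AS part2
  FROM generate_series(1, 7) AS g(r3);
--  r3 | part | part2
--   1 |    0 |     0
--   2 |    0 |     0
--   3 |    1 |     0
--   4 |    1 |     1
--   5 |    1 |     1
--   6 |    2 |     1
--   7 |    2 |     2
```

Because DENSE_RANK starts at 1, r3/3 leaves only two dates in the first group, while (r3-1)/3 gives full groups of three. That is why the window functions partition by part2.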

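On further review, if you want the maximum over 3 days (1 day before, the current day, 1 day after), your expected results appear to be wrong. Here is a solution for that case: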

http://sqlfiddle.com/#!17/24686/26

WITH cte1 AS (
         SELECT t.date
              , MAX(t.someNum) AS maxnum
           FROM public.tbl AS t
          GROUP BY t.date
    )
  , cte2 AS (
         SELECT c1.date
              , c1.maxnum
              , MAX(maxnum) OVER (ORDER BY date ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS maxnum2
           FROM cte1 AS c1
    )
SELECT t.*
    , c2.maxnum2
 FROM public.tbl AS t
 JOIN cte2       AS c2
   ON c2.date = t.date
ORDER BY t.elem, t.date
;

Also, if your database supports it, see: "MAX(someNum) OVER (ORDER BY date RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING)"
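In PostgreSQL specifically, a RANGE frame with an offset needs version 11 or later, and for a date column the offset is written as an interval rather than a bare number. A sketch under those assumptions (the column alias max_3_calendar_days is made up):

```sql
-- Sketch: maximum over the calendar window [date - 1 day, date + 1 day].
-- Requires PostgreSQL 11+ (RANGE with offset PRECEDING / FOLLOWING).
SELECT t.*
     , MAX(someNum) OVER (
           ORDER BY "date"
           RANGE BETWEEN INTERVAL '1 day' PRECEDING
                     AND INTERVAL '1 day' FOLLOWING
       ) AS max_3_calendar_days
  FROM public.tbl AS t
 ORDER BY elem, "date";
```

Unlike the ROWS frame over per-date maxima, this looks at real calendar distance, so a date whose neighbours are more than one day away only sees its own rows.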

Source: https://dba.stackexchange.com/questions/183368