Redshift

Monthly summary of active users based on a roster with start/end dates

  • March 23, 2021

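I have a table that records who took part in a project, from a start date to an end date. I need some help writing a query that returns the number of "active" users at the end of each month since the beginning of this year.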

DROP TABLE IF EXISTS "public"."roster";
CREATE TABLE "public"."roster" ("id" int,"user_id" int,"project_id" int,"start_date" timestamp,"end_date" timestamp,"closed_date" timestamp, PRIMARY KEY ("id"));

INSERT INTO "public"."roster" ("id", "user_id", "project_id", "start_date", "end_date", "closed_date") VALUES
(1, 1, 1, '2019-05-27 00:00:00', '2021-01-15 00:00:00', NULL);

INSERT INTO "public"."roster" ("id", "user_id", "project_id", "start_date", "end_date", "closed_date") VALUES
(2, 2, 2, '2020-05-27 00:00:00', '2021-02-01 00:00:00', '2021-02-05 00:00:00');

INSERT INTO "public"."roster" ("id", "user_id", "project_id", "start_date", "end_date", "closed_date") VALUES
(3, 3, 3, '2020-05-27 00:00:00', '2024-02-01 00:00:00', '2021-02-05 00:00:00');

INSERT INTO "public"."roster" ("id", "user_id", "project_id", "start_date", "end_date", "closed_date") VALUES
(4, 4, 4, '2020-05-27 00:00:00', '2021-03-05 00:00:00', NULL);
id  user_id project_id  start_date            end_date              closed_date
1   1       1           2019-05-27 00:00:00   2021-01-15 00:00:00   NULL
2   2       2           2020-05-27 00:00:00   2021-02-01 00:00:00   2021-02-05 00:00:00
3   3       3           2020-05-27 00:00:00   2024-02-01 00:00:00   2021-02-05 00:00:00
4   4       4           2020-05-27 00:00:00   2021-03-05 00:00:00   NULL

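The result should show how many distinct users had an active project in each month (only from the beginning of this year).

So for the dataset above, we can see that all 4 projects were "active" in January 2021, because none of them had ended before the month began.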

2021-01-31 | 4 (4 projects were active in some way during January)
2021-02-28 | 3 (3 projects were active in some way during February)
2021-03-31 | 1 (1 project was active in some way during March)

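One final complication: a project can sometimes be closed before its end date, and I want to exclude any user whose end date is in the future but whose project has actually been closed.

For example, in the dataset above, the third project has an end date of February 2024, but the project was closed in February 2021. So technically that person was active in January and February 2021, but not in March 2021 or later.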

P.S. I'm hoping for an answer in Postgres (I am using Redshift).

Solution for PostgreSQL:

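-- One row per month of interest: its first and last day.
WITH cte AS ( SELECT '2021-01-01'::DATE AS month_start, '2021-01-31'::DATE AS month_end UNION ALL
              SELECT '2021-02-01', '2021-02-28' UNION ALL
              SELECT '2021-03-01', '2021-03-31' )

-- A roster row is active in a month if it started on or before the month's
-- last day and its effective end (the earlier of end_date and closed_date)
-- falls on or after the month's first day.
SELECT cte.month_end, COUNT(*)
FROM roster
CROSS JOIN cte
WHERE start_date <= month_end
  AND LEAST(end_date, closed_date) >= month_start
GROUP BY cte.month_end
ORDER BY cte.month_end;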

or

-- Only the last day of each month is listed; the first day is derived with DATE_TRUNC.
WITH cte AS ( SELECT '2021-01-31'::DATE AS month_end UNION ALL
              SELECT '2021-02-28' UNION ALL
              SELECT '2021-03-31' )

SELECT cte.month_end, COUNT(*)
FROM roster
CROSS JOIN cte
WHERE start_date <= month_end
  AND LEAST(end_date, closed_date) >= DATE_TRUNC('month', month_end)
GROUP BY cte.month_end
ORDER BY cte.month_end;

https://dbfiddle.uk/?rdbms=postgres_12&fiddle=710260e9fa44a89cc9c2d536739f7c92
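
The question asks for distinct users, while the queries above count roster rows. The two agree on the sample data, but they would differ if a user appeared in roster for more than one project. A minimal sketch of a distinct-user variant, assuming that user_id identifies a person and using the hypothetical alias active_users for the count, could look like this. It relies on PostgreSQL's LEAST ignoring NULL arguments, so rows where closed_date is NULL fall back to end_date; whether Redshift treats NULL the same way is an assumption worth verifying there.

```sql
-- Sketch only: the second variant above, counting distinct users
-- instead of roster rows. active_users is an illustrative alias.
WITH cte AS ( SELECT '2021-01-31'::DATE AS month_end UNION ALL
              SELECT '2021-02-28' UNION ALL
              SELECT '2021-03-31' )

SELECT cte.month_end,
       COUNT(DISTINCT roster.user_id) AS active_users
FROM roster
CROSS JOIN cte
WHERE roster.start_date <= cte.month_end
  -- LEAST skips NULL in PostgreSQL, so an open project uses end_date.
  AND LEAST(roster.end_date, roster.closed_date) >= DATE_TRUNC('month', cte.month_end)
GROUP BY cte.month_end
ORDER BY cte.month_end;
```

With the four rows from the INSERT statements in the question, this yields 4 for 2021-01-31, 3 for 2021-02-28 and 1 for 2021-03-31.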

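Replace the cte code with whatever you need: in the first variant it must produce the first and last day of each month of interest, in the second variant only the last day of each month. For example, use a date generator that produces the first day (YYYY-MM-01) of the month following each month of interest, in steps of one month, then subtract one month to get the first day of the month of interest and one day to get its last day.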
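
The date generator described above can be sketched in PostgreSQL with generate_series. This is only an illustration under stated assumptions: the months of interest are taken to run from January of the current year through the current month, and next_month is a hypothetical column name. On Redshift, generate_series may not be usable in a query that also reads a regular table, so the cte may need to be built another way there.

```sql
-- Sketch only: generate month boundaries instead of hard-coding them.
-- next_month is the first day of the month AFTER each month of interest.
WITH cte AS (
    SELECT (next_month - INTERVAL '1 month')::DATE AS month_start,
           (next_month - INTERVAL '1 day')::DATE   AS month_end
    FROM generate_series(
             DATE_TRUNC('year',  CURRENT_DATE::TIMESTAMP) + INTERVAL '1 month',
             DATE_TRUNC('month', CURRENT_DATE::TIMESTAMP) + INTERVAL '1 month',
             INTERVAL '1 month'
         ) AS g(next_month)
)

SELECT cte.month_end, COUNT(*)
FROM roster
CROSS JOIN cte
WHERE start_date <= month_end
  AND LEAST(end_date, closed_date) >= month_start
GROUP BY cte.month_end
ORDER BY cte.month_end;
```

Because of the GROUP BY, months in which no roster row is active do not appear in the result at all; a LEFT JOIN from cte to roster would be needed to report them with a count of 0.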

Source: https://dba.stackexchange.com/questions/287519