Redshift
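Monthly summary of active users based on a start/end date roster
I have a table that records who participated in a project, from a start date to an end date. I need some help writing a query that returns the number of "active" users at the end of each month since the beginning of this year.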
DROP TABLE "public"."roster";

CREATE TABLE "public"."roster" (
    "id" int,
    "user_id" int,
    "project_id" int,
    "start_date" datetime,
    "end_date" datetime,
    "closed_date" datetime,
    PRIMARY KEY ("id")
);

INSERT INTO "public"."roster" ("id", "user_id", "project_id", "start_date", "end_date", "closed_date")
VALUES (1, 1, 1, '2019-05-27 00:00:00', '2021-01-15 00:00:00', NULL);
INSERT INTO "public"."roster" ("id", "user_id", "project_id", "start_date", "end_date", "closed_date")
VALUES (2, 2, 2, '2020-05-27 00:00:00', '2021-02-01 00:00:00', '2021-02-05 00:00:00');
INSERT INTO "public"."roster" ("id", "user_id", "project_id", "start_date", "end_date", "closed_date")
VALUES (3, 3, 3, '2020-05-27 00:00:00', '2024-02-01 00:00:00', '2021-02-05 00:00:00');
INSERT INTO "public"."roster" ("id", "user_id", "project_id", "start_date", "end_date", "closed_date")
VALUES (4, 4, 4, '2020-05-27 00:00:00', '2021-03-05 00:00:00', NULL);
id  user_id  project_id  start_date           end_date             closed_date
1   1        1           2019-05-27 00:00:00  2021-01-15 00:00:00  NULL
2   2        2           2020-05-27 00:00:00  2021-02-01 00:00:00  2021-02-05 00:00:00
3   3        3           2020-05-27 00:00:00  2024-02-01 00:00:00  2021-02-05 00:00:00
4   4        4           2020-05-27 00:00:00  2021-03-05 00:00:00  NULL
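The result should show, for each month, how many distinct users had an active project (starting only from the beginning of this year).
So for the dataset above, we can see that all 4 projects were "active" in January 2021, because their end dates are in the future.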
2021-01-31 | 4 (4 projects were active in some way, during January)
2021-02-28 | 3 (3 projects were active in some way, during February)
2021-03-31 | 1 (1 project was active in some way, during March)
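One last complication: a project can sometimes be closed before its end date, and I want to exclude any user whose end date is in the future but whose project has actually been closed.
For example, in the dataset above, the third project has an end date in February 2024, but the project was closed in February 2021. So technically that person was active in January and February 2021, but not in March 2021 or later.
P.S. I'm hoping for an answer in Postgres (I'm using Redshift).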
A solution for PostgreSQL:
WITH cte AS (
    SELECT '2021-01-01'::DATE AS month_start, '2021-01-31'::DATE AS month_end
    UNION ALL
    SELECT '2021-02-01', '2021-02-28'
    UNION ALL
    SELECT '2021-03-01', '2021-03-31'
)
SELECT cte.month_end, COUNT(*)
FROM roster
CROSS JOIN cte
WHERE start_date <= month_end
  AND LEAST(end_date, closed_date) >= month_start
GROUP BY cte.month_end
ORDER BY cte.month_end;
or
WITH cte AS (
    SELECT '2021-01-31'::DATE AS month_end
    UNION ALL
    SELECT '2021-02-28'
    UNION ALL
    SELECT '2021-03-31'
)
SELECT cte.month_end, COUNT(*)
FROM roster
CROSS JOIN cte
WHERE start_date <= month_end
  AND LEAST(end_date, closed_date) >= DATE_TRUNC('month', month_end)
GROUP BY cte.month_end
ORDER BY cte.month_end;
https://dbfiddle.uk/?rdbms=postgres_12&fiddle=710260e9fa44a89cc9c2d536739f7c92
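Replace the cte code with whatever you need: for the months of interest, generate the first and last day of each month in the first variant, or only the last day of each month in the second variant. For example, use a date generator that starts from YYYY-MM-01 of the month following the first month of interest and advances in steps of one month, then subtract one month to get the first day of the month and one day to get its last day.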
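The date-generator idea can be sketched roughly as follows for PostgreSQL. This is only an illustration under some assumptions: it uses generate_series, the column aliases next_month_start and active_users are made up for the example, the range runs from January of the current year through the current month, and COUNT(DISTINCT user_id) is swapped in for COUNT(*) because the question asks for distinct users. On Redshift, generate_series may not be usable in a query that also reads user tables, so the month list might have to come from a recursive CTE or a calendar table instead.

```sql
-- Sketch only: build the month list instead of hard-coding it.
WITH cte AS (
    SELECT
        -- one month back from the 1st of the next month = first day of the month
        (next_month_start - INTERVAL '1 month')::DATE AS month_start,
        -- one day back from the 1st of the next month = last day of the month
        (next_month_start - INTERVAL '1 day')::DATE   AS month_end
    FROM generate_series(
             -- 1st of February of the current year (the month after January)
             DATE_TRUNC('year', CURRENT_DATE::TIMESTAMP) + INTERVAL '1 month',
             -- 1st of the month after the current month
             DATE_TRUNC('month', CURRENT_DATE::TIMESTAMP) + INTERVAL '1 month',
             INTERVAL '1 month'
         ) AS gs(next_month_start)
)
SELECT cte.month_end,
       COUNT(DISTINCT roster.user_id) AS active_users  -- assumption: count users, not roster rows
FROM roster
CROSS JOIN cte
WHERE roster.start_date <= cte.month_end
  -- LEAST skips NULLs in PostgreSQL, so a NULL closed_date falls back to end_date
  AND LEAST(roster.end_date, roster.closed_date) >= cte.month_start
GROUP BY cte.month_end
ORDER BY cte.month_end;
```

Because of the inner join semantics, months with no active users produce no row at all, and the last row covers the current, still incomplete month.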