Window-Functions

計算給定時間的累積金額

  • September 15, 2021

我想找出在 P 小時內花費超過 D 美元的客戶。假設 D=10 美元,P=48。然後在下面的交易數據中:

CUSTOMER         STAMP        AMOUNT
--------         -----        ------
  A     2021-06-02 00:22:53    1.44
  A     2021-06-02 06:24:17    1.51
  A     2021-06-03 07:09:45    2.73
  A     2021-06-03 15:57:30    3.92
  A     2021-06-04 06:41:21    1.83
  B     2021-06-01 02:50:22    2.65
  B     2021-06-03 07:01:36    4.05
  B     2021-06-04 05:20:10    3.30
  B     2021-06-04 09:53:53    2.64
  B     2021-06-04 14:54:00    2.26
  C     2021-06-01 16:01:38    2.61
  C     2021-06-01 23:38:25    1.16
  C     2021-06-02 14:41:02    2.82
  C     2021-06-03 00:28:37    1.54
  C     2021-06-03 02:06:46    1.19
  C     2021-06-04 17:16:29    2.05

客戶 B 符合標準,客戶 A 和 C 不符合(即使他們的總支出 > 10 美元)。

我知道如何使用視窗函式,但在這種情況下,我事先不知道視窗應該有多大。


我認為答案是這樣的:

select
 customer
, stamp
, amount
, amount > coalesce(sum(amount) over (order by extract('epoch_second', stamp)
range between 2*24*60*60 preceding and current row), 0) as is_alert
from t
where amount is not null
order by stamp

但我的數據庫是雪花,它似乎不支持這種語法。該頁面說:

對於累積窗框: … RANGE 類似於 ROWS,除了它只計算與目前行具有相同值的行的結果(根據指定的 ORDER BY 子句)。

Snowflake 確實支持執行您的要求,但語法不同:

這是腳本

CREATE SCHEMA IF NOT EXISTS demo;

create or replace table demo.customer_transactions(
 transaction_id integer identity
,customer varchar
,txn_date timestamp_ltz
,amount float
);

//
//CUSTOMER         STAMP        AMOUNT
//--------         -----        ------
insert into demo.customer_transactions (customer, txn_date, amount)
values
('A',     '2021-06-02 00:22:53',    1.44)
,('A',     '2021-06-02 06:24:17',    1.51)
,('A',     '2021-06-03 07:09:45',    2.73)
,('A',     '2021-06-03 15:57:30',    3.92)
,('A',     '2021-06-04 06:41:21',    1.83)
,('B',     '2021-06-01 02:50:22',    2.65)
,('B',     '2021-06-03 07:01:36',    4.05)
,('B',     '2021-06-04 05:20:10',   3.30)
,('B',     '2021-06-04 09:53:53',    2.64)
,('B',     '2021-06-04 14:54:00',    2.26)
,('C',    '2021-06-01 16:01:38',    2.61)
,('C',     '2021-06-01 23:38:25',    1.16)
,('C',     '2021-06-02 14:41:02',    2.82)
,('C',     '2021-06-03 00:28:37',    1.54)
,('C',     '2021-06-03 02:06:46',    1.19)
,('C',     '2021-06-04 17:16:29',    2.05)
;


select
*
,case when window_transaction_amount > 10.00 then 'warn' else 'ok' end as is_alert
from
(
select 
a.transaction_id
,a.customer
,a.txn_date
,a.txn_date as window_start_date
,dateadd(hour,48, a.txn_date) as window_end_date
,a.amount
//,array_construct(b.transaction_id) as b_transactions
,listagg(distinct b.transaction_id,',') within group (order by b.transaction_id) as window_transaction_ids
,sum(b.amount) as window_transaction_amount
from demo.customer_transactions a left outer join demo.customer_transactions b on a.customer = b.customer and b.txn_date between a.txn_date and dateadd(hour,48, a.txn_date)
group by 
 a.transaction_id
,a.customer
,a.txn_date
,dateadd(hour,48, a.txn_date)
,a.amount
) q
order by 1,2,3;

你得到的輸出是這樣的:在此處輸入圖像描述

引用自:https://dba.stackexchange.com/questions/295139