Db2
last_value 在具有附加維度的左連接中忽略 null?
這只是我開始考慮的一個愚蠢的例子,但似乎無法提出一個優雅的解決方案。假設一個日曆表(我將使用 Db2 語法):
create table calendar ( dt date not null primary key ); insert into calendar (dt) with t (dt) as ( values cast('2020-01-01' as date) union all select dt + 1 day from t where dt < '2020-01-11') select dt from t ;
和一個保持平衡的表:
create table balance ( dt date not null , amount int not null , primary key (dt) ); insert into balance (dt, amount) values ('2020-01-03',100) ,('2020-01-05', -50);
如果我們想複製最後一個已知餘額,我們可以使用 LAST_VALUE 和 ‘IGNORE NULLS’,例如:
select c.dt, last_value(b.amount, 'IGNORE NULLS') over (order by c.dt) from calendar c left join balance b on c.dt = b.dt;
但是,如果我們添加一個維度,比如 cid (customer_id),那麼 last_value 的含義就不再明顯了。通常我們會按 cid 進行分區,但 cid 由於左連接而失去:
create table balance1 ( cid int not null , dt date not null , amount int not null , primary key (cid, dt) ); insert into balance1 (cid, dt, amount) values (1, '2020-01-03',100) ,(1, '2020-01-05', -50) , (2, '2020-01-04',75), (2, '2020-01-08',55), (2, '2020-01-10', -35); select c.dt, last_value(b.amount, 'IGNORE NULLS') over (partition by ? order by c.dt) from calendar c left join balance b on c.dt = b.dt;
我能想到的最好辦法是在日曆和余額中的不同客戶之間使用笛卡爾積:
select cid, dt, last_value(amount, 'IGNORE NULLS') over (partition by cid order by dt) from ( select cid, dt, amount from balance1 union select distinct b.cid, c.dt, null from balance1 b cross join calendar c where not exists ( select 1 from balance1 where dt = c.dt and cid = b.cid ) ) t order by dt, cid ;
不是那麼漂亮,我正在尋找一個更優雅的解決方案。我在上面和這個Fiddle中使用了 Db2 語法,但這是我所追求的原則,所以任何供應商語法都可以。
為什麼不是這個?
SELECT def.cid, c.dt, LAST_VALUE(b.amount, 'IGNORE NULLS') OVER(PARTITION BY def.cid ORDER BY c.dt) FROM calendar c INNER JOIN (SELECT DISTINCT cid FROM balance1) AS def(cid) ON 1=1 LEFT JOIN balance1 b ON (c.dt, def.cid) = (b.dt, b.cid) ORDER BY c.dt, def.cid ;
dbfiddle中的比較