Sql-Server

根據之前的月末值填充缺失數據

  • August 16, 2017

給定以下數據:

create table #histories
(
   username varchar(10),
   account varchar(10),
   assigned date  
);

insert into #histories 
values 
('PHIL','ACCOUNT1','2017-01-04'),
('PETER','ACCOUNT1','2017-01-15'),
('DAVE','ACCOUNT1','2017-03-04'),
('ANDY','ACCOUNT1','2017-05-06'),
('DAVE','ACCOUNT1','2017-05-07'),
('FRED','ACCOUNT1','2017-05-08'),
('JAMES','ACCOUNT1','2017-08-05'),
('DAVE','ACCOUNT2','2017-01-02'),
('PHIL','ACCOUNT2','2017-01-18'),
('JOSH','ACCOUNT2','2017-04-08'),
('JAMES','ACCOUNT2','2017-04-09'),
('DAVE','ACCOUNT2','2017-05-06'),
('PHIL','ACCOUNT2','2017-05-07') ; 

…表示將給定使用者分配給帳戶的時間。

我正在尋找在每個月的最後一天確定誰擁有給定帳戶(指定日期是帳戶轉移所有權的日期),並填充任何缺少的月末dates(可能是從我可用的方便表創建的,具有有用的列DateKey,DateLastDayOfMonth,

$$ courtesy of @AaronBertrand $$) 1 . 期望的結果是:

PETER, ACCOUNT1, 2017-01-31
PETER, ACCOUNT1, 2017-02-28
DAVE, ACCOUNT1, 2017-03-31
DAVE, ACCOUNT1, 2017-04-30
FRED, ACCOUNT1, 2017-05-31
FRED, ACCOUNT1, 2017-06-30
FRED, ACCOUNT1, 2017-07-31
JAMES, ACCOUNT1, 2017-08-31
PHIL, ACCOUNT2, 2017-01-31
PHIL, ACCOUNT2, 2017-02-28
PHIL, ACCOUNT2, 2017-03-31
JAMES, ACCOUNT2, 2017-04-30
PHIL, ACCOUNT2, 2017-05-31

使用視窗函式執行此操作的初始部分是微不足道的,它添加了我正在努力解決的“缺失”行。

解決此問題的一種方法是執行以下操作:

  1. LEAD在 SQL Server 2008 上進行仿真。您可以APPLY為此使用或 suquery。
  2. 對於沒有下一行的行,將其帳戶日期添加一個月。
  3. 加入包含月末日期的維度表。這將消除所有不跨越至少一個月的行,並根據需要添加行以填補空白。

我稍微修改了您的測試數據以使結果具有確定性。還添加了一個索引:

create table #histories
(
   username varchar(10),
   account varchar(10),
   assigned date  
);

insert into #histories 
values 
('PHIL','ACCOUNT1','2017-01-04'),
('PETER','ACCOUNT1','2017-01-15'),
('DAVE','ACCOUNT1','2017-03-04'),
('ANDY','ACCOUNT1','2017-05-06'),
('DAVE','ACCOUNT1','2017-05-07'),
('FRED','ACCOUNT1','2017-05-08'),
('JAMES','ACCOUNT1','2017-08-05'),
('DAVE','ACCOUNT2','2017-01-02'),
('PHIL','ACCOUNT2','2017-01-18'),
('JOSH','ACCOUNT2','2017-04-08'), -- changed this date to have deterministic results
('JAMES','ACCOUNT2','2017-04-09'),
('DAVE','ACCOUNT2','2017-05-06'),
('PHIL','ACCOUNT2','2017-05-07') ;

-- make life easy
create index gotta_go_fast ON #histories (account, assigned);

這是有史以​​來最懶惰的日期維度表:

create table #date_dim_months_only (
   month_date date,
   primary key (month_date)
);

-- put 2500 month ends into table
INSERT INTO #date_dim_months_only WITH (TABLOCK)
SELECT DATEADD(DAY, -1, DATEADD(MONTH, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), '20000101'))
FROM master..spt_values;

對於第 1 步,有很多方法可以模擬LEAD. 這是一種方法:

SELECT 
 h1.username
, h1.account
, h1.assigned
, next_date.assigned
FROM #histories h1
OUTER APPLY (
   SELECT TOP 1 h2.assigned
   FROM #histories h2
   WHERE h1.account = h2.account
   AND h1.assigned < h2.assigned
   ORDER BY h2.assigned ASC
) next_date;

對於第 2 步,我們需要將 NULL 值更改為其他值。您希望包括每個帳戶的最後一個月,因此在開始日期上添加一個月就足夠了:

ISNULL(next_date.assigned, DATEADD(MONTH, 1, h1.assigned))

對於第 3 步,我們可以加入日期維度表。維度表中的列正是結果集所需的列:

INNER JOIN #date_dim_months_only dd ON
   dd.month_date >= h1.assigned AND
   dd.month_date < ISNULL(next_date.assigned, DATEADD(MONTH, 1, h1.assigned))

當我把它們放在一起時,我不喜歡我得到的查詢。OUTER APPLY組合和時,連接順序可能會出現問題INNER JOIN。為了得到我想要的連接順序,我用一個子查詢重寫了它:

SELECT 
 hist.username
, hist.account
, dd.month_date 
FROM
(
   SELECT 
     h1.username
   , h1.account
   , h1.assigned
   , ISNULL(
       (SELECT TOP 1 h2.assigned
           FROM #histories h2
           WHERE h1.account = h2.account
           AND h1.assigned < h2.assigned
           ORDER BY h2.assigned ASC
       )
       , DATEADD(MONTH, 1, h1.assigned)
   ) next_assigned
   FROM #histories h1
) hist
INNER JOIN #date_dim_months_only dd ON
   dd.month_date >= hist.assigned AND
   dd.month_date < hist.next_assigned;

我不知道你有多少數據,所以對你來說可能無關緊要。但是該計劃看起來像我想要的那樣:

好計劃

結果與您的相符:

╔══════════╦══════════╦════════════╗
║ username ║ account  ║ month_date ║
╠══════════╬══════════╬════════════╣
║ PETER    ║ ACCOUNT1 ║ 2017-01-31 ║
║ PETER    ║ ACCOUNT1 ║ 2017-02-28 ║
║ DAVE     ║ ACCOUNT1 ║ 2017-03-31 ║
║ DAVE     ║ ACCOUNT1 ║ 2017-04-30 ║
║ FRED     ║ ACCOUNT1 ║ 2017-05-31 ║
║ FRED     ║ ACCOUNT1 ║ 2017-06-30 ║
║ FRED     ║ ACCOUNT1 ║ 2017-07-31 ║
║ JAMES    ║ ACCOUNT1 ║ 2017-08-31 ║
║ PHIL     ║ ACCOUNT2 ║ 2017-01-31 ║
║ PHIL     ║ ACCOUNT2 ║ 2017-02-28 ║
║ PHIL     ║ ACCOUNT2 ║ 2017-03-31 ║
║ JAMES    ║ ACCOUNT2 ║ 2017-04-30 ║
║ PHIL     ║ ACCOUNT2 ║ 2017-05-31 ║
╚══════════╩══════════╩════════════╝

引用自:https://dba.stackexchange.com/questions/183087