Postgresql

What is a "partitioned outer join"?

  • January 18, 2019

This just came up in a question on Reddit, and I would like to know:

  • What is a PARTITIONED OUTER JOIN in Oracle? (definition)
  • What does a simple example look like? (usage)
  • How would you write it in PostgreSQL, or in standard SQL, which lack PARTITIONED OUTER JOIN? (equivalent)

{1} Partitioned outer join: definition

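… "Such a join extends the conventional outer join syntax by applying the outer join to each logical partition defined in a query. Oracle logically partitions the rows in your query based on the expression you specify in the PARTITION BY clause. The result of a partitioned outer join is a UNION of the outer joins of each of the partitions in the logically partitioned table with the table on the other side of the join." (documentation)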

{2} Simple example

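"Data is normally stored in sparse form. That is, if no value exists for a given combination of dimension values, no row exists in the fact table. However, you may want to view the data in dense form, with rows for all combinations of dimension values displayed even when no fact data exists for them."

… "For example, if a product did not sell during a particular time period, you may still want to see the product for that time period with zero sales value next to it." (quotes from the documentation)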

Test tables and data (INSERTs)

-- Oracle 12c
create table sales (
 date_ date
, location_ varchar2( 16 )
, qty_ number
);

create table locations (
 name varchar2( 16 )
);

-- dates for locations are "gappy": 
-- none of the locations has entries for all 3 dates
-- ( date range: 2019-01-15 - 2019-01-17 )
insert into sales ( date_, location_, qty_ ) 
 values ( date '2019-01-17', 'London', 11 ) ;
insert into sales ( date_, location_, qty_ ) 
 values ( date '2019-01-15', 'London', 10 ) ;
insert into sales ( date_, location_, qty_ ) 
 values ( date '2019-01-16', 'Paris', 20 ) ;
insert into sales ( date_, location_, qty_ ) 
 values ( date '2019-01-17', 'Boston', 31 ) ;
insert into sales ( date_, location_, qty_ ) 
 values ( date '2019-01-16', 'Boston', 30 ) ;

-- locations
insert into locations ( name ) values ( 'London' );
insert into locations ( name ) values ( 'Paris' );
insert into locations ( name ) values ( 'Boston' );

Desired output

date_       location_  qty_
2019-01-15  London     10
2019-01-15  Paris       0    -- not INSERTed!
2019-01-15  Boston      0    -- not INSERTed!
2019-01-16  London      0    -- not INSERTed!
2019-01-16  Paris      20
2019-01-16  Boston     30
2019-01-17  London     11
2019-01-17  Paris       0    -- not INSERTed!
2019-01-17  Boston     31

Query (partitioned outer join)

select S.date_, S.qty_, L.name
from sales S partition by ( date_ ) 
 right join locations L on S.location_ = L.name
;

-- result
DATE_      QTY_  NAME    
15-JAN-19  NULL  Boston  
15-JAN-19  10    London  
15-JAN-19  NULL  Paris   
16-JAN-19  30    Boston  
16-JAN-19  NULL  London  
16-JAN-19  20    Paris   
17-JAN-19  31    Boston  
17-JAN-19  11    London  
17-JAN-19  NULL  Paris 
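
The definition in {1} describes the result as a UNION of outer joins, one per partition. As a rough illustration only (not how Oracle evaluates the query), the same rows can be spelled out by hand for the three dates in the test data. The date literals are hard-coded from the INSERTs above, which is exactly what PARTITION BY saves you from doing.

```sql
-- Hand-written version of "one outer join per partition, then UNION".
-- Assumption: the three partitions are the three dates present in sales.
-- Each branch restricts sales to one date and right-joins it to locations,
-- so every location appears once per date, with NULL qty_ where nothing was sold.
select date '2019-01-15' as date_, S.qty_, L.name
from ( select * from sales where date_ = date '2019-01-15' ) S
 right join locations L on S.location_ = L.name
union all
select date '2019-01-16', S.qty_, L.name
from ( select * from sales where date_ = date '2019-01-16' ) S
 right join locations L on S.location_ = L.name
union all
select date '2019-01-17', S.qty_, L.name
from ( select * from sales where date_ = date '2019-01-17' ) S
 right join locations L on S.location_ = L.name
order by 1, 3   -- date_, then name
;
```

With the test data this should give the same nine rows as the partitioned outer join. Note that the date column comes from the partition, not from the (possibly missing) sales row.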

Query (version 2, same join)

-- same as above, using NVL(), column aliases, and ORDER BY ...
select S.date_, nvl( S.qty_, 0 ) as sold, L.name as location
from sales S partition by ( date_ ) 
 right join locations L on S.location_ = L.name
order by S.date_, L.name
;

DATE_           SOLD LOCATION        
--------- ---------- ----------------
15-JAN-19          0 Boston          
15-JAN-19         10 London          
15-JAN-19          0 Paris           
16-JAN-19         30 Boston          
16-JAN-19          0 London          
16-JAN-19         20 Paris           
17-JAN-19         31 Boston          
17-JAN-19         11 London          
17-JAN-19          0 Paris           

Dbfiddle here (Oracle 18c)

{3} Equivalent

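The following query does roughly the same work as the PARTITION BY ( date_ ) outer join. It combines a CROSS JOIN (the inner SELECT), which builds every date/location combination, with a LEFT OUTER JOIN that attaches the sales quantities where they exist. (The NULL-to-0 conversion is omitted.)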

Oracle

select SL.*, S.qty_
from
(
 select *
 from (
   select unique date_ from sales
 ) , (
   select unique name from locations
 )
) SL left join (
 select date_, location_, qty_ from sales
) S on SL.name = S.location_ and SL.date_ = S.date_ 
order by SL.date_, SL.name
;

DATE_      NAME    QTY_  
15-JAN-19  Boston  NULL  
15-JAN-19  London  10    
15-JAN-19  Paris   NULL  
16-JAN-19  Boston  30    
16-JAN-19  London  NULL  
16-JAN-19  Paris   20    
17-JAN-19  Boston  31    
17-JAN-19  London  11    
17-JAN-19  Paris   NULL 
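
The query above leaves the missing quantities as NULL. A minimal sketch that also converts NULL to 0, mirroring "version 2" of the partitioned query, might look like this. It uses COALESCE rather than NVL and explicit CROSS JOIN syntax so that the same text should run on both Oracle and PostgreSQL; the aliases D, location and sold are illustrative choices.

```sql
-- SL: every date_ found in sales, combined with every location
-- S : the actual sales rows, attached where they exist
select SL.date_, SL.name as location, coalesce( S.qty_, 0 ) as sold
from (
 select D.date_, L.name
 from ( select distinct date_ from sales ) D
  cross join locations L
) SL
 left join sales S
   on S.location_ = SL.name and S.date_ = SL.date_
order by SL.date_, SL.name
;
```

The inner DISTINCT on locations is dropped here on the assumption that each location name is stored only once, as in the test data.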

PostgreSQL 10 ( dbfiddle )

-- DDL and INSERTs
create table sales (
 date_ date
, location_ varchar( 16 )
, qty_ numeric
);

create table locations (
 name varchar( 16 )
);

insert into sales ( date_, location_, qty_ ) values 
 ( '2019-01-17', 'London', 11 )
, ( '2019-01-15', 'London', 10 )
, ( '2019-01-16', 'Paris', 20 )
, ( '2019-01-17', 'Boston', 31 )
, ( '2019-01-16', 'Boston', 30 ) ;

insert into locations ( name ) values 
( 'London' ), ( 'Paris' ), ( 'Boston' );

Query (Postgres)

-- SL: all date_ <-> location combinations
-- S: all location_ and qty_ values of table sales
select SL.*, S.qty_
from
(
 select *
 from (
   select distinct date_ from sales
 ) S_ cross join (
   select distinct name from locations
 ) L_
) SL left join (
 select date_, location_, qty_ from sales
) S on SL.name = S.location_ and SL.date_ = S.date_ 
order by SL.date_, SL.name
;

date_        name    qty_
2019-01-15   Boston  
2019-01-15   London  10
2019-01-15   Paris    
2019-01-16   Boston  30
2019-01-16   London    
2019-01-16   Paris   20
2019-01-17   Boston  31
2019-01-17   London  11
2019-01-17   Paris    
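
One caveat applies to both the partitioned join and the equivalent above: the dates are taken from sales itself, so a day on which no location sold anything will not appear at all. If that matters, one possible PostgreSQL-only sketch is to generate the calendar explicitly with generate_series(). The date range below is an assumption taken from the comment in the test data (2019-01-15 to 2019-01-17), and the alias day_ is illustrative.

```sql
-- PostgreSQL only: generate_series() builds one row per day in the range,
-- independent of which dates happen to exist in sales.
select D.day_::date as date_, L.name, coalesce( S.qty_, 0 ) as qty_
from generate_series(
       timestamp '2019-01-15'
     , timestamp '2019-01-17'
     , interval '1 day'
     ) as D( day_ )
 cross join locations L
 left join sales S
   on S.date_ = D.day_::date and S.location_ = L.name
order by 1, 2   -- date_, then name
;
```

For the test data this should return the same nine date/location combinations, with 0 in place of the empty quantities.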

Source: https://dba.stackexchange.com/questions/227069