PostgreSQL

How to optimize selecting pairs from one column of a table (self-join)?

  • April 9, 2020

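I am using PostgreSQL 9.5.19 and DBeaver 6.3.4.

I have a table in which each row holds a user name, a place that user visited, and the time he was there.

I need to select all pairs of places that any user has been to (if a user was at place a and place b, I need a row like this: user, place a, place b, time at place a, time at place b).

The table: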

CREATE TABLE example.example (
   tm timestamp NOT NULL,
   place_name varchar NOT NULL,
   user_name varchar NOT NULL
);

Some sample data:

INSERT INTO example.example (tm, place_name, user_name)
values
('2020-02-25 00:00:19.000', 'place_1', 'user_1'),
('2020-03-25 00:00:19.000', 'place_2', 'user_1'),
('2020-02-25 00:00:19.000', 'place_1', 'user_2'),
('2020-03-25 00:00:19.000', 'place_1', 'user_3'),
('2020-02-25 00:00:19.000', 'place_2', 'user_3');

I am trying this script:

select 
  t.user_name    
 ,t.place_name as r1_place
 ,max(t.tm) as r1_tm
 ,t2.place_name as r2_place
 ,min(t2.tm) as r2_tm
from example.example as t
join example.example as t2 on t.user_name = t2.user_name 
                      and t.tm < t2.tm 
                      and t.place_name <> t2.place_name
where t.tm between '2020-02-25 00:00:00' and '2020-03-25 15:00:00' 
 and t2.tm between '2020-02-25 00:00:00' and '2020-03-25 15:00:00'
   group by t.user_name
      , t.place_name
      , t2.place_name

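It seems to give me the correct result, but it runs really slowly. Can I optimize it somehow?

PostgreSQL 9.5.19 has window functions, which can prove useful in this case. The lead() function gives you access to the next row in the "partition".

You could try something like this: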

SELECT
 user_name,
 place_name AS r1_place,
 tm AS r1_tm,
 lead(place_name) OVER (PARTITION BY user_name ORDER BY tm) AS r2_place,
 lead(tm) OVER (PARTITION BY user_name ORDER BY tm) AS r2_tm
FROM example.example
ORDER BY 1, 3;

Result:

user_name|r1_place|r1_tm              |r2_place|r2_tm              |
---------|--------|-------------------|--------|-------------------|
user_1   |place_1 |2020-02-25 00:00:19|place_2 |2020-03-25 00:00:19|
user_1   |place_2 |2020-03-25 00:00:19|        |                   |
user_2   |place_1 |2020-02-25 00:00:19|        |                   |
user_3   |place_2 |2020-02-25 00:00:19|place_1 |2020-03-25 00:00:19|
user_3   |place_1 |2020-03-25 00:00:19|        |                   |

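I am not sure about the performance, though; you should run some tests. Note that lead() pairs each row only with the same user's next row by time, so this returns consecutive visits rather than every earlier/later combination that the self-join produces.

Of course, you can filter out the empty results: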
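One way to run such a test is sketched below. It is not part of the original answer: the index name example_user_tm_idx is hypothetical, and the choice of (user_name, tm) as index columns is an assumption based on the PARTITION BY and ORDER BY of the window. Whether the planner actually uses the index depends on the data, so compare the plan and timings with and without it.

```sql
-- Hypothetical index (name and column choice are assumptions):
-- it matches PARTITION BY user_name ORDER BY tm, which may let
-- the planner skip an explicit sort step.
CREATE INDEX example_user_tm_idx
    ON example.example (user_name, tm);

-- Refresh planner statistics after loading data or creating the index.
ANALYZE example.example;

-- EXPLAIN ANALYZE executes the query and reports the actual plan,
-- row counts and timings; BUFFERS adds block read/hit statistics.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
  user_name,
  place_name AS r1_place,
  tm AS r1_tm,
  lead(place_name) OVER (PARTITION BY user_name ORDER BY tm) AS r2_place,
  lead(tm) OVER (PARTITION BY user_name ORDER BY tm) AS r2_tm
FROM example.example;
```

Running the same EXPLAIN (ANALYZE, BUFFERS) on the original self-join gives a like-for-like comparison of execution time.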

SELECT * FROM (
 SELECT
   user_name,
   place_name AS r1_place,
   tm AS r1_tm,
   lead(place_name) OVER (PARTITION BY user_name ORDER BY tm) AS r2_place,
   lead(tm) OVER (PARTITION BY user_name ORDER BY tm) AS r2_tm
 FROM example.example) req
WHERE r2_place IS NOT NULL
ORDER BY 1, 3;
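The question's query also restricts the time range and requires the two places to differ. A sketch of how those conditions might be combined with lead() follows; it assumes that only consecutive visits are wanted, which is narrower than the all-pairs self-join, and the window name w is just a local alias.

```sql
SELECT user_name, r1_place, r1_tm, r2_place, r2_tm
FROM (
  SELECT
    user_name,
    place_name AS r1_place,
    tm AS r1_tm,
    lead(place_name) OVER w AS r2_place,
    lead(tm) OVER w AS r2_tm
  FROM example.example
  -- Same time range as in the question; filtering before the window
  -- is evaluated means lead() only sees rows inside the range.
  WHERE tm BETWEEN '2020-02-25 00:00:00' AND '2020-03-25 15:00:00'
  -- Named window shared by both lead() calls.
  WINDOW w AS (PARTITION BY user_name ORDER BY tm)
) req
WHERE r2_place IS NOT NULL      -- drop each user's last visit
  AND r2_place <> r1_place      -- keep only moves to a different place
ORDER BY user_name, r1_tm;
```

On the sample data above this returns two rows: user_1 going from place_1 to place_2, and user_3 going from place_2 to place_1.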

Quoted from: https://dba.stackexchange.com/questions/262728