PostgreSQL

Select records that have at least one linked row in other tables

  • December 3, 2019

I want to select all users who have a post, a photo, or a video. I came up with the following query.

select
 users.username
from
 users
where
 exists (select * from posts where users.id = posts.user_id)
 or exists (select * from photos where users.id = photos.user_id)
 or exists (select * from videos where users.id = videos.user_id)

That works, but I also need to know, for each user, whether they have a record in each of those tables, so I am using:

select
 users.username,
 EXISTS (select * from posts where users.id = posts.user_id) as has_post,
 EXISTS (select * from photos where users.id = photos.user_id) as has_photo,
 EXISTS (select * from videos where users.id = videos.user_id) as has_video
from
 users
where
 exists (select * from posts where users.id = posts.user_id)
 or exists (select * from photos where users.id = photos.user_id)
 or exists (select * from videos where users.id = videos.user_id)

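The query above works, but it is slow. How can I optimize it? I am also open to other approaches.

All id columns are primary keys, all user_id columns are foreign keys, and users.username is a unique column.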


Update

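The execution plans below were run in a development environment with far less data but the same database schema (production also feels slow to me, but I don't have access to it). The first plan is for the first query.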
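For reference, plan output with actual timings and "Buffers" lines like this is what EXPLAIN prints when the ANALYZE and BUFFERS options are enabled. A minimal sketch for the first query, assuming those were the options used (the post does not state the exact command):

```sql
-- ANALYZE actually executes the query and reports real timings and row counts;
-- BUFFERS adds the shared-buffer hit/read counters to each plan node.
EXPLAIN (ANALYZE, BUFFERS)
select
 users.username
from
 users
where
 exists (select * from posts where users.id = posts.user_id)
 or exists (select * from photos where users.id = photos.user_id)
 or exists (select * from videos where users.id = videos.user_id);
```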

Seq Scan on users  (cost=0.00..79803.74 rows=3883 width=18) (actual time=0.799..7.069 rows=297 loops=1)
 Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (alternatives: SubPlan 3 or hashed SubPlan 4) OR (alternatives: SubPlan 5 or hashed SubPlan 6))
 Rows Removed by Filter: 4141
 Buffers: shared hit=146
 SubPlan 1
   ->  Seq Scan on posts  (cost=0.00..17.11 rows=2 width=0) (never executed)
         Filter: (users.id = user_id)
 SubPlan 2
   ->  Seq Scan on posts posts_1  (cost=0.00..16.29 rows=329 width=4) (actual time=0.007..0.120 rows=329 loops=1)
         Buffers: shared hit=13
 SubPlan 3
   ->  Seq Scan on photos  (cost=0.00..5.49 rows=1 width=0) (never executed)
         Filter: (users.id = user_id)
 SubPlan 4
   ->  Seq Scan on photos photos_1  (cost=0.00..5.19 rows=119 width=4) (actual time=0.008..0.037 rows=119 loops=1)
         Buffers: shared hit=4
 SubPlan 5
   ->  Seq Scan on videos  (cost=0.00..7.80 rows=2 width=0) (never executed)
         Filter: (users.id = user_id)
 SubPlan 6
   ->  Seq Scan on videos videos_1  (cost=0.00..7.04 rows=304 width=4) (actual time=0.009..0.066 rows=304 loops=1)
         Buffers: shared hit=4
Planning Time: 1.229 ms
Execution Time: 7.296 ms

Second query

Seq Scan on users  (cost=0.00..149479.32 rows=3883 width=21) (actual time=354.809..368.271 rows=297 loops=1)
 Filter: ((alternatives: SubPlan 7 or hashed SubPlan 8) OR (alternatives: SubPlan 9 or hashed SubPlan 10) OR (alternatives: SubPlan 11 or hashed SubPlan 12))
 Rows Removed by Filter: 4141
 Buffers: shared hit=167
 SubPlan 1
   ->  Seq Scan on posts  (cost=0.00..17.11 rows=2 width=0) (never executed)
         Filter: (users.id = user_id)
 SubPlan 2
   ->  Seq Scan on posts posts_1  (cost=0.00..16.29 rows=329 width=4) (actual time=16.649..16.776 rows=329 loops=1)
         Buffers: shared hit=13
 SubPlan 3
   ->  Seq Scan on photos  (cost=0.00..5.49 rows=1 width=0) (never executed)
         Filter: (users.id = user_id)
 SubPlan 4
   ->  Seq Scan on photos photos_1  (cost=0.00..5.19 rows=119 width=4) (actual time=12.576..12.634 rows=119 loops=1)
         Buffers: shared hit=4
 SubPlan 5
   ->  Seq Scan on videos  (cost=0.00..7.80 rows=2 width=0) (never executed)
         Filter: (users.id = user_id)
 SubPlan 6
   ->  Seq Scan on videos videos_1  (cost=0.00..7.04 rows=304 width=4) (actual time=12.815..13.606 rows=304 loops=1)
         Buffers: shared hit=4
 SubPlan 7
   ->  Seq Scan on posts posts_2  (cost=0.00..17.11 rows=2 width=0) (never executed)
         Filter: (users.id = user_id)
 SubPlan 8
   ->  Seq Scan on posts posts_3  (cost=0.00..16.29 rows=329 width=4) (actual time=15.300..15.822 rows=329 loops=1)
         Buffers: shared hit=13
 SubPlan 9
   ->  Seq Scan on photos photos_2  (cost=0.00..5.49 rows=1 width=0) (never executed)
         Filter: (users.id = user_id)
 SubPlan 10
   ->  Seq Scan on photos photos_3  (cost=0.00..5.19 rows=119 width=4) (actual time=14.130..14.184 rows=119 loops=1)
         Buffers: shared hit=4
 SubPlan 11
   ->  Seq Scan on videos videos_2  (cost=0.00..7.80 rows=2 width=0) (never executed)
         Filter: (users.id = user_id)
 SubPlan 12
   ->  Seq Scan on videos videos_3  (cost=0.00..7.04 rows=304 width=4) (actual time=18.024..18.103 rows=304 loops=1)
         Buffers: shared hit=4
Planning Time: 1.567 ms
JIT:
 Functions: 106
 Options: Inlining false, Optimization false, Expressions true, Deforming true
 Timing: Generation 49.689 ms, Inlining 0.000 ms, Optimization 15.154 ms, Emission 312.171 ms, Total 377.014 ms
Execution Time: 471.659 ms

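The second query is triggering JIT (just-in-time compilation), which is very counterproductive for you. According to the plan, JIT accounted for 377 ms in total, 312 ms of it in the emission step alone, out of a 472 ms execution time. Unless you know that other plans of yours benefit from JIT, I would turn JIT off globally. I think it was a mistake to turn JIT on by default in version 12, as it seems to hurt at least as many cases as it helps. If you don't want to turn it off globally, there are some other JIT parameters you can adjust.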
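The options mentioned above can be sketched as follows. This is only an illustration: jit and jit_above_cost are standard PostgreSQL settings, but the threshold value shown is an arbitrary assumption and should be chosen from your own workload. Note that the default jit_above_cost is 100000, which is likely why the second plan (estimated cost 149479.32) was JIT-compiled while the first (79803.74) was not.

```sql
-- Option 1: turn JIT off for the current session only,
-- then re-run the query to compare timings.
SET jit = off;

-- Option 2: turn JIT off for the whole cluster (requires superuser).
-- The setting takes effect after the configuration is reloaded.
ALTER SYSTEM SET jit = off;
SELECT pg_reload_conf();

-- Option 3: keep JIT enabled but raise the estimated-cost threshold
-- above which it is used. 500000 is an illustrative value, not a recommendation.
SET jit_above_cost = 500000;
```

The session-level SET is the lowest-risk way to test this, since it affects nothing outside the current connection.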

Also, I don't think the plans you posted match the queries you posted. Did you simplify the queries before posting?

Source: https://dba.stackexchange.com/questions/254568