PostgreSQL
Ordering by a field built from a subquery is very slow
My tables (oversimplified for clarity):

    documents:      |id|
    registrations:  |id|document_id|date|
My simplified query:

    SELECT *
    FROM (SELECT documents.*,
                 (SELECT max(date)
                  FROM registrations
                  WHERE registrations.document_id = documents.id) AS register_date
          FROM documents) AS my_documents_view -- it's supposed to be a view
    ORDER BY register_date desc NULLS LAST
    LIMIT 20;
When I try to order by the register_date field, I get a painfully slow response that takes about 80 seconds to execute.
EXPLAIN ANALYSE:

    Limit  (cost=27237727.87..27237727.92 rows=20 width=192) (actual time=85124.599..85124.613 rows=20 loops=1)
      ->  Sort  (cost=27237727.87..27265594.16 rows=11146516 width=192) (actual time=85124.597..85124.600 rows=20 loops=1)
            Sort Key: ((SubPlan 2)) DESC NULLS LAST
            Sort Method: top-N heapsort  Memory: 33kB
            ->  Seq Scan on documents  (cost=0.00..26941123.09 rows=11146516 width=192) (actual time=0.074..77874.947 rows=11153930 loops=1)
                  SubPlan 2
                    ->  Result  (cost=2.19..2.29 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=11153930)
                          InitPlan 1 (returns $1)
                            ->  Limit  (cost=0.43..2.19 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=11153930)
                                  ->  Index Only Scan Backward using registrations_document_id_date_idx on registrations  (cost=0.43..3.95 rows=2 width=4) (actual time=0.004..0.004 rows=1 loops=11153930)
                                        Index Cond: ((document_id = documents.id) AND (date IS NOT NULL))
                                        Heap Fetches: 10337268
    Planning Time: 0.381 ms
    Execution Time: 85124.722 ms
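The complexity and cost are absurd. Both tables have a lot of rows (millions, in fact), but is it really that hard for the engine to order them? Is there any workaround or advice for optimizing this?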
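In the full query I have some additional filters, so it runs a bit faster, but it is still unacceptable for the project.
I tried using joins and indexes, without any success.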
My attempt to flatten the subquery into a join:

    SELECT *
    FROM (SELECT d.*, max(r.date) AS register_date
          FROM documents AS d
          LEFT JOIN registrations AS r ON r.document_id = d.id
          GROUP BY d.id) AS my_documents_view
    ORDER BY register_date desc NULLS LAST
    LIMIT 20;
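For comparison, the plan above shows the correlated subquery being executed once per document (loops=11153930), so one direction that may be worth testing is to aggregate registrations first, keep only the top 20 document ids, and join to documents afterwards. This is an untested sketch, not a verified fix. It assumes that every registrations.document_id refers to an existing document and that at least 20 documents have a non-NULL registration date; unlike the original query, it never returns documents without registrations. The alias top_regs is made up for illustration.

```sql
-- Sketch only: find the 20 most recently registered document ids first,
-- then fetch the matching document rows.
-- Assumption: documents with no registrations are not needed in the result.
SELECT d.*, top_regs.register_date
FROM (
    SELECT r.document_id, max(r.date) AS register_date
    FROM registrations AS r
    WHERE r.date IS NOT NULL          -- so max() is never NULL here
    GROUP BY r.document_id
    ORDER BY register_date DESC
    LIMIT 20
) AS top_regs                         -- hypothetical alias
JOIN documents AS d ON d.id = top_regs.document_id
ORDER BY top_regs.register_date DESC;
```

This still has to aggregate the whole registrations table, so it may or may not be fast enough; comparing its EXPLAIN ANALYSE output with the plan above would show whether the per-document index probes were the dominant cost.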