Postgresql
ORDER BY 子句會影響查詢性能
語境:
PostgreSQL 10,users 表有 3667438 條記錄,users 表有一個叫做 social 的 JSONB,我們通常使用索引計算函式輸出的策略,因此我們可以將資訊聚合到單個索引中。函式的輸出
engagement(social)
是雙精度數值類型。問題:
有問題的子句是,該數據還附加了
ORDER BY engagement(social) DESC NULLS LAST
一個 btree 索引。idx_in_social_engagement with DESC NULLS LAST
快速查詢:
EXPLAIN ANALYZE SELECT "users".* FROM "users" WHERE (follower_count(social) < 500000) AND (engagement(social) > 0.03) AND (engagement(social) < 0.25) AND (peemv(social) < 533) ORDER BY "users"."created_at" ASC LIMIT 12 OFFSET 0; Limit (cost=0.43..52.25 rows=12 width=1333) (actual time=0.113..1.625 rows=12 loops=1) -> Index Scan using created_at_idx on users (cost=0.43..7027711.55 rows=1627352 width=1333) (actual time=0.112..1.623 rows=12 loops=1) Filter: ((follower_count(social) < 500000) AND (engagement(social) > '0.03'::double precision) AND (engagement(social) < '0.25'::double precision) AND (peemv(social) > '0'::double precision) AND (peemv(social) < '533'::double precision)) Rows Removed by Filter: 8 Planning time: 0.324 ms Execution time: 1.639 ms
慢查詢:
EXPLAIN ANALYZE SELECT "users".* FROM "users" WHERE (follower_count(social) < 500000) AND (engagement(social) > 0.03) AND (engagement(social) < 0.25) AND (peemv(social) > 0.0) AND (peemv(social) < 533) ORDER BY engagement(social) DESC NULLS LAST, "users"."created_at" ASC LIMIT 12 OFFSET 0; Limit (cost=2884438.00..2884438.03 rows=12 width=1341) (actual time=68011.728..68011.730 rows=12 loops=1) -> Sort (cost=2884438.00..2888506.38 rows=1627352 width=1341) (actual time=68011.727..68011.728 rows=12 loops=1) Sort Key: (engagement(social)) DESC NULLS LAST, created_at Sort Method: top-N heapsort Memory: 45kB -> Index Scan using idx_in_social_engagement on users (cost=0.43..2847131.26 rows=1627352 width=1341) (actual time=0.082..67019.102 rows=1360633 loops=1) Index Cond: ((engagement(social) > '0.03'::double precision) AND (engagement(social) < '0.25'::double precision)) Filter: ((follower_count(social) < 500000) AND (peemv(social) > '0'::double precision) AND (peemv(social) < '533'::double precision)) Rows Removed by Filter: 85580 Planning time: 0.312 ms Execution time: 68011.752 ms
選擇帶有 * 因為我需要儲存在每一行中的所有數據。
更新:
CREATE INDEX idx_in_social_engagement on influencers USING BTREE ( engagement(social) DESC NULLS LAST)
準確的索引定義
你的
ORDER BY
條款是:engagement(social) DESC NULLS LAST, "users"."created_at" ASC
但我懷疑你的索引只是在:
engagement(social) DESC NULLS LAST
所以該指數並不能完全支撐
ORDER BY
。您可以在不使用
JSONB
或表達式索引的情況下重現相同的問題。您可以通過在ORDER BY
.如果 PostgreSQL 規劃器非常聰明,它可能能夠有效地使用現有索引。它必須繼續前進,
engagement(social) DESC NULLS LAST
直到它收集到 12 個滿足所有其餘過濾器要求的元組。然後它將繼續前進該索引,直到它收集engagement(social)
與第 12 個元組相關的所有其餘元組(並且滿足其他標準)。然後它必須重新排序所有收集到的元組ORDER BY
,並將 應用於該LIMIT 12
擴展和重新排序的集合。但是 PostgreSQL 規劃器並不是無限明智的。