Index
Postgresql 9.5 BRIN 索引大大慢於預期
我有一個包含超過 1200 萬行日誌數據的表,並已遷移到 Postgres 9.5 以利用新的 BRIN 索引,因為我有磁碟空間限制。考慮到我的日誌行的自然排序,我認為我的情況是為 BRIN 索引排序的。
但是,我是從結果開始的。BRIN 比 btree 慢一個數量級以上。
原始 Btree 索引:
EXPLAIN ANALYZE SELECT COUNT(*) from logline where date BETWEEN '2016-01-15' and '2016-01-31'; Aggregate (cost=153488.38..153488.39 rows=1 width=0) (actual time=7672.508..7672.509 rows=1 loops=1) -> Index Only Scan using logline_date on logline (cost=0.43..145945.76 rows=3017046 width=0) (actual time=18.548..4084.455 rows=2977593 loops=1) Index Cond: ((date >= '2016-01-15 00:00:00-05'::timestamp with time zone) AND (date <= '2016-01-31 00:00:00-05'::timestamp with time zone)) Heap Fetches: 5809 Planning time: 0.293 ms Execution time: 7672.562 ms (6 rows) DROP index logline_date CREATE index logline_date_brin on logline using BRIN(date) EXPLAIN ANALYZE SELECT COUNT(*) from logline where date BETWEEN '2016-01-15' and '2016-01-31'; Aggregate (cost=1230518.30..1230518.31 rows=1 width=0) (actual time=105789.131..105789.133 rows=1 loops=1) -> Bitmap Heap Scan on logline (cost=31543.27..1222862.87 rows=3062173 width=0) (actual time=103.876..100675.372 rows=2977593 loops=1) Recheck Cond: ((date >= '2016-01-15 00:00:00-05'::timestamp with time zone) AND (date <= '2016-01-31 00:00:00-05'::timestamp with time zone)) Rows Removed by Index Recheck: 2899899 Heap Blocks: lossy=696832 -> Bitmap Index Scan on logline_date_brin (cost=0.00..30777.73 rows=3062173 width=0) (actual time=103.079..103.079 rows=6968320 loops=1) Index Cond: ((date >= '2016-01-15 00:00:00-05'::timestamp with time zone) AND (date <= '2016-01-31 00:00:00-05'::timestamp with time zone)) Planning time: 0.377 ms Execution time: 105805.567 ms (9 rows)
BRIN 索引比 Btree 小 600 倍以上,但我沒想到執行時間會慢這麼多。
這是否意味著 BRIN 不適合我,還是我做錯了什麼?
我猜在遷移時您沒有導入按日期排序的行。您可以通過發出來檢查這一點
select * from logline;
並檢查日期是否看起來單調增加。如果不是這種情況,您可以嘗試對錶格進行排序,例如:
select * into logline2 from logline order by date asc;
…在第二個表上創建索引…
CREATE index logline2_date_brin on logline2 using BRIN(date)
…並在第二張桌子上嘗試“運氣”:
EXPLAIN ANALYZE SELECT COUNT(*) from logline2 where date BETWEEN '2016-01-15' and '2016-01-31';
如果時間大大改善,那就完美了。
如果您有磁碟空間限制,您還應該檢查副檔名cstore_fdw。它非常適合分析,並且可以以壓縮形式儲存數據。它具有類似於 BRIN 的索引功能,但有一些限制:您只能附加數據並且不支持事務。