Index

Postgresql 9.5 BRIN 索引大大慢於預期

  • February 29, 2016

我有一個包含超過 1200 萬行日誌數據的表,並已遷移到 Postgres 9.5 以利用新的 BRIN 索引,因為我有磁碟空間限制。考慮到我的日誌行的自然排序,我認為我的情況是為 BRIN 索引排序的。

但是,我是從結果開始的。BRIN 比 btree 慢一個數量級以上。

原始 Btree 索引:

EXPLAIN ANALYZE SELECT COUNT(*) from logline where date BETWEEN '2016-01-15' and '2016-01-31';

Aggregate  (cost=153488.38..153488.39 rows=1 width=0) (actual time=7672.508..7672.509 rows=1 loops=1)
  ->  Index Only Scan using logline_date on logline  (cost=0.43..145945.76 rows=3017046 width=0) (actual time=18.548..4084.455 
rows=2977593 loops=1)
        Index Cond: ((date >= '2016-01-15 00:00:00-05'::timestamp with time zone) AND (date <= '2016-01-31 00:00:00-05'::timestamp with time zone))
        Heap Fetches: 5809
Planning time: 0.293 ms
Execution time: 7672.562 ms
(6 rows)


DROP index logline_date
CREATE index logline_date_brin on logline using BRIN(date)

EXPLAIN ANALYZE SELECT COUNT(*) from logline where date BETWEEN '2016-01-15' and '2016-01-31';

Aggregate  (cost=1230518.30..1230518.31 rows=1 width=0) (actual time=105789.131..105789.133 rows=1 loops=1)
  ->  Bitmap Heap Scan on logline  (cost=31543.27..1222862.87 rows=3062173 width=0) (actual time=103.876..100675.372 rows=2977593 loops=1)
        Recheck Cond: ((date >= '2016-01-15 00:00:00-05'::timestamp with time zone) AND (date <= '2016-01-31 00:00:00-05'::timestamp with time zone))
        Rows Removed by Index Recheck: 2899899
        Heap Blocks: lossy=696832
        ->  Bitmap Index Scan on logline_date_brin  (cost=0.00..30777.73 rows=3062173 width=0) (actual time=103.079..103.079 rows=6968320 loops=1)
              Index Cond: ((date >= '2016-01-15 00:00:00-05'::timestamp with time zone) AND (date <= '2016-01-31 00:00:00-05'::timestamp with time zone))
Planning time: 0.377 ms
Execution time: 105805.567 ms
(9 rows)

BRIN 索引比 Btree 小 600 倍以上,但我沒想到執行時間會慢這麼多。

這是否意味著 BRIN 不適合我,還是我做錯了什麼?

我猜在遷移時您沒有導入按日期排序的行。您可以通過發出來檢查這一點

select * from logline; 

並檢查日期是否看起來單調增加。如果不是這種情況,您可以嘗試對錶格進行排序,例如:

select * into logline2 from logline order by date asc;

…在第二個表上創建索引…

CREATE index logline2_date_brin on logline2 using BRIN(date)

…並在第二張桌子上嘗試“運氣”:

EXPLAIN ANALYZE SELECT COUNT(*) from logline2 where date BETWEEN '2016-01-15' and '2016-01-31';

如果時間大大改善,那就完美了。


如果您有磁碟空間限制,您還應該檢查副檔名cstore_fdw。它非常適合分析,並且可以以壓縮形式儲存數據。它具有類似於 BRIN 的索引功能,但有一些限制:您只能附加數據並​​且不支持事務。

引用自:https://dba.stackexchange.com/questions/130819