Postgresql-12
TimescaleDB 萬用字元 (%) 慢
我有一個像這樣的 timescaledb 超表:
create table logs ( time timestamp not null, partitionkey text not null, ip inet, raw text, transformed double precision );
和索引如下:
create index logs_time_idx on logs (time desc); create unique index logs_partitionkey_time_uindex on logs (partitionkey asc, time desc);
當我執行此查詢時,需要 20 分鐘才能完成:
SELECT * FROM data.logs WHERE partitionkey LIKE '%m.60.05482730' AND time > NOW() - INTERVAL '3 days'
但是當我執行這個時,它需要 2 秒:
SELECT * FROM data.logs WHERE partitionkey LIKE '865617033605366.m.60.05482730' AND time > NOW() - INTERVAL '3 days'
我嘗試僅索引分區鍵以幫助萬用字元查詢找到匹配值,但這沒有效果。
-- created this index later to try and fix the slow wildcard query create index logs_partitionkey_index on logs (partitionkey);
解釋萬用字元查詢的計劃:
Gather (cost=1000.57..525711.89 rows=1219 width=81) Workers Planned: 2 -> Parallel Custom Scan (ChunkAppend) on logs (cost=0.57..524589.99 rows=509 width=82) Chunks excluded during startup: 2 -> Parallel Index Scan using _hyper_2_10_chunk_logs_time_idx on _hyper_2_10_chunk (cost=0.57..263956.91 rows=255 width=81) Index Cond: ("time" > (now() - '3 days'::interval)) Filter: (partitionkey ~~ '%m.60.05482730'::text) -> Parallel Index Scan using _hyper_2_9_chunk_logs_time_idx on _hyper_2_9_chunk (cost=0.57..260629.72 rows=252 width=83) Index Cond: ("time" > (now() - '3 days'::interval)) Filter: (partitionkey ~~ '%m.60.05482730'::text) JIT: Functions: 8 Options: Inlining true, Optimization true, Expressions true, Deforming true
解釋具體的partionkey值:
Custom Scan (ChunkAppend) on logs (cost=0.44..903.08 rows=790 width=82) Chunks excluded during startup: 2 -> Index Scan using _hyper_2_9_chunk_logs_partitionkey_time_uindex on _hyper_2_9_chunk (cost=0.57..447.44 rows=392 width=83) Index Cond: ((partitionkey = '865617033605366.m.60.05482730'::text) AND ("time" > (now() - '3 days'::interval))) Filter: (partitionkey ~~ '865617033605366.m.60.05482730'::text) -> Index Scan using _hyper_2_10_chunk_logs_partitionkey_time_uindex on _hyper_2_10_chunk (cost=0.57..452.27 rows=396 width=81) Index Cond: ((partitionkey = '865617033605366.m.60.05482730'::text) AND ("time" > (now() - '3 days'::interval))) Filter: (partitionkey ~~ '865617033605366.m.60.05482730'::text)
TimescaleDB 是否無法執行萬用字元 (%) 查詢,還是我錯過了索引?
B-Tree 索引不能用於帶有右錨定萬用字元的 LIKE 條件。
'%...'
它只能用於左錨萬用字元'...%'
。你需要一個三元組索引來改進它。如果子字元串搜尋的長度始終相同,則可以在該表達式上創建索引。在該索引中包含時間列可能也會有所幫助
create index logs_partitionkey_index on logs ( (right(partitionkey,13), "time" );
並將您的查詢更改為:
SELECT * FROM data.logs WHERE right(partitionkey,13) = 'm.60.05482730' AND "time" > NOW() - INTERVAL '3 days'
或者,在反轉的字元串上創建一個索引:
create index logs_partitionkey_index on logs ( (reverse(partitionkey) varchar_pattern_ops);
然後將您的查詢更改為:
SELECT * FROM data.logs WHERE reverse(partitionkey) like reverse('m.60.05482730')||'%' AND "time" > NOW() - INTERVAL '3 days'