Postgresql-12

TimescaleDB 萬用字元 (%) 慢

  • March 16, 2021

我有一個像這樣的 timescaledb 超表:

create table logs
(
   time         timestamp not null,
   partitionkey text      not null,
   ip           inet,
   raw          text,
   transformed  double precision
);

和索引如下:

create index logs_time_idx
   on logs (time desc);

create unique index logs_partitionkey_time_uindex
   on logs (partitionkey asc, time desc);

當我執行此查詢時,需要 20 分鐘才能完成:

SELECT * FROM data.logs 
WHERE partitionkey LIKE '%m.60.05482730' 
AND time > NOW() - INTERVAL '3 days'

但是當我執行這個時,它需要 2 秒:

SELECT * FROM data.logs 
WHERE partitionkey LIKE '865617033605366.m.60.05482730'
AND time > NOW() - INTERVAL '3 days'

我嘗試僅索引分區鍵以幫助萬用字元查詢找到匹配值,但這沒有效果。

-- created this index later to try and fix the slow wildcard query
create index logs_partitionkey_index
   on logs (partitionkey);

解釋萬用字元查詢的計劃:

Gather  (cost=1000.57..525711.89 rows=1219 width=81)
 Workers Planned: 2
 ->  Parallel Custom Scan (ChunkAppend) on logs  (cost=0.57..524589.99 rows=509 width=82)
       Chunks excluded during startup: 2
       ->  Parallel Index Scan using _hyper_2_10_chunk_logs_time_idx on _hyper_2_10_chunk  (cost=0.57..263956.91 rows=255 width=81)
             Index Cond: ("time" > (now() - '3 days'::interval))
             Filter: (partitionkey ~~ '%m.60.05482730'::text)
       ->  Parallel Index Scan using _hyper_2_9_chunk_logs_time_idx on _hyper_2_9_chunk  (cost=0.57..260629.72 rows=252 width=83)
             Index Cond: ("time" > (now() - '3 days'::interval))
             Filter: (partitionkey ~~ '%m.60.05482730'::text)
JIT:
 Functions: 8
 Options: Inlining true, Optimization true, Expressions true, Deforming true

解釋具體的partionkey值:

Custom Scan (ChunkAppend) on logs  (cost=0.44..903.08 rows=790 width=82)
 Chunks excluded during startup: 2
 ->  Index Scan using _hyper_2_9_chunk_logs_partitionkey_time_uindex on _hyper_2_9_chunk  (cost=0.57..447.44 rows=392 width=83)
       Index Cond: ((partitionkey = '865617033605366.m.60.05482730'::text) AND ("time" > (now() - '3 days'::interval)))
       Filter: (partitionkey ~~ '865617033605366.m.60.05482730'::text)
 ->  Index Scan using _hyper_2_10_chunk_logs_partitionkey_time_uindex on _hyper_2_10_chunk  (cost=0.57..452.27 rows=396 width=81)
       Index Cond: ((partitionkey = '865617033605366.m.60.05482730'::text) AND ("time" > (now() - '3 days'::interval)))
       Filter: (partitionkey ~~ '865617033605366.m.60.05482730'::text)

TimescaleDB 是否無法執行萬用字元 (%) 查詢,還是我錯過了索引?

B-Tree 索引不能用於帶有右錨定萬用字元的 LIKE 條件。'%...'它只能用於左錨萬用字元'...%'。你需要一個三元組索引來改進它。

如果子字元串搜尋的長度始終相同,則可以在該表達式上創建索引。在該索引中包含時間列可能也會有所幫助

create index logs_partitionkey_index
   on logs ( (right(partitionkey,13), "time" );

並將您的查詢更改為:

SELECT * 
FROM data.logs 
WHERE right(partitionkey,13) = 'm.60.05482730' 
 AND "time" > NOW() - INTERVAL '3 days'

或者,在反轉的字元串上創建一個索引:

create index logs_partitionkey_index
   on logs ( (reverse(partitionkey) varchar_pattern_ops);

然後將您的查詢更改為:

SELECT * 
FROM data.logs 
WHERE reverse(partitionkey) like reverse('m.60.05482730')||'%'
 AND "time" > NOW() - INTERVAL '3 days'

引用自:https://dba.stackexchange.com/questions/287098