Postgresql
Can't get this PostgreSQL query to run faster
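I'm running PostgreSQL 9.6 with PostGIS 2.3.3.
I'm trying to make this rather critical query faster (it finds users within 1000 meters of a location), but I'm having trouble setting up the right indexes.
Can anyone point me in the right direction?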
users has 200k rows, locations has 1200 rows, and user_push_tokens has 155k rows.
The users table:
create table users (
    id serial not null constraint users_pkey primary key,
    -- (an additional 20-ish columns),
    geo_point geometry(Point,4326)
);
create unique index users_id on users (id);
create index users_geo_point_idx on users using gist(geo_point);
The user_push_tokens table:
create table user_push_tokens (
    user_id integer not null constraint push_tokens_user_id_fkey references users on delete cascade,
    push_token varchar(255) not null,
    push_provider varchar(64) not null,
    app varchar(64) default 'app'::character varying not null,
    id integer default nextval('user_push_tokens_id_seq'::regclass) not null constraint user_push_tokens_pkey primary key,
    active boolean default true not null
);
create index trinity_unique_index on user_push_tokens (user_id, app, push_token);
create index user_push_tokens_token_fetch_idx on user_push_tokens (user_id, app, active);
create index user_id_index on user_push_tokens (user_id);
The locations table:
create table locations (
    id integer default nextval('locations_id_seq'::regclass) not null constraint locations_pkey primary key,
    -- (another 25 columns),
    geo_point geometry(Point,4326)
);
create unique index locations_id_key on locations (id);
create index locations_geo_point_idx on locations using gist(geo_point);
The query:
EXPLAIN ANALYSE
SELECT
    u.id AS user_id,
    upt.push_provider,
    upt.push_token
FROM users u
JOIN locations l ON l.id = 3896
JOIN user_push_tokens upt
    ON upt.user_id = u.id
    AND upt.active = true
    AND upt.app = 'app'
WHERE ST_DistanceSphere(u.geo_point, l.geo_point) <= 1000;
The result is:
Nested Loop  (cost=26087.06..63658.87 rows=30605 width=107) (actual time=353.304..887.371 rows=2498 loops=1)
  Join Filter: (_st_distance(geography(u.geo_point), geography(l.geo_point), '0'::double precision, false) <= '1000'::double precision)
  Rows Removed by Join Filter: 89539
  ->  Index Scan using locations_id_key on locations l  (cost=0.28..8.29 rows=1 width=32) (actual time=0.009..0.014 rows=1 loops=1)
        Index Cond: (id = 3896)
  ->  Hash Join  (cost=26086.78..39090.07 rows=91815 width=139) (actual time=352.437..657.228 rows=92037 loops=1)
        Hash Cond: (upt.user_id = u.id)
        ->  Seq Scan on user_push_tokens upt  (cost=0.00..7162.82 rows=91815 width=107) (actual time=0.032..103.512 rows=92037 loops=1)
              Filter: (active AND ((app)::text = 'app'::text))
              Rows Removed by Filter: 62437
        ->  Hash  (cost=22114.46..22114.46 rows=195546 width=36) (actual time=352.199..352.199 rows=195589 loops=1)
              Buckets: 65536  Batches: 4  Memory Usage: 3563kB
              ->  Seq Scan on users u  (cost=0.00..22114.46 rows=195546 width=36) (actual time=0.014..214.976 rows=195589 loops=1)
Thank you for your time.
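ST_DWithin
ST_DistanceSphere(x,y) < t can't use your spatial index. ST_DWithin can, depending on selectivity. Instead, use
ST_DWithin(u.geo_point, l.geo_point, 1000);
Note that on geometry(Point,4326) inputs the distance argument is in the units of the SRID (degrees), so a 1000 meter radius needs geography inputs; see the geography point under Schema changes.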
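As a rough sketch of how the ST_DWithin rewrite could look while the columns are still geometry(Point,4326): casting both points to geography makes the 1000 be interpreted as meters, and an expression index on the cast gives the planner a GiST index it can use. The index name users_geo_point_geog_idx is hypothetical, and whether the planner actually picks the index depends on selectivity.

```sql
-- Hypothetical index name; indexes the geography cast so that
-- ST_DWithin on (geo_point::geography) can use a GiST index.
CREATE INDEX users_geo_point_geog_idx
    ON users USING gist ((geo_point::geography));
ANALYZE users;

EXPLAIN ANALYSE
SELECT
    u.id AS user_id,
    upt.push_provider,
    upt.push_token
FROM locations l
JOIN users u
    -- geography inputs: the third argument is a distance in meters
    ON ST_DWithin(u.geo_point::geography, l.geo_point::geography, 1000)
JOIN user_push_tokens upt
    ON upt.user_id = u.id
    AND upt.active
    AND upt.app = 'app'
WHERE l.id = 3896;
```

The cast in the query has to match the indexed expression exactly, otherwise the expression index is not considered.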
user_push_tokens
Also, for testing purposes, you should drop the join and add it back once you have worked out a better plan for the simpler query.
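A minimal version of that simpler query might look like the following. It keeps only users and locations, so the plan shows directly whether the spatial predicate uses an index. The geography casts are an assumption made so that 1000 means meters; for index use they rely on a GiST index over the same cast expression (or on geography columns, in which case the casts can be dropped).

```sql
-- Test query without the user_push_tokens join.
EXPLAIN ANALYSE
SELECT u.id AS user_id
FROM locations l
JOIN users u
    ON ST_DWithin(u.geo_point::geography, l.geo_point::geography, 1000)
WHERE l.id = 3896;
```

Once this plan shows an index scan on users instead of a sequential scan, the join to user_push_tokens can be added back and the plan compared again.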
Schema changes
In addition to what jjanes mentioned,
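varchar(x): Don't use varchar(x) in PostgreSQL unless you have to or have a good reason to. Looking at your schema, all of these should be text. It's faster, if only marginally, because it skips the length check.
NOT NULL with a default string: What are the possible values here?
app varchar(64) default 'app'::character varying not null,
Your query filters on app = 'app'::text, which is no good either:
active AND (app)::text = 'app'::text
You likely want the column to be nullable, and then test whether the value IS NULL.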
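The two column suggestions could be sketched as follows. The first statement only drops the length limits. The second part is one possible reading of the nullable suggestion: it assumes that 'app' is the default application and that NULL can stand in for it, which the question does not confirm, so treat it as an illustration rather than a migration to run as-is.

```sql
-- Replace the length-limited varchar columns with text.
ALTER TABLE user_push_tokens
    ALTER COLUMN push_token    TYPE text,
    ALTER COLUMN push_provider TYPE text,
    ALTER COLUMN app           TYPE text;

-- ASSUMPTION: NULL means "the default app" from here on.
ALTER TABLE user_push_tokens
    ALTER COLUMN app DROP DEFAULT,
    ALTER COLUMN app DROP NOT NULL;

UPDATE user_push_tokens
SET app = NULL
WHERE app = 'app';
```

Under that assumption the filter in the query becomes upt.active AND upt.app IS NULL instead of a string comparison.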
geography type: You also probably don't want geometry(Point,4326). Instead, use geography(Point,4326).
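A sketch of that conversion for both tables, assuming nothing else depends on the columns being geometry. The existing GiST indexes are dropped and recreated explicitly so they are built with the geography operator class.

```sql
DROP INDEX users_geo_point_idx;
ALTER TABLE users
    ALTER COLUMN geo_point TYPE geography(Point,4326)
    USING geo_point::geography;
CREATE INDEX users_geo_point_idx ON users USING gist (geo_point);

DROP INDEX locations_geo_point_idx;
ALTER TABLE locations
    ALTER COLUMN geo_point TYPE geography(Point,4326)
    USING geo_point::geography;
CREATE INDEX locations_geo_point_idx ON locations USING gist (geo_point);

ANALYZE users;
ANALYZE locations;
```

With geography columns, ST_DWithin(u.geo_point, l.geo_point, 1000) measures the radius in meters and needs no casts.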