PostgreSQL

Can't get this PostgreSQL query to run faster

  • January 24, 2018

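I'm running PostgreSQL 9.6 with PostGIS 2.3.3.

I'm trying to make this rather critical query faster (finding users within 1000 meters of a location), but I'm having trouble setting up the right indexes.

Can anyone point me in the right direction?

users has 200k rows, locations has 1,200 rows, and user_push_tokens has 155k rows.

Users table: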

create table users
(
 id serial not null
   constraint users_pkey
     primary key,
 (an additional 20-ish columns),
 geo_point geometry(Point,4326)
);

create unique index users_id
 on users (id);

create index users_geo_point_idx
 on users using gist(geo_point);

user_push_tokens table:

create table user_push_tokens
(
 user_id integer not null
   constraint push_tokens_user_id_fkey
     references users
       on delete cascade,
 push_token varchar(255) not null,
 push_provider varchar(64) not null,
 app varchar(64) default 'app'::character varying not null,
 id integer default nextval('user_push_tokens_id_seq'::regclass) not null
   constraint user_push_tokens_pkey
     primary key,
 active boolean default true not null
);

create index trinity_unique_index
 on user_push_tokens (user_id, app, push_token);

create index user_push_tokens_token_fetch_idx
 on user_push_tokens (user_id, app, active);

create index user_id_index
 on user_push_tokens (user_id);

Locations table:

create table locations
(
 id integer default nextval('locations_id_seq'::regclass) not null
   constraint locations_pkey
     primary key,
 (another 25 columns),
 geo_point geometry(Point,4326)
);

create unique index locations_id_key
 on locations (id);

create index locations_geo_point_idx
 on locations using gist(geo_point);

The query:

EXPLAIN ANALYSE SELECT
   u.id AS user_id,
   upt.push_provider,
   upt.push_token
 FROM users u
   JOIN locations l ON l.id = 3896
   JOIN user_push_tokens upt
     ON upt.user_id = u.id AND upt.active = true AND upt.app = 'app'
 WHERE ST_DistanceSphere(u.geo_point, l.geo_point) <= 1000;

The result is:

Nested Loop  (cost=26087.06..63658.87 rows=30605 width=107) (actual time=353.304..887.371 rows=2498 loops=1)
 Join Filter: (_st_distance(geography(u.geo_point), geography(l.geo_point), '0'::double precision, false) <= '1000'::double precision)
 Rows Removed by Join Filter: 89539
 ->  Index Scan using locations_id_key on locations l  (cost=0.28..8.29 rows=1 width=32) (actual time=0.009..0.014 rows=1 loops=1)
       Index Cond: (id = 3896)
 ->  Hash Join  (cost=26086.78..39090.07 rows=91815 width=139) (actual time=352.437..657.228 rows=92037 loops=1)
       Hash Cond: (upt.user_id = u.id)
       ->  Seq Scan on user_push_tokens upt  (cost=0.00..7162.82 rows=91815 width=107) (actual time=0.032..103.512 rows=92037 loops=1)
             Filter: (active AND ((app)::text = 'app'::text))
             Rows Removed by Filter: 62437
       ->  Hash  (cost=22114.46..22114.46 rows=195546 width=36) (actual time=352.199..352.199 rows=195589 loops=1)
             Buckets: 65536  Batches: 4  Memory Usage: 3563kB
             ->  Seq Scan on users u  (cost=0.00..22114.46 rows=195546 width=36) (actual time=0.014..214.976 rows=195589 loops=1)

More readable EXPLAIN output

Thank you for your time.

ST_DWithin

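ST_DistanceSphere(x,y) < t cannot use your spatial index. ST_DWithin can, depending on selectivity. Use this instead:

ST_DWithin(u.geo_point, l.geo_point, 1000);

Note that on geometry(Point,4326) columns the third argument is in the units of the spatial reference system (degrees), so this call only means 1000 meters once the columns are geography (see the schema changes below).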

Also, for testing purposes, you should remove the join to user_push_tokens and add it back once you have a better plan for the simpler query.
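The advice above can be sketched against the schema as posted. This is a minimal, untested sketch, not a definitive fix. Because geo_point is still geometry(Point,4326), it casts to geography so that 1000 means meters, and it assumes an expression index on that cast so the planner can still use an index. The index name users_geo_point_geog_idx is made up for this example.

```sql
-- Assumption: a hypothetical expression index so that the geography cast
-- used in the queries below can be served by a GiST index.
CREATE INDEX users_geo_point_geog_idx
  ON users USING gist (geography(geo_point));

-- Step 1: test the spatial filter alone, without the user_push_tokens join.
EXPLAIN ANALYZE
SELECT u.id AS user_id
FROM users u
  JOIN locations l ON l.id = 3896
WHERE ST_DWithin(geography(u.geo_point), geography(l.geo_point), 1000);

-- Step 2: once the simpler query has a good plan, add the join back.
EXPLAIN ANALYZE
SELECT
  u.id AS user_id,
  upt.push_provider,
  upt.push_token
FROM users u
  JOIN locations l ON l.id = 3896
  JOIN user_push_tokens upt
    ON upt.user_id = u.id AND upt.active = true AND upt.app = 'app'
WHERE ST_DWithin(geography(u.geo_point), geography(l.geo_point), 1000);
```

On geography, ST_DWithin measures on the spheroid by default, so results near the 1000 m boundary may differ slightly from ST_DistanceSphere. Passing false as a fourth argument switches it to a sphere.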

Schema changes

In addition to what jjanes mentioned:

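  • varchar(x): Don't use varchar(x) in PostgreSQL unless you have to or have a good reason to. Looking at your schema, all of these should be text. It is marginally faster on writes, since the length check is skipped.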
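  • NOT NULL with a default string: what are the possible values here?
app varchar(64) default 'app'::character varying not null,

Your query above has app = 'app'::text, which isn't good either:

active AND (app)::text = 'app'::text

You probably want the column to be nullable and then test whether the value IS NULL.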

  • geography type: you also probably don't want
geometry(Point,4326)

Instead, use

geography(Point,4326)

With geography, ST_DWithin measures the distance in meters.
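The three schema suggestions above might look roughly like the following migration. It is a sketch under stated assumptions, not a tested script. It assumes 'app' is the "no specific app" value that the nullable column would replace with NULL. It also drops and recreates the GiST indexes explicitly around the type change rather than relying on PostgreSQL to rebuild them. These statements take exclusive locks, so on a live system they would need a maintenance window.

```sql
BEGIN;

-- 1. varchar(n) -> text for the columns in the posted schema.
ALTER TABLE user_push_tokens
  ALTER COLUMN push_token    TYPE text,
  ALTER COLUMN push_provider TYPE text,
  ALTER COLUMN app           TYPE text;

-- 2. Make app nullable instead of defaulting to a placeholder string.
--    Assumption: 'app' means "no specific app" and can become NULL.
ALTER TABLE user_push_tokens
  ALTER COLUMN app DROP DEFAULT,
  ALTER COLUMN app DROP NOT NULL;
UPDATE user_push_tokens SET app = NULL WHERE app = 'app';

-- 3. geometry -> geography, recreating the spatial indexes afterwards.
DROP INDEX users_geo_point_idx;
ALTER TABLE users
  ALTER COLUMN geo_point TYPE geography(Point,4326)
  USING geo_point::geography;
CREATE INDEX users_geo_point_idx ON users USING gist (geo_point);

DROP INDEX locations_geo_point_idx;
ALTER TABLE locations
  ALTER COLUMN geo_point TYPE geography(Point,4326)
  USING geo_point::geography;
CREATE INDEX locations_geo_point_idx ON locations USING gist (geo_point);

COMMIT;
```

After a change like this, the join condition would test upt.app IS NULL instead of upt.app = 'app', and ST_DWithin(u.geo_point, l.geo_point, 1000) would need no casts.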

Source: https://dba.stackexchange.com/questions/195768