Postgresql

帶限制的索引查詢,對一列排序,對另一列進行謂詞

  • May 30, 2017

幾年前我問了這個問題,得到了一個驚人的答案,解決了我所有的問題。

但是,我對我的 Postgres9.6.3數據也有一個非常相似的查詢,它的性能幾乎沒有。

這是做得很好的查詢:

EXPLAIN ANALYZE SELECT posts.* FROM unnest('{39303,39302,39304,70255,70256}'::int[]) s(source_id), LATERAL (SELECT "posts".* FROM "posts" WHERE (source_id = s.source_id) AND posts.deleted_at IS NULL ORDER BY posts.position, posts.external_created_at desc) posts ORDER BY posts.position, posts.external_created_at desc;
                                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort  (cost=356674.93..356771.23 rows=192600 width=1061) (actual time=2.715..2.731 rows=357 loops=1)
  Sort Key: posts."position", posts.external_created_at
  Sort Method: quicksort  Memory: 257kB
  ->  Nested Loop  (cost=2880.92..289343.79 rows=192600 width=1061) (actual time=0.545..1.938 rows=357 loops=1)
        ->  Function Scan on unnest s  (cost=0.00..0.30 rows=100 width=4) (actual time=0.007..0.008 rows=5 loops=1)
        ->  Sort  (cost=2880.92..2881.88 rows=1926 width=1061) (actual time=0.359..0.362 rows=71 loops=5)
              Sort Key: posts."position", posts.external_created_at
              Sort Method: quicksort  Memory: 27kB
              ->  Index Scan using index_posts_on_source_id on posts  (cost=0.11..2859.90 rows=1926 width=1061) (actual time=0.048..0.253 rows=71 loops=5)
                    Index Cond: (source_id = s.source_id)
                    Filter: (deleted_at IS NULL)
                    Rows Removed by Filter: 165
Planning time: 0.207 ms
Execution time: 2.793 ms
(14 rows)

這是我認為需要索引的類似查詢:

EXPLAIN ANALYZE SELECT posts.* FROM unnest('{70256, 70255, 39304, 39303, 39302}'::int[]) s(source_id), LATERAL (SELECT  "posts".* FROM "posts" WHERE (source_id = s.source_id) AND ("posts"."deleted_at" IS NOT NULL) AND "posts"."rejected_at" IS NULL ORDER BY posts.external_created_at desc LIMIT 100) posts ORDER BY posts.external_created_at desc LIMIT 100 OFFSET 0;
                                                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=15879.66..15879.71 rows=100 width=1061) (actual time=325.798..325.814 rows=84 loops=1)
  ->  Sort  (cost=15879.66..15884.66 rows=10000 width=1061) (actual time=325.798..325.806 rows=84 loops=1)
        Sort Key: posts.external_created_at
        Sort Method: quicksort  Memory: 68kB
        ->  Nested Loop  (cost=0.11..15803.23 rows=10000 width=1061) (actual time=1.647..325.634 rows=84 loops=1)
              ->  Function Scan on unnest s  (cost=0.00..0.30 rows=100 width=4) (actual time=0.010..0.015 rows=5 loops=1)
              ->  Limit  (cost=0.11..157.43 rows=100 width=1061) (actual time=36.246..65.110 rows=17 loops=5)
                    ->  Index Scan using index_posts_on_source_id_and_external_created_at on posts  (cost=0.11..841.76 rows=535 width=1061) (actual time=36.244..65.105 rows=17 loops=5)
                          Index Cond: (source_id = s.source_id)
                          Filter: (rejected_at IS NULL)
                          Rows Removed by Filter: 148
Planning time: 12.438 ms
Execution time: 325.893 ms

至於我的設置:

CREATE TABLE posts (
   id integer NOT NULL,
   source_id integer,
   message text,
   image text,
   external_id text,
   created_at timestamp without time zone,
   updated_at timestamp without time zone,
   external text,
   like_count integer DEFAULT 0 NOT NULL,
   comment_count integer DEFAULT 0 NOT NULL,
   external_created_at timestamp without time zone,
   deleted_at timestamp without time zone,
   poster_name character varying(255),
   poster_image text,
   poster_url character varying(255),
   poster_id text,
   position integer,
   location character varying(255),
   description text,
   video text,
   rejected_at timestamp without time zone,
   deleted_by character varying(255),
   height integer,
   width integer
);

CREATE INDEX index_posts_on_source_id_and_external_created_at 
   ON posts USING btree (source_id, external_created_at DESC) 
   WHERE deleted_at IS NOT NULL;

Postgres 記憶體設置:

name, setting, unit
'default_statistics_target','100',''
'effective_cache_size','16384','8kB'
'maintenance_work_mem','16384','kB'
'max_connections','100',''
'random_page_cost','4',NULL
'seq_page_cost','1',NULL
'shared_buffers','16384','8kB'
'work_mem','1024','kB'

數據庫統計:

Total Posts: 20,997,027
Posts where deleted_at is null: 15,665,487
Distinct source_id's: 22,245
Max number of rows per single source_id: 1,543,950
Min number of rows per single source_id: 1
Most source_ids in a single query: 21
Distinct external_created_at: 11,146,151

關於如何建構我的索引的任何建議?

該索引可能會有所幫助:

CREATE INDEX index_posts_on_source_id_and_external_created_at ON posts 
 (source_id, external_created_at)
 WHERE deleted_at IS NOT NULL AND rejected_at IS NULL

可能會有所幫助。

如果您有其他類似的查詢,但rejected_at 測試是倒置的,那麼可能:

CREATE INDEX index_posts_on_source_id_and_external_created_at ON posts 
 (source_id, rejected_at, external_created_at)
 WHERE deleted_at IS NOT NULL

引用自:https://dba.stackexchange.com/questions/174830