Postgresql

PostgreSQL 查詢分析慢、記憶體使用量低、寫入量高

  • September 4, 2018

我在 PostgreSQL 10.5 中遇到以下查詢問題。

在查詢執行時使用pg_activity表明僅使用了約 400 MB 的 RAM。

我還注意到規劃師的估計似乎很遙遠。

這是查詢:

explain analyze SELECT 
   Event.data AS "Event_data", Event_entity_identifiers.identifier AS "Event_rootIdentifier", 
   expeditions.expedition_code AS "expeditionCode", expeditions.project_id AS "projectId", 
   Sample.data AS "Sample_data", Sample_entity_identifiers.identifier AS "Sample_rootIdentifier", 
   fastqMetadata.data AS "fastqMetadata_data", fastqMetadata_entity_identifiers.identifier AS "fastqMetadata_rootIdentifier" 
FROM network_1.Event AS Event 
   JOIN expeditions ON expeditions.id = Event.expedition_id 
   LEFT JOIN network_1.Sample AS Sample ON Sample.parent_identifier = Event.local_identifier and Sample.expedition_id = Event.expedition_id 
   LEFT JOIN network_1.Tissue AS Tissue ON Tissue.parent_identifier = Sample.local_identifier and Tissue.expedition_id = Sample.expedition_id 
   LEFT JOIN network_1.fastqMetadata AS fastqMetadata ON fastqMetadata.parent_identifier = Tissue.local_identifier and fastqMetadata.expedition_id = Tissue.expedition_id 
   LEFT JOIN entity_identifiers AS Event_entity_identifiers ON Event_entity_identifiers.expedition_id = Event.expedition_id and Event_entity_identifiers.concept_alias = 'Event' 
   LEFT JOIN entity_identifiers AS Sample_entity_identifiers ON Sample_entity_identifiers.expedition_id = Sample.expedition_id and Sample_entity_identifiers.concept_alias = 'Sample' 
   LEFT JOIN entity_identifiers AS fastqMetadata_entity_identifiers ON fastqMetadata_entity_identifiers.expedition_id = fastqMetadata.expedition_id and fastqMetadata_entity_identifiers.concept_alias = 'fastqMetadata' 
WHERE (expeditions.project_id = 2 AND Event.data ? 'urn:decimalLatitude' AND Event.data ? 'urn:decimalLongitude') AND expeditions.public = true 
ORDER BY Event.local_identifier, Event.expedition_id 
OFFSET 0 LIMIT 10000;

解釋分析結果:

Limit  (cost=3158.61..3158.64 rows=12 width=1658) (actual time=104656.385..104692.387 rows=10000 loops=1)
 ->  Sort  (cost=3158.61..3158.64 rows=12 width=1658) (actual time=104656.383..104682.873 rows=10000 loops=1)
       Sort Key: event.local_identifier, event.expedition_id
       Sort Method: external merge  Disk: 40616kB
       ->  Nested Loop Left Join  (cost=1627.71..3158.39 rows=12 width=1658) (actual time=6.389..104128.516 rows=38826 loops=1)
             ->  Nested Loop Left Join  (cost=20.42..40.63 rows=1 width=625) (actual time=1.329..105.572 rows=5650 loops=1)
                   ->  Nested Loop  (cost=20.15..32.26 rows=1 width=607) (actual time=1.300..61.179 rows=5650 loops=1)
                         ->  Bitmap Heap Scan on event  (cost=20.00..24.02 rows=1 width=588) (actual time=1.286..17.407 rows=6025 loops=1)
                               Recheck Cond: ((data ? 'urn:decimalLatitude'::text) AND (data ? 'urn:decimalLongitude'::text))
                               Heap Blocks: exact=793
                               ->  Bitmap Index Scan on idx_network_1_event_data  (cost=0.00..20.00 rows=1 width=0) (actual time=1.168..1.169 rows=6025 loops=1)
                                     Index Cond: ((data ? 'urn:decimalLatitude'::text) AND (data ? 'urn:decimalLongitude'::text))
                         ->  Index Scan using expeditions_pkey on expeditions  (cost=0.14..8.16 rows=1 width=23) (actual time=0.004..0.004 rows=1 loops=6025)
                               Index Cond: (id = event.expedition_id)
                               Filter: (public AND (project_id = 2))
                               Rows Removed by Filter: 0
                   ->  Index Scan using entitiy_identifiers_expediton_id_concept_alias_uniq on entity_identifiers event_entity_identifiers  (cost=0.28..8.30 rows=1 width=22) (actual time=0.004..0.004 rows=1 loops=5650)
                         Index Cond: ((expedition_id = event.expedition_id) AND (concept_alias = 'Event'::text))
             ->  Nested Loop Left Join  (cost=1607.29..3117.75 rows=1 width=1060) (actual time=3.372..18.399 rows=7 loops=5650)
                   ->  Nested Loop Left Join  (cost=1607.01..3117.36 rows=1 width=1046) (actual time=3.369..18.374 rows=7 loops=5650)
                         ->  Nested Loop Left Join  (cost=1606.74..3117.06 rows=1 width=1028) (actual time=3.360..18.316 rows=7 loops=5650)
                               ->  Nested Loop Left Join  (cost=1606.46..3109.90 rows=1 width=480) (actual time=3.325..18.079 rows=7 loops=5650)
                                     ->  Bitmap Heap Scan on sample  (cost=1606.05..2651.62 rows=1 width=476) (actual time=2.389..2.498 rows=7 loops=5650)
                                           Recheck Cond: (expedition_id = event.expedition_id)
                                           Filter: (parent_identifier = event.local_identifier)
                                           Rows Removed by Filter: 430
                                           Heap Blocks: exact=392809
                                           ->  Bitmap Index Scan on idx_network_1_sample_local_identifier_expedition_id  (cost=0.00..1606.05 rows=312 width=0) (actual time=2.334..2.334 rows=437 loops=5650)
                                                 Index Cond: (expedition_id = event.expedition_id)
                                     ->  Index Scan using idx_network_1_tissue_local_identifier_expedition_id on tissue  (cost=0.41..458.27 rows=1 width=22) (actual time=1.067..2.262 rows=1 loops=38826)
                                           Index Cond: (expedition_id = sample.expedition_id)
                                           Filter: (parent_identifier = sample.local_identifier)
                                           Rows Removed by Filter: 484
                               ->  Index Scan using idx_network_1_fastqmetadata_local_identifier_expedition_id on fastqmetadata  (cost=0.28..7.15 rows=1 width=572) (actual time=0.031..0.031 rows=0 loops=38826)
                                     Index Cond: (expedition_id = tissue.expedition_id)
                                     Filter: (parent_identifier = tissue.local_identifier)
                                     Rows Removed by Filter: 3
                         ->  Index Scan using entitiy_identifiers_expediton_id_concept_alias_uniq on entity_identifiers sample_entity_identifiers  (cost=0.28..0.30 rows=1 width=22) (actual time=0.005..0.005 rows=1 loops=38826)
                               Index Cond: ((expedition_id = sample.expedition_id) AND (concept_alias = 'Sample'::text))
                   ->  Index Scan using entitiy_identifiers_expediton_id_concept_alias_uniq on entity_identifiers fastqmetadata_entity_identifiers  (cost=0.28..0.39 rows=1 width=22) (actual time=0.001..0.001 rows=0 loops=38826)
                         Index Cond: ((expedition_id = fastqmetadata.expedition_id) AND (concept_alias = 'fastqMetadata'::text))
Planning time: 6.090 ms
Execution time: 104704.794 ms

編輯:

CREATE TABLE public.expeditions (
   id integer NOT NULL,
   project_id integer NOT NULL,
   expedition_code text NOT NULL,
   public boolean DEFAULT true NOT NULL,
);
Indexes:
   "expeditions_pkey" PRIMARY KEY, btree (id)
   "expeditions_code_project_id_uniq" UNIQUE CONSTRAINT, btree (expedition_code, project_id)
   "expeditions_project_id_idx" btree (project_id)

CREATE TABLE public.entity_identifiers (
   id integer NOT NULL,
   expedition_id integer NOT NULL,
   concept_alias text NOT NULL,
   identifier text NOT NULL
);
Indexes:
   "entitiy_identifiers_expediton_id_concept_alias_uniq" UNIQUE CONSTRAINT, btree (expedition_id, concept_alias)
   "entity_identifiers_expedition_id" btree (expedition_id)

CREATE TABLE network_1.event (
   id integer NOT NULL,
   local_identifier text NOT NULL,
   expedition_id integer NOT NULL,
   data jsonb NOT NULL,
);
Indexes:
   "event_pkey" PRIMARY KEY, btree (id)
   "idx_network_1_event_local_identifier_expedition_id" UNIQUE CONSTRAINT, btree (local_identifier, expedition_id)
   "idx_network_1_event_data" gin (data)
Foreign-key constraints:
   "event_expedition_id_fkey" FOREIGN KEY (expedition_id) REFERENCES expeditions(id) ON DELETE CASCADE

CREATE TABLE network_1.sample (
   id integer NOT NULL,
   local_identifier text NOT NULL,
   expedition_id integer NOT NULL,
   data jsonb NOT NULL,
   parent_identifier text NOT NULL
);
Indexes:
   "sample_pkey" PRIMARY KEY, btree (id)
   "idx_network_1_sample_local_identifier_expedition_id" UNIQUE CONSTRAINT, btree (local_identifier, expedition_id)
   "idx_network_1_sample_data" gin (data)
Foreign-key constraints:
   "network_1_sample_parent_fkey" FOREIGN KEY (parent_identifier, expedition_id) REFERENCES network_1.event(local_identifier, expedition_id) ON DELETE CASCADE
   "sample_expedition_id_fkey" FOREIGN KEY (expedition_id) REFERENCES expeditions(id) ON DELETE CASCADE

CREATE TABLE network_1.tissue (
   id integer NOT NULL,
   local_identifier text NOT NULL,
   expedition_id integer NOT NULL,
   data jsonb NOT NULL,
   parent_identifier text NOT NULL
);
Indexes:
   "tissue_pkey" PRIMARY KEY, btree (id)
   "idx_network_1_tissue_local_identifier_expedition_id" UNIQUE CONSTRAINT, btree (local_identifier, expedition_id)
   "idx_network_1_tissue_data" gin (data)
Foreign-key constraints:
   "network_1_tissue_parent_fkey" FOREIGN KEY (parent_identifier, expedition_id) REFERENCES network_1.sample(local_identifier, expedition_id) ON DELETE CASCADE
   "tissue_expedition_id_fkey" FOREIGN KEY (expedition_id) REFERENCES expeditions(id) ON DELETE CASCADE

CREATE TABLE network_1.fastqmetadata (
   id integer NOT NULL,
   local_identifier text NOT NULL,
   expedition_id integer NOT NULL,
   data jsonb NOT NULL,
   parent_identifier text NOT NULL
);
Indexes:
   "fastqmetadata_pkey" PRIMARY KEY, btree (id)
   "idx_network_1_fastqmetadata_local_identifier_expedition_id" UNIQUE CONSTRAINT, btree (local_identifier, expedition_id)
   "idx_network_1_fastqmetadata_data" gin (data)
Foreign-key constraints:
   "fastqmetadata_expedition_id_fkey" FOREIGN KEY (expedition_id) REFERENCES expeditions(id) ON DELETE CASCADE
   "network_1_fastqmetadata_parent_fkey" FOREIGN KEY (parent_identifier, expedition_id) REFERENCES network_1.tissue(local_identifier, expedition_id) ON DELETE CASCADE

我認為您對錯誤的估計無能為力,因為缺乏有關 jsonb 數據內部結構的統計數據。特別是 ‘urn:decimalLatitude’ 和 ‘urn:decimalLongitude’ 的流行程度以及它們之間的相關性。

但是您應該能夠通過添加索引來加速查詢,同時仍然有錯誤的估計:

->  Index Scan using idx_network_1_tissue_local_identifier_expedition_id on tissue  (cost=0.41..458.27 rows=1 width=22) (actual time=1.067..2.262 rows=1 loops=38826)
   Index Cond: (expedition_id = sample.expedition_id)
   Filter: (parent_identifier = sample.local_identifier)
   Rows Removed by Filter: 484

如果你有一個索引,這個索引掃描應該更有效(parent_identifier, expedition_id),或者(expedition_id, parent_identifier)

引用自:https://dba.stackexchange.com/questions/216606