Performance problem executing similar queries
I'm having some performance issues running a simple query in PostgreSQL on a table with roughly 3 million rows, joined against a table with roughly 120 rows.
If I run the query on the larger table alone (without the filter), the result comes back instantly, but with the join the same result takes up to 2 minutes.
Here is the query:
SELECT foi.fk_dim_product, count(1)
FROM evino_bi.fact_order_item AS foi
INNER JOIN dim_order_item_status AS dois
    ON dois.id_dim_order_item_status = foi.fk_dim_order_item_status
WHERE dois.is_reserved = '1'
GROUP BY foi.fk_dim_product;
Running EXPLAIN (ANALYZE, BUFFERS) returns the following:
HashAggregate  (cost=1379364.80..1379391.01 rows=2621 width=4) (actual time=84822.667..84822.863 rows=630 loops=1)
  Group Key: foi.fk_dim_product
  Buffers: shared hit=181461 read=1061877
  ->  Hash Join  (cost=6.38..1360785.99 rows=3715762 width=4) (actual time=0.180..84764.538 rows=96703 loops=1)
        Hash Cond: (foi.fk_dim_order_item_status = dois.id_dim_order_item_status)
        Buffers: shared hit=181461 read=1061877
        ->  Seq Scan on fact_order_item foi  (cost=0.00..1301725.54 rows=5839054 width=8) (actual time=0.002..81484.109 rows=5837655 loops=1)
              Buffers: shared hit=181458 read=1061877
        ->  Hash  (cost=5.06..5.06 rows=105 width=4) (actual time=0.053..0.053 rows=70 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 11kB
              Buffers: shared hit=3
              ->  Seq Scan on dim_order_item_status dois  (cost=0.00..5.06 rows=105 width=4) (actual time=0.005..0.038 rows=70 loops=1)
                    Filter: (is_reserved = 1)
                    Rows Removed by Filter: 40
                    Buffers: shared hit=3
Planning time: 0.623 ms
Execution time: 84836.100 ms
What puzzles me most is that the following query runs in milliseconds and returns the same data:
SELECT foi.fk_dim_product, count(1)
FROM evino_bi.fact_order_item foi
INNER JOIN dim_order_item_status dois
    ON dois.id_dim_order_item_status = foi.fk_dim_order_item_status
WHERE dois.is_reserved || '' = '1'
GROUP BY 1
Here is EXPLAIN (ANALYZE, BUFFERS) for the second statement:
HashAggregate  (cost=555597.70..555623.91 rows=2621 width=4) (actual time=249.523..249.673 rows=630 loops=1)
  Group Key: foi.fk_dim_product
  Buffers: shared hit=134117
  ->  Nested Loop  (cost=8172.60..555420.76 rows=35388 width=4) (actual time=2.971..219.564 rows=96860 loops=1)
        Buffers: shared hit=134117
        ->  Seq Scan on dim_order_item_status dois  (cost=0.00..6.30 rows=1 width=4) (actual time=0.011..0.101 rows=70 loops=1)
              Filter: (((is_reserved)::text || ''::text) = '1'::text)
              Rows Removed by Filter: 40
              Buffers: shared hit=3
        ->  Bitmap Heap Scan on fact_order_item foi  (cost=8172.60..553329.08 rows=208538 width=8) (actual time=1.205..2.484 rows=1384 loops=70)
              Recheck Cond: (fk_dim_order_item_status = dois.id_dim_order_item_status)
              Heap Blocks: exact=132362
              Buffers: shared hit=134114
              ->  Bitmap Index Scan on fact_order_item_fk_dim_order_item_status  (cost=0.00..8120.47 rows=208538 width=0) (actual time=0.467..0.467 rows=3903 loops=70)
                    Index Cond: (fk_dim_order_item_status = dois.id_dim_order_item_status)
                    Buffers: shared hit=1752
Planning time: 0.691 ms
Execution time: 249.917 ms
So, is PostgreSQL not planning my query well? Is there any performance tuning I can do on my server to avoid the problem with the first statement and help PostgreSQL plan it better?
Also, why does the second statement run so absurdly fast?
EDIT
I've updated the question to add the requested information.
EDIT 2
Here is the \d output for both tables. I've hidden some columns from fact_order_item, since that table has almost 150 columns, most of them plain numeric values. dim_order_item_status:
Table "evino_bi.dim_order_item_status" Column | Type | Modifiers --------------------------+------------------------+------------------------------------------------------------------------------------------ id_dim_order_item_status | integer | not null default nextval('dim_order_item_status_id_dim_order_item_status_seq'::regclass) src_id_order_item_status | integer | name | character varying(100) | name_pt | character varying(100) | is_reserved | smallint | is_problem | smallint | payment_status | character varying(15) | macro_status | character varying(100) | macro_status_pt | character varying(100) | is_solid | smallint | Indexes: "dim_order_item_status_pkey" PRIMARY KEY, btree (id_dim_order_item_status) "src_id_order_item_status_idx" UNIQUE, btree (src_id_order_item_status)
fact_order_item:
Table "evino_bi.fact_order_item" Column | Type | Modifiers ------------------------------------------+-----------------------------+------------------------------------------------------------------------------ id_fact_order_item | integer | not null default nextval('fact_order_item_id_fact_order_item_seq'::regclass) src_id_order_item | integer | src_fk_order | integer | order_increment_id | character varying(50) | order_type | character varying(50) | is_instant_buy | smallint | nfe_number | integer | nfe_serie | integer | nfe_key | character varying(50) | nfe_created_at | integer | nfe_created_at_time | integer | nf_created_at_datetime | timestamp without time zone | fk_dim_city | integer | fk_dim_product | integer | fk_dim_product_bundle | integer | fk_dim_customer | integer | fk_dim_logistics_provider | integer | fk_dim_channel_last_click | integer | fk_dim_source_medium_last_click | integer | fk_dim_content_last_click | integer | fk_dim_campaign_last_click | integer | fk_dim_channel_lead | integer | fk_dim_source_medium_lead | integer | fk_dim_content_lead | integer | fk_dim_campaign_lead | integer | fk_dim_order_item_status | integer | fk_dim_payment_method | integer | fk_dim_subscription | integer | fk_dim_order_volume_status | integer | fk_dim_order_volume_micro_status | integer | fk_dim_sales_rule | integer | fk_dim_region | integer | platform | character varying(40) | created_at | integer | created_at_time | integer | created_at_datetime | timestamp without time zone | updated_at | integer | updated_at_time | integer | updated_at_datetime | timestamp without time zone | payment_confirmed_at | integer | payment_confirmed_at_time | integer | payment_confirmed_at_datetime | timestamp without time zone | cm2 | numeric(10,4) | etl_updated_at | timestamp without time zone | variations | json | Indexes: "fact_order_item_pkey" PRIMARY KEY, btree (id_fact_order_item) "fk_fact_order_item_src_id_order_item" UNIQUE, btree (src_id_order_item) "fact_order_item_fk_dim_campaign_last_click" btree (fk_dim_campaign_last_click) "fact_order_item_fk_dim_campaign_lead" btree (fk_dim_campaign_lead) "fact_order_item_fk_dim_channel_last_click" btree (fk_dim_channel_last_click) "fact_order_item_fk_dim_channel_lead" btree (fk_dim_channel_lead) "fact_order_item_fk_dim_city" btree (fk_dim_city) "fact_order_item_fk_dim_content_last_click" btree (fk_dim_content_last_click) "fact_order_item_fk_dim_content_lead" btree (fk_dim_content_lead) "fact_order_item_fk_dim_customer" btree (fk_dim_customer) "fact_order_item_fk_dim_logistics_provider" btree (fk_dim_logistics_provider) "fact_order_item_fk_dim_order_item_status" btree (fk_dim_order_item_status) "fact_order_item_fk_dim_order_volume_status" btree (fk_dim_order_volume_status) "fact_order_item_fk_dim_payment_method" btree (fk_dim_payment_method) "fact_order_item_fk_dim_product" btree (fk_dim_product) "fact_order_item_fk_dim_product_bundle" btree (fk_dim_product_bundle) "fact_order_item_fk_dim_region" btree (fk_dim_region) "fact_order_item_fk_dim_sales_rule" btree (fk_dim_sales_rule) "fact_order_item_fk_dim_source_medium_last_click" btree (fk_dim_source_medium_last_click) "fact_order_item_fk_dim_source_medium_lead" btree (fk_dim_source_medium_lead) "fact_order_item_fk_dim_subscription" btree (fk_dim_subscription) "fk_fact_order_item_created_at" btree (created_at) "fk_fact_order_item_delivered_at" btree (delivered_at) "fk_fact_order_item_nfe_number_id" btree (nfe_number) "fk_fact_order_item_order_increment_id" btree (order_increment_id) 
"fk_fact_order_item_payment_confirmed_at" btree (payment_confirmed_at) "fk_fact_order_item_ready_for_picking_at" btree (ready_for_picking_at) "fk_fact_order_item_shipped_at" btree (shipped_at) "fk_fact_order_item_src_fk_order" btree (src_fk_order) "fk_fact_order_item_updated_at" btree (updated_at)
EDIT 3
As requested, here are my database specs:
Amazon RDS SSD db.m4.xlarge, PostgreSQL 9.5.2, Memory 16 GB, 4 cores.
These settings should still be at their defaults, since I haven't tuned any of them:
cpu_tuple_cost: 0.01
random_page_cost: 4
shared_buffers: 4058056kB
work_mem: 64000kB
effective_cache_size: 8116112kB
maintenance_work_mem: 259MB
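For reference, the values actually in effect can be read straight from the server; the query below is just one way to confirm them (these are the standard GUC names, nothing specific to my setup):

-- Check what the planner is actually working with.
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('cpu_tuple_cost', 'random_page_cost', 'shared_buffers',
               'work_mem', 'effective_cache_size', 'maintenance_work_mem');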
You have two Seq Scans in that query:

- one over dim_order_item_status.is_reserved
- another over foi.fk_dim_order_item_status

Do either of those columns have an index? If not, create the indexes and VACUUM ANALYZE the tables (see the sketch after these notes). A few other remarks:

- In PostgreSQL we don't use 1 for booleans: ALTER TABLE dim_order_item_status ALTER COLUMN is_reserved SET DATA TYPE bool USING is_reserved::int::bool; (the USING clause is needed here because is_reserved is currently a smallint)
- Also, count(1) is better written as count(*).
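A minimal sketch of those commands, assuming the table and column names from the question (the index names here are made up, and the \d output above suggests fact_order_item already has the second index):

-- Index the columns the two Seq Scans are filtering and joining on.
CREATE INDEX dim_order_item_status_is_reserved_idx
    ON dim_order_item_status (is_reserved);
CREATE INDEX fact_order_item_fk_dim_order_item_status_idx
    ON evino_bi.fact_order_item (fk_dim_order_item_status);

-- Refresh statistics (and clean up dead tuples) so the planner sees the new state.
VACUUM ANALYZE dim_order_item_status;
VACUUM ANALYZE evino_bi.fact_order_item;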
As for why the second query is faster: the concatenation slows down that side of the plan, which pushes the index scan further down. My assumption is that your statistics are off for fk_dim_order_item_status = dois.id_dim_order_item_status. Normally we could verify that if you ran EXPLAIN ANALYZE rather than just EXPLAIN. Either way, if the problem persists after you create the indexes above, run VACUUM ANALYZE on those tables, or at least ANALYZE them.
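If you want to see what the planner's statistics look like for that join column, one way is to query pg_stats after refreshing them (a sketch; pg_stats simply exposes what ANALYZE has collected):

-- Refresh and then inspect the statistics for the join key.
ANALYZE evino_bi.fact_order_item;

SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE schemaname = 'evino_bi'
  AND tablename = 'fact_order_item'
  AND attname = 'fk_dim_order_item_status';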
If you do come back, we need:

- VACUUM ANALYZE run for all the queries' tables
- \d dim_order_item_status
- \d dim_order_item_foi
In the faster plan, the seq scan's row count is underestimated by a factor of 70, and the bitmap scan's row count is overestimated by a factor of about 150 (the dim_order_item_status seq scan is estimated at 1 row but returns 70; the bitmap heap scan is estimated at 208,538 rows but returns about 1,384 per loop). These two estimation errors mostly cancel out, leaving you with a good plan.
The overestimate seems to happen because the items that are is_reserved in one table are preferentially rare in the other table, and PostgreSQL's planner has no way of knowing that. There is no satisfying solution for this. I would go with what you are already doing: deliberately introducing one mis-estimate to cancel out another one that can't be fixed.
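For reference, the deliberate mis-estimate is exactly what the question's second query already does: appending an empty string to is_reserved turns the predicate into an expression the planner has no statistics for, so it falls back to a default selectivity. A sketch of that pattern (not a guaranteed-stable fix, since it relies on the planner not seeing through the expression):

SELECT foi.fk_dim_product, count(*)
FROM evino_bi.fact_order_item AS foi
JOIN dim_order_item_status AS dois
    ON dois.id_dim_order_item_status = foi.fk_dim_order_item_status
-- The concatenation hides the plain "is_reserved = 1" comparison from the
-- column statistics, which here yields the nested-loop / bitmap index plan.
WHERE dois.is_reserved || '' = '1'
GROUP BY foi.fk_dim_product;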