如何針對非常高頻的更新優化表?
我有一個表,其中包含需要定期執行的任務列表:
applaudience=> \d+ maintenance_task Table "public.maintenance_task" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description ------------------------------------+--------------------------+-----------+----------+----------------------------------------------+----------+--------------+------------- id | integer | | not null | nextval('maintenance_task_id_seq'::regclass) | plain | | nid | citext | | not null | | extended | | execution_interval | interval | | not null | | plain | | last_attempted_at | timestamp with time zone | | | now() | plain | | last_maintenance_task_execution_id | integer | | | | plain | | disabled_at | timestamp with time zone | | | | plain | | maximum_execution_duration | interval | | not null | '00:05:00'::interval | plain | | maximum_concurrent_execution_count | integer | | not null | 0 | plain | | last_exhausted_at | timestamp with time zone | | not null | now() | plain | | Indexes: "maintenance_task_pkey" PRIMARY KEY, btree (id) "maintenance_task_name_idx" UNIQUE, btree (nid) Foreign-key constraints: "maintenance_task_last_maintenance_task_execution_id_fkey" FOREIGN KEY (last_maintenance_task_execution_id) REFERENCES maintenance_task_execution(id) ON DELETE SET NULL Referenced by: TABLE "maintenance_task_execution" CONSTRAINT "maintenance_task_execution_maintenance_task_id_fkey" FOREIGN KEY (maintenance_task_id) REFERENCES maintenance_task(id) ON DELETE CASCADE Options: autovacuum_vacuum_threshold=0, autovacuum_analyze_threshold=0, fillfactor=50
每次選擇要執行的任務時,我們都會更新 的值
last_attempted_at
。以下查詢用於安排新任務:CREATE OR REPLACE FUNCTION schedule_maintenance_task() RETURNS table(maintenance_task_id int) AS $$ BEGIN RETURN QUERY EXECUTE $q$ UPDATE maintenance_task SET last_attempted_at = now() WHERE id = ( WITH active_maintenance_task_execution_count AS ( SELECT DISTINCT ON (maintenance_task_id) maintenance_task_id, execution_count FROM ( SELECT id maintenance_task_id, 0 execution_count FROM maintenance_task UNION SELECT mte1.maintenance_task_id, count(*) execution_count FROM maintenance_task_execution mte1 WHERE mte1.ended_at IS NULL GROUP BY mte1.maintenance_task_id ) AS t ORDER BY maintenance_task_id, execution_count DESC ) SELECT mt1.id FROM maintenance_task mt1 INNER JOIN active_maintenance_task_execution_count amtec1 ON amtec1.maintenance_task_id = mt1.id WHERE mt1.disabled_at IS NULL AND mt1.maximum_concurrent_execution_count >= amtec1.execution_count AND ( mt1.last_attempted_at < now() - mt1.execution_interval OR mt1.last_exhausted_at < now() - mt1.execution_interval ) ORDER BY mt1.last_attempted_at ASC LIMIT 1 FOR UPDATE OF mt1 SKIP LOCKED ) RETURNING id $q$; END $$ LANGUAGE plpgsql SET work_mem='50MB';
schedule_maintenance_task
查詢正在以大約 600/分鐘的速度執行。大約 24 小時後問題開始出現:
applaudience=> EXPLAIN (analyze, buffers) applaudience-> SELECT id applaudience-> FROM maintenance_task; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Seq Scan on maintenance_task (cost=0.00..7715.86 rows=286886 width=4) (actual time=3.675..385.042 rows=31 loops=1) Buffers: shared hit=9455 Planning time: 0.236 ms Execution time: 385.060 ms (4 rows) applaudience=> SELECT * applaudience-> FROM pg_stat_all_tables applaudience-> WHERE schemaname = 'public' AND relname = 'maintenance_task'; relid | schemaname | relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | vacuum_count | autovacuum_count | analyze_count | autoanalyze_count ----------+------------+------------------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+-------------+-------------------------------+--------------+-------------------------------+--------------+------------------+---------------+------------------- 22903432 | public | maintenance_task | 163230 | 5060130 | 5571795 | 7988441 | 0 | 185359 | 0 | 172989 | 148568 | 138285 | 9733 | | 2018-12-09 11:00:33.978177+00 | | 2018-12-09 10:01:07.945327+00 | 0 | 6922 | 0 | 1416 (1 row)
死元組的數量增長到 100k+。一個簡單的 seq 掃描需要讀取 9k+ 緩衝區來獲取 31 行。
這是一個
VACUUM VERBOSE maintenance_task
日誌:INFO: vacuuming "public.maintenance_task" INFO: index "maintenance_task_pkey" now contains 9555 row versions in 331 pages DETAIL: 0 index row versions were removed. 282 index pages have been deleted, 282 are currently reusable. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. INFO: index "maintenance_task_name_idx" now contains 9555 row versions in 787 pages DETAIL: 0 index row versions were removed. 690 index pages have been deleted, 690 are currently reusable. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. INFO: "maintenance_task": found 0 removable, 145247 nonremovable row versions in 2459 out of 4847 pages DETAIL: 145217 dead row versions cannot be removed yet, oldest xmin: 928967630 There were 180 unused item pointers. Skipped 1 page due to buffer pins, 2387 frozen pages. 0 pages are entirely empty. CPU: user: 0.05 s, system: 0.00 s, elapsed: 0.34 s. INFO: vacuuming "pg_toast.pg_toast_22903432" INFO: index "pg_toast_22903432_index" now contains 0 row versions in 1 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. INFO: "pg_toast_22903432": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 928967630 There were 0 unused item pointers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. VACUUM
可以做些什麼來防止死元組數量的增長/調度查詢的速度變慢?
old_snapshot_threshold
Postgres 9.6 中針對此類情況添加了該設置。不允許長時間打開的查詢無限期地阻止真空。如果查詢從不需要可能已被清理掉的數據,它將正常完成。如果發現它確實需要這樣的數據,它會拋出一個錯誤。如果它是被遺忘的連接的一部分,它會無限期地掛起,但不會導致表膨脹。但是請注意,此設置可防止
autovacuum
在關係結束時將釋放的空間返回給作業系統,因為這是檢測錯誤條件所必需的。只有手冊VACUUM FULL
仍然會強制它。因此,雖然它可以幫助對抗一種類型的表格膨脹(正是您的問題),但它可能導致另一種(通常不太重要)。或者考慮
idle_in_transaction_session_timeout
,它會終止空閒時間過長的會話。明智地選擇您的設置。
我最初的評估是錯誤的。
感謝我在 Freenode 上獲得的幫助,我能夠了解根本原因並解決了不斷膨脹的表格問題。
我需要糾正自己的第一件事是對
VACUUM
工作原理的主要理解。VACUUM
無法從不在與表關聯的磁碟關係文件末尾的緩衝區中回收磁碟空間。但是,VACUUM
可以在與表關聯的磁碟關係文件的末尾重新組織緩衝區,即如果有許多更新並且VACUUM
在創建新緩衝區之前正在執行,那麼新行將就地儲存在同一個緩衝區中刪除的行數和新緩衝區的數量不會增加。為了
VACUUM
能夠將空間返回給作業系統,需要滿足以下條件:
- 表格末尾的一個或多個頁面完全空閒
- 可以輕鬆獲得排他表鎖
- 死行不再與任何現有事務相關
我的緩衝區不斷增長的事實表明其中一個條件沒有得到滿足。
因此,首先要檢查的是最古老的實時交易:
applaudience=> SELECT age(backend_xmin), (now() - xact_start), query applaudience-> FROM pg_stat_activity applaudience-> WHERE backend_xmin IS NOT NULL applaudience-> ORDER BY age(backend_xmin) DESC applaudience-> LIMIT 1; age | ?column? | query ---------+-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 4644352 | 12:31:38.895198 | -- Metabase + | | SELECT "public"."http_response"."body" AS "body" FROM "public"."http_response" GROUP BY "public"."http_response"."body" ORDER BY "public"."http_response"."body" ASC LIMIT 5000 (1 row)
事實證明,在我的情況下,有一個執行時間很長的查詢阻止 VACUUM 從緩衝區中刪除死行。這也可以通過查看
vacuum verbose
日誌來推測:DETAIL: 145217 dead row versions cannot be removed yet, oldest xmin: 928967630
日誌中的此條目指向頻率極高的更新或阻止清理的長時間執行的事務。
為了解決這個問題,我不得不:
- 終止長時間執行的事務。
- 執行
VACUUM FULL maintenance_task
一次。- 確保沒有長時間執行的事務阻塞 VACUUM 程序。
老答案:
似乎沒有辦法在不引起膨脹的情況下更新元組。
VACUUM FULL
根據我讀過的所有內容,執行常式或等效的表重寫變體似乎是不可避免的。
VACUUM
當由於大量更新或刪除活動而導致表包含大量死行版本時,純文字可能無法令人滿意。如果您有這樣的表,並且需要回收它佔用的多餘磁碟空間VACUUM FULL
,則需要使用. 這些命令重寫表的全新副本並為其建構新索引。所有這些選項都需要排他鎖。請注意,它們還臨時使用大約等於表大小的額外磁碟空間,因為表和索引的舊副本在新副本完成之前無法釋放。CLUSTER``ALTER TABLE
https://www.postgresql.org/docs/current/routine-vacuuming.html
假設我對上述內容是正確的,那麼解決方案是盡量減少全表重寫的影響。在這個程度上,我發現了
pg_repack
。pg_repack
– 用最少的鎖重新組織 PostgreSQL 數據庫中的表。由於
maintenance_task
表很小(少於 50 行),我應該能夠pg_repack
每小時執行一次,而對調度工作人員的影響最小。不幸的是,當表包含大量定期更新的行時,此解決方案效果不佳。