PostgreSQL RDS: slower-than-expected performance when a large table/index is not in memory
I have a fairly large table (43 GB, with a 14 GB index) containing time-series cost data. I query by date and aggregate the amounts. When the data is not already in memory (OS cache or Postgres), the query can take up to 50 seconds for users who have millions of rows in a given time period, even though the other filters in the query usually cut that down to a few thousand rows. As far as I can tell, the index is well matched to the query pattern. I have run EXPLAIN (ANALYZE, BUFFERS) and can clearly see that the slowdown comes from reading off disk.
My workload is a bit unusual in that I do large batch writes and large batch deletes, so I assume VACUUM has a lot of work to do. I have not tuned it at all, though I actually suspect tuning it would not help given the index I use. I track "active" imports and then delete old, inactive imports. The active import IDs are part of both the query and the index, so I should not be scanning dead tuples left over from earlier imports.
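For reference, a quick way to sanity-check that assumption is the dead-tuple counters in the standard pg_stat_user_tables view; a minimal sketch (the percentage column is just for readability):

-- Dead-tuple buildup and recent (auto)vacuum activity for the table.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
       last_vacuum,
       last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'service_costs';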
I also tried lowering random_page_cost to 1.1, which gets the planner to use an index scan instead of a bitmap heap scan, but performance ends up roughly the same.
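For reference, that experiment can be done per session without touching the instance parameter group; a sketch (the query body is elided the same way as below):

-- Lower the planner's estimated cost of a random page read (default 4.0),
-- which nudges it toward a plain index scan, then re-check the plan.
SET random_page_cost = 1.1;
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- same aggregate query as shown further down
RESET random_page_cost;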
I am running Postgres 13.4 on an RDS db.r6g.2xlarge (8 vCPU, 64 GB of RAM) with provisioned IOPS (11,000; I typically peak at around 9,000 reads + writes combined).
I did not expect uncached queries to be this slow. Are my expectations simply wrong here? I have already raised shared_buffers to 40% of RAM, and given the table and index sizes I realize I should probably move up to the 128 GB instance class, which will be my next step.
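One way to quantify how much of the table is actually served from shared_buffers over time is the standard pg_statio_user_tables view; a minimal sketch (note that a "read" here may still be satisfied by the OS page cache rather than real disk I/O):

-- Shared-buffer hit ratio for the heap blocks of this table.
SELECT relname,
       heap_blks_hit,
       heap_blks_read,
       round(100.0 * heap_blks_hit / NULLIF(heap_blks_hit + heap_blks_read, 0), 1) AS hit_pct
FROM pg_statio_user_tables
WHERE relname = 'service_costs';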
Table "public.service_costs" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description -------------------------+--------------------------------+-----------+----------+--------------------------+----------+--------------+------------- id | uuid | | not null | public.gen_random_uuid() | plain | | date | timestamp without time zone | | not null | | plain | | cost_type | character varying | | not null | | extended | | service | character varying | | not null | | extended | | amount | numeric | | not null | | main | | cost_category | character varying | | | | extended | | cost_sub_category | character varying | | | | extended | | service_costs_import_id | bigint | | not null | | plain | | Indexes: "service_costs_pkey" PRIMARY KEY, btree (id) "indx_srvc_csts_on_cst_type__dt__srvc__cst_ctgry__cst_sb_ctgry" btree (service_costs_import_id, cost_type, date, service, cost_category, cost_sub_category) Access method: heap
…
 schema_name |                            relname                             | size  | table_size
-------------+----------------------------------------------------------------+-------+-------------
 public      | service_costs                                                  | 43 GB | 46511259648
 public      | indx_srvc_csts_on_cst_type__dt__srvc__cst_ctgry__cst_sb_ctgry | 14 GB | 15080833024
…
EXPLAIN (ANALYZE, BUFFERS)
SELECT SUM(service_costs.amount)
FROM service_costs
WHERE service_costs.cost_type IN (...)
  AND service_costs.service_costs_import_id IN (2066, 2067, 1267, 1269, 1268, 1270, 2068, 1273, 4996, 5047)
  AND service_costs.service = '....'
  AND "service_costs"."date" BETWEEN '2021-10-01' AND '2021-10-31 23:59:59.999999';
…
 Aggregate  (cost=1390974.93..1390974.94 rows=1 width=32) (actual time=17067.830..17067.831 rows=1 loops=1)
   Buffers: shared hit=6854 read=80448 dirtied=754
   I/O Timings: read=16236.006
   ->  Bitmap Heap Scan on service_costs  (cost=351286.12..1390173.71 rows=320487 width=5) (actual time=4827.074..16996.060 rows=323382 loops=1)
         Recheck Cond: ((service_costs_import_id = ANY ('{2066,2067,1267,1269,1268,1270,2068,1273,4996,5047}'::bigint[])) AND ((cost_type)::text = ANY ('{...}'::text[])) AND (date >= '2021-10-01 00:00:00'::timestamp without time zone) AND (date <= '2021-10-31 23:59:59.999999'::timestamp without time zone) AND ((service)::text = '...'::text))
         Heap Blocks: exact=70327
         Buffers: shared hit=6854 read=80448 dirtied=754
         I/O Timings: read=16236.006
         ->  Bitmap Index Scan on indx_srvc_csts_on_cst_type__dt__srvc__cst_ctgry__cst_sb_ctgry  (cost=0.00..351206.00 rows=320487 width=0) (actual time=4815.759..4815.759 rows=323382 loops=1)
               Index Cond: ((service_costs_import_id = ANY ('{2066,2067,1267,1269,1268,1270,2068,1273,4996,5047}'::bigint[])) AND ((cost_type)::text = ANY ('{...}'::text[])) AND (date >= '2021-10-01 00:00:00'::timestamp without time zone) AND (date <= '2021-10-31 23:59:59.999999'::timestamp without time zone) AND ((service)::text = '...'::text))
               Buffers: shared hit=159 read=16816
               I/O Timings: read=4575.310
 Planning Time: 0.159 ms
 Execution Time: 17067.865 ms
(14 rows)
…
 Aggregate  (cost=1390974.93..1390974.94 rows=1 width=32) (actual time=403.002..403.003 rows=1 loops=1)
   Buffers: shared hit=87302
   ->  Bitmap Heap Scan on service_costs  (cost=351286.12..1390173.71 rows=320487 width=5) (actual time=206.128..338.491 rows=323382 loops=1)
         Recheck Cond: ((service_costs_import_id = ANY ('{2066,2067,1267,1269,1268,1270,2068,1273,4996,5047}'::bigint[])) AND ((cost_type)::text = ANY ('{....}'::text[])) AND (date >= '2021-10-01 00:00:00'::timestamp without time zone) AND (date <= '2021-10-31 23:59:59.999999'::timestamp without time zone) AND ((service)::text = '...'::text))
         Heap Blocks: exact=70327
         Buffers: shared hit=87302
         ->  Bitmap Index Scan on indx_srvc_csts_on_cst_type__dt__srvc__cst_ctgry__cst_sb_ctgry  (cost=0.00..351206.00 rows=320487 width=0) (actual time=195.167..195.167 rows=323382 loops=1)
               Index Cond: ((service_costs_import_id = ANY ('{...}'::bigint[])) AND ((cost_type)::text = ANY ('{...}'::text[])) AND (date >= '2021-10-01 00:00:00'::timestamp without time zone) AND (date <= '2021-10-31 23:59:59.999999'::timestamp without time zone) AND ((service)::text = '....'::text))
               Buffers: shared hit=16975
 Planning Time: 0.168 ms
 Execution Time: 403.042 ms
(11 rows)
Index Cond: ((service_costs_import_id = ANY ('{2066,2067,1267,1269,1268,1270,2068,1273,4996,5047}'::bigint[])) AND ((cost_type)::text = ANY ('{...}'::text[])) AND (date >= '2021-10-01 00:00:00'::timestamp without time zone) AND (date <= '2021-10-31 23:59:59.999999'::timestamp without time zone) AND ((service)::text = '...'::text))
It looks like your index is on
(service_costs_import_id, cost_type, date, service)
although it may have more columns after that which simply are not used in this query. If those are the columns in that order, the problem with this index is that "service" cannot be used efficiently, because it follows the "date" column, which is used for a range rather than an equality. So "service" can only be used to filter rows after the fact, not to jump to specific spots in the index. If you reverse the order of the last two columns, the index will be able to use all of these columns efficiently. Better yet, if you reverse the order and then add "amount" to the end, you can get index-only scans. But getting those to work well may require a higher level of vacuuming, since index-only scans depend on an up-to-date visibility map.
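Concretely, that suggestion would look something like the sketch below (the index name is just illustrative; CONCURRENTLY avoids blocking your batch writes, though it cannot run inside a transaction block):

-- Equality columns first, the range column (date) after them, and
-- amount appended so SUM(amount) can be answered without heap visits.
CREATE INDEX CONCURRENTLY idx_srvc_csts_import_type_srvc_date_amount
    ON service_costs (service_costs_import_id, cost_type, service, date, amount);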
With 11,000 provisioned IOPS, reading 80,000 blocks should not take 16,000 ms. But that ignores latency, which I have not found described anywhere except in the vaguest terms in the AWS documentation. (16,236 ms over 80,448 reads works out to about 0.2 ms per read, which looks like serial per-request latency rather than an IOPS ceiling.) If you have to wait for one block to come back before sending the request for the next one, you will never be able to reach the full provisioned IOPS. You could raise effective_io_concurrency to see whether keeping several requests in flight at once improves things. (It will not improve the bitmap index scan part, only the bitmap heap scan.) Of course, this analysis assumes only one query is running at a time; if several queries have to share the throughput, it has to be divided among them.
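A sketch of that experiment (the value 200 is just a plausible starting point for SSD-class storage, not a recommendation; it can be set per session):

-- Allow the bitmap heap scan to keep several prefetch requests in flight
-- (the default of 1 means at most one outstanding read at a time).
SET effective_io_concurrency = 200;
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- re-run the slow query and compare I/O Timings
RESET effective_io_concurrency;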