postgres_fdw 性能很慢

September 28, 2016

以下對外部的查詢大約需要 5 秒才能在 320 萬行上執行：

SELECT x."IncidentTypeCode", COUNT(x."IncidentTypeCode") 
FROM "IntterraNearRealTimeUnitReflexes300sForeign" x 
WHERE x."IncidentDateTime" &gt;= '05/01/2016' 
GROUP BY x."IncidentTypeCode" 
ORDER BY 1;

當我在普通表上執行相同的查詢時，它會在 0.6 秒內返回。執行計劃完全不同：

普通表

Sort  (cost=226861.20..226861.21 rows=4 width=4) (actual time=646.447..646.448 rows=7 loops=1) 
 Sort Key: "IncidentTypeCode" 
 Sort Method: quicksort  Memory: 25kB 
 -&gt; HashAggregate (cost=226861.12..226861.16 rows=4 width=4) (actual  time=646.433..646.434 rows=7 loops=1)
    Group Key: "IncidentTypeCode"
    -&gt; Bitmap Heap Scan on "IntterraNearRealTimeUnitReflexes300s" x  (cost=10597.63..223318.41 rows=708542 width=4) (actual time=74.593..342.110 rows=709376 loops=1) 
       Recheck Cond: ("IncidentDateTime" &gt;= '2016-05-01 00:00:00'::timestamp without time zone) 
       Rows Removed by Index Recheck: 12259 
       Heap Blocks: exact=27052 lossy=26888
       -&gt; Bitmap Index Scan on idx_incident_date_time_300  (cost=0.00..10420.49 rows=708542 width=0) (actual time=69.722..69.722 rows=709376 loops=1) 
          Index Cond: ("IncidentDateTime" &gt;= '2016-05-01 00:00:00'::timestamp without time zone) 

Planning time: 0.165 ms 
Execution time: 646.512 ms

外表

Sort  (cost=241132.04..241132.05 rows=4 width=4) (actual time=4782.110..4782.112 rows=7 loops=1)   
 Sort Key: "IncidentTypeCode" 
 Sort Method: quicksort  Memory: 25kB
 -&gt; HashAggregate  (cost=241131.96..241132.00 rows=4 width=4) (actual time=4782.097..4782.100 rows=7 loops=1)
    Group Key: "IncidentTypeCode"
    -&gt; Foreign Scan on "IntterraNearRealTimeUnitReflexes300sForeign" x  (cost=10697.63..237589.25 rows=708542 width=4) (actual time=1.916..4476.946 rows=709376 loops=1) 

Planning time: 1.413 ms 
Execution time: 4782.660 ms

我認為我為該GROUP BY條款付出了高昂的代價，當我EXPLAIN VERBOSE：

SELECT
   "IncidentTypeCode"
FROM
   PUBLIC ."IntterraNearRealTimeUnitReflexes300s"
WHERE
   (
       (
           "IncidentDateTime" &gt;= '2016-05-01 00:00:00' :: TIMESTAMP WITHOUT TIME ZONE
       )
   )

這將返回 700k 行。有沒有解決的辦法？

昨天我花了很多時間閱讀這個文件頁面，並認為我已經找到了設置use_remote_estimate為 true 的答案，但它沒有任何效果。

如有必要，我確實可以訪問外部伺服器來創建對象。子句中的時間戳值WHERE可以是任何值；它不是來自預定義值的列表。

如果您使用use_remote_estimate一定要執行ANALYZE外部表（我看到估計與返回的表非常接近，您可能已經這樣做了）。此外，下推改進在 <9.5 版本中不可用。我還假設您在遠端伺服器上也有相同的表結構（包括索引）。如果由於基數低而需要點陣圖，由於下推機制的限制，它不會使用索引。您可能希望減少返回的行數以強制執行 BTREE 索引掃描（時間戳範圍）。不幸的是，如果過濾器返回表中 +10% 的行（如果計劃者認為掃描整個表比查找讀取更便宜，則可能會改變這個百分比），那麼沒有乾淨的方法可以避免遠端伺服器上的 SeqScan。如果您使用的是 SSD，您可能會發現調整有用random_page_cost）。
您可以使用 CTE 來隔離 GROUP BY 行為：
WITH atable AS (
   SELECT "IncidentTypeCode"
   FROM PUBLIC ."IntterraNearRealTimeUnitReflexes300s"
   WHERE 
      ("IncidentDateTime" 
             BETWEEN '2016-05-01 00:00:00'::TIMESTAMP WITHOUT TIME ZONE 
                 AND '2016-05-02 00:00:00'::TIMESTAMP WITHOUT TIME ZONE)
)
SELECT atable."IncidentTypeCode", COUNT(atable.IncidentTypeCode) 
FROM atable
GROUP BY atable."IncidentTypeCode" 
ORDER BY atable."IncidentTypeCode";

引用自：https://dba.stackexchange.com/questions/149838

postgres_fdw 性能很慢

普通表

外表

相關問答

PostgreSQL：WHERE 子句中的 ANY (VALUES(…)) 導致速度急劇下降

我的大表的 Postgresql 查詢非常慢

當查詢有許多（超過 18 個）OR 語句時忽略 GIN 索引

Postgresql 極慢計數（帶索引，簡單查詢）

使用 GiST 索引的 Postgres LIKE 查詢與全掃描一樣慢

如何提高大銷售表中不同供應商的查詢性能？