Hibernate

為什麼這個快速的 postgres 查詢會出現在慢查詢日誌中?

  • July 30, 2015

在調查我們的 Postgres 日誌中的條目時,我注意到報告的持續時間和查詢速度之間存在很大差異。

2015-07-28 17:27:26 UTC [dms_segment_3] LOG:  duration: 2924.763 ms  bind <unnamed>: SELECT  prospect.id AS prospect_id,  0,  am.id, am.customer_id, 0 FROM xxxxx.audience_member am LEFT OUTER JOIN xxxxx.campaign campaign     ON campaign.id = $1 LEFT OUTER JOIN xxxxx.end_user prospect     ON campaign.id=prospect.campaign_id        AND prospect.email_id=am.customer_id        AND prospect.end_user_type != 1 WHERE am.audience_id = $2  ORDER BY am.id limit $3
2015-07-28 17:27:26 UTC [dms_segment_3] DETAIL:  parameters: $1 = '4000013578869', $2 = '4000013744916', $3 = '500'

對該查詢執行解釋會產生更快的速度:

explain analyze
SELECT  prospect.id AS prospect_id
,  0,  am.id, am.customer_id, 0
FROM xxxxx.audience_member am
LEFT OUTER JOIN xxxxx.campaign campaign
    ON campaign.id = 4000013578869 
LEFT OUTER JOIN xxxxx.end_user prospect
    ON campaign.id = prospect.campaign_id
   AND prospect.email_id = am.customer_id
   AND prospect.end_user_type != 1
WHERE am.audience_id = 4000013744916
ORDER BY am.id
limit 500;
                                                                                    QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------
Limit  (cost=326.61..326.67 rows=26 width=24) (actual time=0.500..0.517 rows=4 loops=1)
  ->  Sort  (cost=326.61..326.67 rows=26 width=24) (actual time=0.491..0.498 rows=4 loops=1)
        Sort Key: am.id
        Sort Method: quicksort  Memory: 25kB
        ->  Nested Loop Left Join  (cost=0.00..326.00 rows=26 width=24) (actual time=0.260..0.402 rows=4 loops=1)
              Join Filter: (campaign.id = prospect.campaign_id)
              ->  Nested Loop Left Join  (cost=0.00..107.25 rows=26 width=24) (actual time=0.237..0.346 rows=4 loops=1)
                    ->  Index Scan using idx_audience_member_audience_id on audience_member am  (cost=0.00..99.62 rows=26 width=16) (actual time=0.062.
.0.071 rows=4 loops=1)
                          Index Cond: (audience_id = 4000013744916::bigint)
                    ->  Materialize  (cost=0.00..7.30 rows=1 width=8) (actual time=0.042..0.060 rows=1 loops=4)
                          ->  Seq Scan on campaign  (cost=0.00..7.30 rows=1 width=8) (actual time=0.154..0.219 rows=1 loops=1)
                                Filter: (id = 4000013578869::bigint)
                                Rows Removed by Filter: 23
              ->  Index Scan using idx_enduser_emailaddress on end_user prospect  (cost=0.00..8.40 rows=1 width=24) (actual time=0.006..0.006 rows=0 loops=4
)
                    Index Cond: (email_id = am.customer_id)
                    Filter: ((end_user_type <> 1) AND (campaign_id = 4000013578869::bigint))
Total runtime: 0.701 ms
(17 rows)

有什麼我想念的嗎?什麼可以解釋日誌與“真實”查詢持續時間之間的 2+ 秒差距?

我們正在使用 Postgres 9.2。在生產中,我們在 Tomcat 6 應用程序中混合使用 Hibernate 查詢和本機 SQL。

更新:

jpmc26 的鷹眼發現參數在日誌中作為字元串傳遞,但在解釋中沒有,所以我再次執行它:

explain analyze SELECT  prospect.id AS prospect_id,  0,  am.id, am.customer_id, 0 FROM xxxxx.audience_member am LEFT OUTER JOIN xxxxx.campaign campaign     ON campaign.id = '4000013578869' LEFT OUTER JOIN xxxxx.end_user prospect     ON campaign.id=prospect.campaign_id        AND prospect.email_id=am.customer_id        AND prospect.end_user_type != 1 WHERE am.audience_id = '4000013744916'  ORDER BY am.id limit '500';
                                                                                    QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------
Limit  (cost=326.62..326.68 rows=26 width=24) (actual time=0.168..0.186 rows=4 loops=1)
  ->  Sort  (cost=326.62..326.68 rows=26 width=24) (actual time=0.164..0.171 rows=4 loops=1)
        Sort Key: am.id
        Sort Method: quicksort  Memory: 25kB
        ->  Nested Loop Left Join  (cost=0.00..326.01 rows=26 width=24) (actual time=0.065..0.145 rows=4 loops=1)
              Join Filter: (campaign.id = prospect.campaign_id)
              ->  Nested Loop Left Join  (cost=0.00..107.25 rows=26 width=24) (actual time=0.051..0.102 rows=4 loops=1)
                    ->  Index Scan using idx_audience_member_audience_id on audience_member am  (cost=0.00..99.62 rows=26 width=16) (actual time=0.017.
.0.025 rows=4 loops=1)
                          Index Cond: (audience_id = 4000013744916::bigint)
                    ->  Materialize  (cost=0.00..7.30 rows=1 width=8) (actual time=0.009..0.013 rows=1 loops=4)
                          ->  Seq Scan on campaign  (cost=0.00..7.30 rows=1 width=8) (actual time=0.023..0.031 rows=1 loops=1)
                                Filter: (id = 4000013578869::bigint)
                                Rows Removed by Filter: 23
              ->  Index Scan using idx_enduser_emailaddress on end_user prospect  (cost=0.00..8.40 rows=1 width=24) (actual time=0.005..0.005 rows=0 loops=4
)
                    Index Cond: (email_id = am.customer_id)
                    Filter: ((end_user_type <> 1) AND (campaign_id = 4000013578869::bigint))
Total runtime: 0.259 ms
(17 rows)

這一次,將查詢放在記憶體中的好處是顯而易見的,但除此之外什麼都沒有改變。

您在日誌中看到的是準備好的語句執行。使用準備好的語句是從應用程序層與數據庫互動的常用方法。

例如,使用 Hibernate,可以寫出這樣的東西(希望下面的程式碼片段是有效的,沒有測試過):

String sql = "SELECT first_name, last_name FROM customer WHERE email = :customer_email";
...
query.setParameter("customer_email", Customer.email);

其中,這是避免SQL 注入的好方法- 與使用連接建構完整的查詢文本(包括參數)不同。

也可以從 PostgreSQL 客戶端執行此操作:

PREPARE fetch_customer (text) AS
   SELECT first_name, last_name FROM customer WHERE email = $1;

[EXPLAIN ANALYZE] EXECUTE fetch_customer ('john@gmail.com');

另一方面,這就是您所經歷的,準備好的查詢可能會導致性能下降。準備好的語句不能從知道傳遞給它的值是什麼中獲益——僅僅是因為它們在準備時是未知的。因此,它必須選擇一個足夠通用的查詢計劃,以便在可接受的時間內獲得任何可能的結果。

例如,假設您有一個部分索引,例如

CREATE INDEX ON xxxx.campaign (campaign_id) WHERE campaign.id > 4000000000000;

(這在現實生活中沒有多大意義,但沒關係)。

當規劃器知道它有 4000013578869 ascampaign_id時,它可以選擇使用這個索引,可能會顯著減少要獲取的數據頁的數量。但是,準備好的聲明不能做到這一點,因為一些(大多數?)可能的活動將被排除在外。

引用自:https://dba.stackexchange.com/questions/108482