Db2

如何在 Db2/Linux 中獲取最近 10 分鐘內消耗最多 CPU 的應用程序的 IP 地址?

  • March 18, 2022

Linux 上的 Db2 v11.1 我想獲取過去 10 分鐘內消耗最多 CPU 的前 5 個 SQL 語句以及相應的 IP 地址。

我查詢了“包記憶體”表函式,得到了 SQL 語句。這可以。

SELECT
   A.STMT_TEXT
FROM
   TABLE(MON_GET_PKG_CACHE_STMT (NULL, NULL, NULL, -2)) AS A
WHERE
   A.LAST_METRICS_UPDATE > (CURRENT_TIMESTAMP) - 10 MINUTES
ORDER BY TOTAL_CPU_TIME DESC
LIMIT 5

現在我想獲取執行這些 SQL 的 Db2 客戶端的 IP 地址。

我試圖將上面的“包記憶體”數據與儲存 IP 地址的“連接”連接起來:

SELECT
   B.APPLICATION_ID,
   A.STMT_TEXT
FROM
   TABLE(MON_GET_PKG_CACHE_STMT (NULL, NULL, NULL, -2)) AS A
   LEFT OUTER JOIN TABLE(MON_GET_CONNECTION(CAST(NULL AS BIGINT), -2)) AS B
      ON A.EXECUTABLE_ID = B.LAST_EXECUTABLE_ID
WHERE
   A.LAST_METRICS_UPDATE > (CURRENT_TIMESTAMP) - 10 MINUTES
ORDER BY TOTAL_CPU_TIME DESC
LIMIT 5

在大多數情況下,APPLICATION_ID 欄位返回 null。這是有道理的,因為“連接”表函式包含有關目前連接的應用程序的資訊,但是當我每 10 分鐘執行一次上述 select 語句時,大多數應用程序已經與數據庫斷開連接,因此連接資訊失去了。

**問題:**如何獲取最近 10 分鐘內消耗 CPU 最多的 SQL 的 IP 地址?

就像 mustaccio 所說的那樣,您可能應該看看事件監視器。也就是說,另一種方法是將快照放入臨時表,等待一段時間,然後將新快照與臨時表進行比較。下面是一個 sh 腳本範例,您可以根據自己的需要進行調整:

#!/bin/sh

OPTS=`getopt d:t: "$@"`
eval set -- "$OPTS"

tm=10
while true ; do
   case "$1" in
       -d) db="$2"; shift 2;;
       -t) tm="$2"; shift 2;;
       --) shift; break;;
   esac
done

db2 connect to $db
if [ $? -ne 0 ]; then
   echo "Unable to connect to $db"
   exit 1
fi

db2 +c "create view stmt_metrics
 (executable_id, num_executions, rows_read, rows_modified, stmt_text)
as select executable_id, num_executions, rows_read, rows_modified, stmt_text
  from TABLE (MON_GET_PKG_CACHE_STMT('d', null, null, -1))"

db2 +c "create global temporary table stmt_samples as (
   select * from stmt_metrics) 
definition only on commit delete rows"

db2 +c "create view stmts_delta (executable_id, num_executions, rows_read, rows_modified, stmt_text)
as
select t2.executable_id
    , t2.num_executions - t1.num_executions as num_executions
    , t2.rows_read - t1.rows_read as rows_read
    , t2.rows_modified - t1.rows_modified as rows_modified
    , t2.stmt_text
from stmt_metrics as t2
  , stmt_samples as t1
where t2.executable_id = t1.executable_id"

db2 +c "insert into stmt_samples select * from stmt_metrics"

sleep $tm

db2 -x +c "select * from stmts_delta order by rows_modified desc fetch first 10 rows only"


db2 rollback

以下是活動事件監視器的範例:

db2 connect to <db>
db2 "alter service class sysdefaultsubclass under
    sysdefaultuserclass collect activity data on all database
    partitions with details and values"

db2 "create event monitor act_stmt for activities write to table manualstart"

db2 "set event monitor act_stmt state = 1"

您將創建許多表,我相信您最感興趣的是 ACTIVITYMETRICS_ACT_STMT。

您可以執行您的工作負載,然後查詢該表(例如,您可能感興趣的許多其他指標):

db2 "select APPL_ID, sum(TOTAL_CPU_TIME) as SUM_TOTAL_CPU_TIME
    from ACTIVITYMETRICS_ACT_STMT 
    where EVENT_TIMESTAMP > current_timestamp - 10 minutes      
    group by APPL_ID"

APPL_ID                            SUM_TOTAL_CPU_TIME  
*LOCAL.db2inst1.220316094506                       2027
*LOCAL.db2inst1.220316094705                        162
13x.23x.82.116.58680.220316094933                    27
13x.23x.82.116.58788.220316095429                    22
13x.23x.82.116.58790.220316095430                 17367
13x.23x.82.116.58794.220316095447                   429

6 record(s) selected.

我用 x 替換了兩位數。如果要隔離IP:

db2 "select REGEXP_SUBSTR(APPL_ID, '^\d+\.\d+\.\d+\.\d+', 1, 1), sum(TOTAL_CPU_TIME) as SUM_TOTAL_CPU_TIME 
    from ACTIVITYMETRICS_ACT_STMT 
    where EVENT_TIMESTAMP > current_timestamp - 10 minutes      
    group by REGEXP_SUBSTR(APPL_ID, '^\d+\.\d+\.\d+\.\d+', 1, 1)"

1                             SUM_TOTAL_CPU_TIME  
13x.23x.82.116                                  17845
-                                               22640

完成後記得停止顯示器

將使用其文本和 CPU 消耗到應用程序的語句“綁定”執行的唯一方法是活動的事件監視器。

但是為數據庫中的所有應用程序收集此類資訊可能非常昂貴 - 預設情況下,將大量數據收集到事件監視器表中。

建議是使用以下技術(非分區數據庫 env 的範例)盡可能限制收集的數據量,這會顯著限制使用為每個邏輯數據組指定的精確列集收集的數據。

CREATE TABLE ACTIVITYSTMT_ACT_LIM  
(
   ACTIVITY_ID BIGINT NOT NULL , 
   APPL_ID VARCHAR(64) NOT NULL , 
   UOW_ID INTEGER NOT NULL ,  
   STMT_TEXT CLOB(2M) INLINE LENGTH 2681 NOT LOGGED COMPACT 
)   
IN SYSTOOLSPACE;

CREATE TABLE ACTIVITY_ACT_LIM  
(
   ACTIVITY_ID BIGINT NOT NULL , 
   APPL_ID VARCHAR(64) NOT NULL , 
   UOW_ID INTEGER NOT NULL,
   ADDRESS VARCHAR(128) NOT NULL ,
   TIME_COMPLETED TIMESTAMP NOT NULL , 
   TIME_STARTED TIMESTAMP NOT NULL   
)   
IN SYSTOOLSPACE;

CREATE TABLE ACTIVITYMETRICS_ACT_LIM
(
   ACTIVITY_ID BIGINT NOT NULL ,
   APPL_ID VARCHAR(64 OCTETS) NOT NULL ,
   UOW_ID INTEGER NOT NULL ,
   TOTAL_CPU_TIME BIGINT NOT NULL
)   
IN SYSTOOLSPACE;

CREATE EVENT MONITOR ACT_LIM
      FOR ACTIVITIES
      WRITE TO TABLE
        ACTIVITY        (TABLE ACTIVITY_ACT_LIM ),
        ACTIVITYSTMT    (TABLE ACTIVITYSTMT_ACT_LIM ),
        ACTIVITYMETRICS (TABLE ACTIVITYMETRICS_ACT_LIM )
MANUALSTART;

-- activate activity data collection at the default user workload level
ALTER WORKLOAD SYSDEFAULTUSERWORKLOAD COLLECT ACTIVITY DATA ON COORDINATOR WITH DETAILS;

-- switch the event monitor on
SET EVENT MONITOR ACT_LIM STATE 1;
-- wait some time

-- switch the event monitor off
SET EVENT MONITOR ACT_LIM STATE 0;
-- deactivate activity data collection at the default user workload level
ALTER WORKLOAD SYSDEFAULTUSERWORKLOAD COLLECT ACTIVITY DATA NONE;

-- Analysis
SELECT 
 A.ADDRESS
, M.TOTAL_CPU_TIME
, A.APPL_ID, A.UOW_ID, A.ACTIVITY_ID
, VARCHAR(S.STMT_TEXT, 256) stmt_text
FROM ACTIVITY_ACT_LIM A
JOIN ACTIVITYSTMT_ACT_LIM S 
   ON (S.APPL_ID, S.UOW_ID, S.ACTIVITY_ID) = (A.APPL_ID, A.UOW_ID, A.ACTIVITY_ID) 
JOIN ACTIVITYMETRICS_ACT_LIM M
   ON (M.APPL_ID, M.UOW_ID, M.ACTIVITY_ID) = (A.APPL_ID, A.UOW_ID, A.ACTIVITY_ID)
--WHERE A.TIME_STARTED BETWEEN ... AND ...
--ORDER BY A.TIME_STARTED DESC
;

引用自:https://dba.stackexchange.com/questions/308732