Mysql

CPU 高使用率使我們的伺服器崩潰

  • February 4, 2022

執行我的數據庫的伺服器正遭受 CPU 峰值的困擾。我們無法確定導致這些 CPU 峰值的原因,以及如何緩解它們。

我試過添加一些索引,但也許我忘了一兩個。

如何檢查哪個表有問題?

每天一次,我有一個巨大的 CPU 跳躍到 700%。迄今為止,我們一直在通過重新啟動伺服器來解決它。

我可以提供必要的資訊來查找問題,但我不知道需要什麼資訊。

這是 MySQLTuner 報告:

>>  MySQLTuner 1.6.18 - Major Hayden <major@mhtx.net>
>>  Bug reports, feature requests, and downloads at http://mysqltuner.com/
>>  Run with '--help' for additional options and output filtering

[[0;34m--[0m] Skipped version check for MySQLTuner script
[[0;32mOK[0m] Logged in using credentials from debian maintenance account.
[[0;32mOK[0m] Currently running supported MySQL version 10.1.47-MariaDB-0+deb9u1
[[0;32mOK[0m] Operating on 64-bit architecture

-------- Storage Engine Statistics -----------------------------------------------------------------
[[0;34m--[0m] Status: [0;32m+Aria [0m[0;32m+CSV [0m[0;32m+InnoDB [0m[0;32m+MEMORY [0m[0;32m+MRG_MyISAM [0m[0;32m+MyISAM [0m[0;32m+PERFORMANCE_SCHEMA [0m[0;32m+SEQUENCE [0m
[[0;34m--[0m] Data in InnoDB tables: 1G (Tables: 387)
[[0;34m--[0m] Data in MyISAM tables: 1K (Tables: 1)
[[0;32mOK[0m] Total fragmented tables: 0

-------- Security Recommendations ------------------------------------------------------------------
[[0;32mOK[0m] There are no anonymous accounts for any database users
[[0;32mOK[0m] All database users have passwords assigned
[[0;31m!![0m] User 'kacper@%' hasn't specific host restriction.
[[0;34m--[0m] There are 612 basic passwords in the list.

-------- CVE Security Recommendations --------------------------------------------------------------
[[0;32mOK[0m] NO SECURITY CVE FOUND FOR YOUR VERSION

-------- Performance Metrics -----------------------------------------------------------------------
[[0;34m--[0m] Up for: 2h 17m 38s (1M q [236.530 qps], 53K conn, TX: 1G, RX: 285M)
[[0;34m--[0m] Reads / Writes: 71% / 29%
[[0;34m--[0m] Binary logging is disabled
[[0;34m--[0m] Physical Memory     : 62.8G
[[0;34m--[0m] Max MySQL memory    : 12.6G
[[0;34m--[0m] Other process memory: 209.7M
[[0;34m--[0m] Total buffers: 328.0M global + 2.8M per thread (4096 max threads)
[[0;34m--[0m] P_S Max memory usage: 1G
[[0;34m--[0m] Galera GCache Max memory usage: 0B
[[0;32mOK[0m] Maximum reached memory usage: 7.1G (11.38% of installed RAM)
[[0;32mOK[0m] Maximum possible memory usage: 12.6G (20.10% of installed RAM)
[[0;32mOK[0m] Overall possible memory usage with other process is compatible with memory available
[[0;32mOK[0m] Slow queries: 0% (0/1M)
[[0;32mOK[0m] Highest usage of available connections: 51% (2102/4096)
[[0;32mOK[0m] Aborted connections: 0.02%  (13/53000)
[[0;31m!![0m] name resolution is active : a reverse name resolution is made for each new connection and can reduce performance
[[0;32mOK[0m] Query cache efficiency: 30.4% (505K cached / 1M selects)
[[0;31m!![0m] Query cache prunes per day: 511609
[[0;32mOK[0m] Sorts requiring temporary tables: 5% (470 temp sorts / 9K sorts)
[[0;31m!![0m] Joins performed without indexes: 48
[[0;31m!![0m] Temporary tables created on disk: 76% (39K on disk / 51K total)
[[0;32mOK[0m] Thread cache hit rate: 89% (5K created / 53K connections)
[[0;32mOK[0m] Table cache hit rate: 81% (697 open / 860 opened)
[[0;32mOK[0m] Open file limit used: 0% (61/16K)
[[0;32mOK[0m] Table locks acquired immediately: 100% (998K immediate / 998K locks)

-------- Performance schema ------------------------------------------------------------------------
[[0;34m--[0m] Performance schema is enabled.
[[0;34m--[0m] Memory used by P_S: 1.1G
[[0;34m--[0m] Sys schema isn't installed.

-------- ThreadPool Metrics ------------------------------------------------------------------------
[[0;34m--[0m] ThreadPool stat is enabled.
[[0;34m--[0m] Thread Pool Size: 8 thread(s).
[[0;34m--[0m] Using default value is good enough for your version (10.1.47-MariaDB-0+deb9u1)

-------- MyISAM Metrics ----------------------------------------------------------------------------
[[0;31m!![0m] Key buffer used: 18.3% (3M used / 16M cache)
[[0;32mOK[0m] Key buffer size / total MyISAM indexes: 16.0M/124.0K
[[0;32mOK[0m] Read Key buffer hit rate: 97.5% (162 cached / 4 reads)

-------- AriaDB Metrics ----------------------------------------------------------------------------
[[0;34m--[0m] AriaDB is enabled.
[[0;32mOK[0m] Aria pagecache size / total Aria indexes: 128.0M/1B
[[0;31m!![0m] Aria pagecache hit rate: 83.9% (241K cached / 38K reads)

-------- InnoDB Metrics ----------------------------------------------------------------------------
[[0;34m--[0m] InnoDB is enabled.
[[0;31m!![0m] InnoDB buffer pool / data size: 128.0M/1.9G
[[0;31m!![0m] InnoDB buffer pool <= 1G and innodb_buffer_pool_instances(!=1).
[[0;34m--[0m] InnoDB Buffer Pool Chunk Size not used or defined in your version
[[0;32mOK[0m] InnoDB Read buffer efficiency: 99.87% (15073372350 hits/ 15093280124 total)
[[0;31m!![0m] InnoDB Write Log efficiency: 15.47% (49147 hits/ 317764 total)
[[0;32mOK[0m] InnoDB log waits: 0.00% (0 waits / 366911 writes)

-------- TokuDB Metrics ----------------------------------------------------------------------------
[[0;34m--[0m] TokuDB is disabled.

-------- Galera Metrics ----------------------------------------------------------------------------
[[0;34m--[0m] Galera is disabled.

-------- Replication Metrics -----------------------------------------------------------------------
[[0;34m--[0m] Galera Synchronous replication: NO
[[0;34m--[0m] No replication slave(s) for this server.
[[0;34m--[0m] This is a standalone server.

-------- Recommendations ---------------------------------------------------------------------------
General recommendations:
   Restrict Host for user@% to user@SpecificDNSorIp
   MySQL started within last 24 hours - recommendations may be inaccurate
   Configure your accounts with ip or subnets only, then update your configuration with skip-name-resolve=1
   Adjust your join queries to always utilize indexes
   When making adjustments, make tmp_table_size/max_heap_table_size equal
   Reduce your SELECT DISTINCT queries which have no LIMIT clause
Variables to adjust:
   query_cache_size (> 16M)
   join_buffer_size (> 256.0K, or always use indexes with joins)
   tmp_table_size (> 16M)
   max_heap_table_size (> 16M)
   innodb_buffer_pool_size (>= 1G) if possible.
   innodb_buffer_pool_instances (=1)

全球狀態和變數分析:

觀察:

  • 版本:10.1.47-MariaDB-0+deb9u1
  • 64 GB 記憶體
  • 正常執行時間 = 02:27:19;請在幾個小時後重新執行 SHOW GLOBAL STATUS。
  • 您沒有在 Windows 上執行。
  • 執行 64 位版本
  • 您似乎完全(或大部分)執行 InnoDB。

更重要的問題:

**非常重要:**將 innodb_buffer_pool_size 增加到至少 2G,但不超過可用RAM 的 70%(在為您的應用留出空間之後)

**非常重要:**增加 OS 中的打開文件限制,以便 MariaDB 可以增加一些設置。

如果你有 SSD,將 innodb_io_capacity 和 innodb_io_capacity_max 增加到 1000 和 2000。

innodb_log_file_size = 4G –(改變這個可能有困難。如果是這樣,現在跳過。)

查詢記憶體——目前無法判斷是否OFF會更好或更改某些設置會更好。先處理另一個;回到這個。

有一些緩慢的查詢。找到它們,讓我們努力加快速度。見http://mysql.rjweb.org/doc.php/mysql_analysis#slow_queries_and_slowlog

大約 1/3 的查詢是“管理員”類型。到底是怎麼回事?

如果您在修復上述問題後再次擷取 GLOBAL STATUS 和 VARIABLES,請確保至少保持 24 小時。

細節和其他觀察:

( (key_buffer_size / 0.20 + innodb_buffer_pool_size / 0.70) ) = ((16M / 0.20 + 128M / 0.70)) / 65536M = 0.40%– 大部分可用的 ram 應可用於記憶體。– http://mysql.rjweb.org/doc.php/memory

( Key_reads + Key_writes + Innodb_pages_read + Innodb_pages_written + Innodb_dblwr_writes + Innodb_buffer_pool_pages_flushed ) = (4 + 0 + 40835633 + 1094901 + 53370 + 1094779) / 8839 = 4873 /sec——眼壓?– 如果硬體可以處理,將 innodb_io_capacity(現在為 200)設置為這個值。

( ( Key_reads + Key_writes + Innodb_pages_read + Innodb_pages_written + Innodb_dblwr_writes + Innodb_buffer_pool_pages_flushed ) / innodb_io_capacity / Uptime ) = ( 4 + 0 + 40835633 + 1094901 + 53370 + 1094779 ) / 200 / 8839 = 2436.9%– 這可能是一個指標,表明合理設置了 innodb_io_capacity。– 如果硬體可以處理,則增加 innodb_io_capacity(現在為 200)。

( innodb_buffer_pool_size ) = 128M– InnoDB 數據 + 索引記憶體 – 128M(舊的預設值)小得可憐。

( innodb_lru_scan_depth ) = 1,024 – “InnoDB: page_cleaner: 1000ms 預期循環佔用了……”可以通過降低 lru_scan_depth 來修復

( Innodb_pages_written / Innodb_buffer_pool_write_requests ) = 1,094,901 / 3914981 = 28.0%– 必須命中磁碟的寫入請求 – 檢查 innodb_buffer_pool_size(現在為 134217728)

( Innodb_buffer_pool_reads ) = 20,609,583 / 8839 = 2331 /sec– 在 buffer_pool 中記憶體未命中。– 增加innodb_buffer_pool_size(現在是134217728)?(~100 是 HDD 的限制,~1000 是 SSD 的限制。)

( (Innodb_buffer_pool_reads + Innodb_buffer_pool_pages_flushed) ) = ((20609583 + 1094779) ) / 8839 = 2455 /sec– InnoDB I/O – 增加 innodb_buffer_pool_size(現在是 134217728)?

( Innodb_buffer_pool_pages_flushed ) = 1,094,779 / 8839 = 123 /sec– 寫入(刷新) – 增加 innodb_buffer_pool_size(現在為 134217728)?

( Innodb_buffer_pool_read_ahead_evicted ) = 47,759 / 8839 = 5.4 /sec

( Innodb_log_writes ) = 380,439 / 8839 = 43 /sec

( innodb_flush_method ) = innodb_flush_method = – InnoDB 應該如何要求作業系統寫入塊。建議使用 O_DIRECT 或 O_ALL_DIRECT (Percona) 以避免雙重緩衝。(至少對於 Unix。)有關 O_ALL_DIRECT 的警告,請參閱 chrischandler

( default_tmp_storage_engine ) = default_tmp_storage_engine =

( Innodb_row_lock_waits ) = 4,671 / 8839 = 0.53 /sec– 獲取行鎖延遲的頻率。– 可能是由可以優化的複雜查詢引起的。

( Innodb_row_lock_waits/Innodb_rows_inserted ) = 4,671/14717 = 31.7%– 不得不等待一排的頻率。

( Innodb_dblwr_writes ) = 53,370 / 8839 = 6 /sec– “雙寫緩衝區”寫入磁碟。“雙寫”是一項可靠性功能。一些較新的版本/配置不需要它們。–(其他問題的症狀)

( innodb_flush_neighbors ) = 1– 將塊寫入磁碟時的小優化。– 使用 0 表示 SSD 驅動器;1 用於硬碟。

( ( Innodb_pages_read + Innodb_pages_written ) / Uptime / innodb_io_capacity ) = ( 40835633 + 1094901 ) / 8839 / 200 = 2371.9%– 如果 > 100%,需要更多的 io_capacity。– 如果驅動器可以處理,則增加 innodb_io_capacity(現在為 200)。

( innodb_io_capacity ) = 200- 磁碟上每秒的 I/O 操作數。100 用於慢速驅動器;200 用於旋轉驅動器;SSD 1000-2000;乘以 RAID 係數。

( sync_binlog ) = 0– 使用 1 來增加安全性,以 I/O 為代價 =1 可能會導致大量“查詢結束”;=0 可能會導致“binlog 在不可能的位置”並在崩潰中失去事務,但速度更快。

( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF– 是否記錄所有死鎖。– 如果你被死鎖困擾,打開它。注意:如果你有很多死鎖,這可能會寫入很多磁碟。

( max_connections ) = 4,096– 最大連接數(執行緒)。影響各種分配。– 如果 max_connections(現在是 4096)太高並且各種記憶體設置很高,您可能會用完 RAM。

( innodb_buffer_pool_populate ) = OFF = 0– NUMA 控制

( (Com_show_create_table + Com_show_fields) / Questions ) = (82 + 38505) / 2038452 = 1.9%– 頑皮的框架 – 花費大量精力重新發現模式。– 向第 3 方供應商投訴。

( local_infile ) = local_infile = ON – local_infile (now ON) = ON 是一個潛在的安全問題

( bulk_insert_buffer_size ) = 8M / 65536M = 0.01%– 用於多行插入和載入數據的緩衝區 – 太大可能會威脅 RAM 大小。太小可能會阻礙此類操作。

( Qcache_hits / Qcache_inserts ) = 530,104 / 525802 = 1.01– 命中插入率 – 高是好的 – 考慮關閉查詢記憶體。

( (query_cache_size - Qcache_free_memory) / Qcache_queries_in_cache / query_alloc_block_size ) = (16M - 7255496) / 5280 / 16384 = 0.11– query_alloc_block_size vs formula – 調整 query_alloc_block_size (現在 16384)

( Created_tmp_disk_tables ) = 40,027 / 8839 = 4.5 /sec– 作為複雜 SELECT 的一部分創建磁碟“臨時”表的頻率 – 增加 tmp_table_size(現在為 16777216)和 max_heap_table_size(現在為 16777216)。檢查何時使用 MEMORY 而不是 MyISAM 的臨時表規則。也許較小的模式或查詢更改可以避免 MyISAM。更好的索引和查詢的重新制定更有可能有所幫助。

( Created_tmp_disk_tables / Created_tmp_tables ) = 40,027 / 52361 = 76.4%– 溢出到磁碟的臨時表的百分比 – 可能增加 tmp_table_size(現在為 16777216)和 max_heap_table_size(現在為 16777216);改進指標;避免斑點等

( Handler_read_rnd_next ) = 23,650,792,212 / 8839 = 2675731 /sec– 大量表掃描時高 – 可能鍵不足

( Handler_read_rnd_next / Com_select ) = 23,650,792,212 / 1204725 = 19,631– 每個 SELECT 掃描的平均行數。(大約)——考慮提高 read_buffer_size(現在為 131072)

( (Com_insert + Com_update + Com_delete + Com_replace) / Com_commit ) = (12162 + 457998 + 1321 + 675) / 3 = 157,385– 每個送出的語句(假設所有 InnoDB) – 低:可能有助於在事務中將查詢分組;高:長期交易使各種事情緊張。

( Select_scan ) = 183,655 / 8839 = 21 /sec– 全表掃描 – 添加索引/優化查詢(除非它們是小表)

( Select_scan / Com_select ) = 183,655 / 1204725 = 15.2%– % 的選擇進行全表掃描。(可能被儲存常式愚弄。)——添加索引/優化查詢

( Com_insert + Com_delete + Com_delete_multi + Com_replace + Com_update + Com_update_multi ) = (12162 + 1321 + 0 + 675 + 457998 + 0) / 8839 = 53 /sec– writes/sec – 50 writes/sec + 日誌刷新可能會最大化普通驅動器的 I/O 寫入容量

( ( Com_stmt_prepare - Com_stmt_close ) / ( Com_stmt_prepare + Com_stmt_close ) ) = ( 1657 - 0 ) / ( 1657 + 0 ) = 100.0%——你正在結束你準備好的陳述嗎?- 添加關閉。

( Com_stmt_close / Com_stmt_prepare ) = 0 / 1657 = 0– 準備好的報表應該是關閉的。– 檢查是否所有Prepared 語句都“關閉”。

( Com_alter_table ) = 575 / 8839 = 0.065 /sec ——為什麼會有這麼多ALTER?

( Com_admin_commands ) = 664,278 / 8839 = 75 /sec – 為什麼有這麼多的 DDL 語句?

( Com_admin_commands / Queries ) = 664,278 / 2040106 = 32.6%– “管理員”命令的查詢百分比。 - 這是怎麼回事?

( binlog_format ) = binlog_format = STATEMENT– 聲明/行/混合。– ROW 是 5.7 (10.3) 的首選

( innodb_autoinc_lock_mode ) = 1– Galera: 慾望 2 – 2 = “interleaved”; 1 =“連續”是典型的;0 =“傳統”。– 加萊拉的願望 2;2 需要 BINLOG_FORMAT=ROW 或 MIXED

( slow_query_log ) = slow_query_log = OFF– 是否記錄慢查詢。(5.1.12)

( long_query_time ) = 10– 用於定義“慢”查詢的截止時間(秒)。– 建議 2

( log_slow_slave_statements ) = log_slow_slave_statements = OFF– (5.6.11, 5.7.1) 預設情況下,複製的語句不會出現在慢日誌中;這導致他們顯示。– 在慢日誌中查看可能干擾副本讀取的寫入會很有幫助。

( Max_used_connections ) = 2,102– 連接的高水位線 – 大量不活動的連接是可以的;超過 100 個活動連接可能是個問題。Max_used_connections(現在是 2102)不區分它們;Threads_running(現在為 1)是瞬時的。

( Max_used_connections / host_cache_size ) = 2,102 / 807 = 260.5% – 增加host_cache_size(現在是807)

( Connections ) = 57,093 / 8839 = 6.5 /sec– Connections – 增加wait_timeout(現在是28800);使用池化?

異常小:

Acl_users = 3
Innodb_mem_adaptive_hash = 2.25e+6
Innodb_mem_dictionary = 1.75e+6

異常大:

( Innodb_pages_read + Innodb_pages_written ) / Uptime = 4,743
Aria_pagecache_reads = 4.5 /sec
Com_alter_table + Com_flush = 0.065 /sec
Com_create_db = 10 /HR
Com_create_index = 0.15 /sec
Com_create_table = 0.071 /sec
Com_replace_select = 77 /HR
Com_show_databases = 40 /HR
Com_show_fields = 4.4 /sec
Com_show_keys = 0.036 /sec
Com_show_master_status = 0.093 /sec
Com_show_processlist = 0.089 /sec
Com_show_slave_hosts = 1.2 /HR
Com_show_slave_status = 0.093 /sec
Com_show_storage_engines = 1.2 /HR
Com_show_tables = 0.61 /sec
Feature_locale = 0.12 /sec
Handler_discover = 21 /HR
Handler_read_rnd_next / Handler_read_rnd = 284,534
Innodb_buffer_pool_pages_made_not_young = 264922 /sec
Innodb_buffer_pool_pages_made_young = 1164 /sec
Innodb_buffer_pool_read_ahead = 2288 /sec
Innodb_buffer_pool_read_requests = 1744557 /sec
Innodb_buffer_pool_reads * innodb_page_size / innodb_buffer_pool_size = 251581.8%
Innodb_buffered_aio_submitted = 2288 /sec
Innodb_data_read = 75693018 /sec
Innodb_data_reads = 4619 /sec
Innodb_mutex_spin_waits = 2698 /sec
Innodb_num_index_pages_written = 48 /sec
Innodb_num_non_index_pages_written = 125 /sec
Innodb_pages_read = 4619 /sec
Innodb_pages_read + Innodb_pages_written = 4743 /sec
Innodb_rows_read = 2676665 /sec
Innodb_s_lock_spin_rounds = 6709 /sec
Innodb_s_lock_spin_waits = 1998 /sec
Prepared_stmt_count = 170
Rows_read = 2676658 /sec
Tc_log_page_size = 4,096
Threads_connected = 1,988
host_cache_size = 807
innodb_background_scrub_data_check_interval = 0.41 /sec
innodb_background_scrub_data_interval = 68 /sec
performance_schema_max_cond_instances = 17,684
performance_schema_max_socket_instances = 8,212
performance_schema_max_thread_instances = 8,292

異常字元串:

innodb_default_row_format = compact
innodb_fast_shutdown = 1
myisam_stats_method = NULLS_UNEQUAL

根據您的 ulimit -打開文件 1024 的報告,您的作業系統設置正在使 MySQL 無法打開文件句柄。這可以使用 ulimit -n 40000 動態更正,以允許比 MySQL 和其他程序需要更多的文件句柄。停止/啟動實例或重新啟動將允許 MySQL 使用額外的容量。

要使此更改在作業系統停止和啟動中持續存在,請參閱此 url - https://glassonionblog.wordpress.com/2013/01/27/increase-ulimit-and-file-descriptors-limit/ 請使用 40000 進行設置,不是範例中使用的 500000。對於您的特定作業系統,您的具體情況可能略有不同。

每秒速率 = RPS

為您的 my.cnf 考慮的建議

$$ mysqld $$基於可用數據的部分

remove thread_cache_size to allow MariaDB to auto size your thread cache size, expected to be above 256.
innodb_buffer_pool_size=4G  # from 128M to reduce innodb_buffer_pool_reads RPS of 2,332
innodb_io_capacity=2000  # from 200 to allow higher IOPS to your NVME data storage
read_rnd_buffer_size=128K  # from 256K to reduce handler_read_rnd_next RPS of 2,676.034

這只是提高系統性能的眾多機會中的前 4 個。查看配置文件、網路以獲取聯繫資訊和免費下載的實用程序腳本,以提高您的實例性能。

引用自:https://dba.stackexchange.com/questions/280772