MySQL load from infile stuck waiting on the hard drive
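I have a Windows 7 64-bit machine that I use for load testing a MySQL database. My program uses sqlalchemy to connect and run several LOAD DATA INFILE statements against that database. These bulk loads all happen within a single transaction, all keys are disabled beforehand, and each csv file is only a few megabytes. The problem I have run into is that the test machine is IO bound. It has enough available RAM (12G) to hold the entire transaction in memory and do a single flush at the end. As I understand the manual, InnoDB tables should not touch the hard drive until the dirty pages are flushed when the transaction completes.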
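The total data to load is about 1G, spread across different tables. It ends up taking 37 minutes to load everything. My current settings, based on what I have read so far, are listed below.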
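I would also be happy to report the results of SHOW ENGINE INNODB STATUS or similar queries if necessary. To recap, I need to know whether 37 minutes is a fast insert time for this amount of data, and what I can do to improve the insert speed.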
Edit:
Whoops! I forgot some important information.
MySQL version 5.5
Server has 12G total RAM
Total rows inserted ~2,597,240
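To put the 37 minutes in perspective, the figures in the question can be turned into an average load rate. This is only a back-of-the-envelope sketch: it treats "about 1G" as 1024 MB, which is an assumption, and the function name throughput is made up for illustration.

```python
ROWS = 2_597_240       # total rows inserted, as stated in the question
DATA_MB = 1024         # "about 1G" of CSV data, taken as 1024 MB (assumption)
ELAPSED_S = 37 * 60    # 37 minutes, in seconds


def throughput(rows, data_mb, seconds):
    """Return (rows per second, MB per second) for a finished load."""
    return rows / seconds, data_mb / seconds


rows_per_s, mb_per_s = throughput(ROWS, DATA_MB, ELAPSED_S)
print(f"{rows_per_s:.0f} rows/s, {mb_per_s:.2f} MB/s")
# prints "1170 rows/s, 0.46 MB/s"
```

An average of well under 1 MB/s of input data is in line with the constant 1MB write speed described in the follow-up.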
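Guys,
Sorry I could not get back here sooner; things ended up getting very busy for a while. I appreciate everyone's help. Part of the problem seems to have been that some options do not accept OFF|ON as their values. Beyond that, just setting my innodb_io_capacity value higher seemed to do the trick. I also had to set innodb_buffer_pool_size large enough, otherwise MySQL did not have enough room to do the bulk insert, since I think only about 25% of the buffer size is allowed. With these settings it holds off on flushing until it has finished the entire bulk insert, and then flushes to the hard drive. While it waits for the next insert I get write speeds of about 20-25 MB/s, rather than the constant 1 MB/s I used to have.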
The only improvement left is to add enough memory to the box to support the workload.
In case anyone is interested, here are my settings:
# MySQL Server Instance Configuration File
# ----------------------------------------------------------------------
# Generated by the MySQL Server Instance Configuration Wizard
#
#
# Installation Instructions
# ----------------------------------------------------------------------
#
# On Linux you can copy this file to /etc/my.cnf to set global options,
# mysql-data-dir/my.cnf to set server-specific options
# (@localstatedir@ for this installation) or to
# ~/.my.cnf to set user-specific options.
#
# On Windows you should keep this file in the installation directory
# of your server (e.g. C:\Program Files\MySQL\MySQL Server X.Y). To
# make sure the server reads the config file use the startup option
# "--defaults-file".
#
# To run run the server from the command line, execute this in a
# command line shell, e.g.
# mysqld --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini"
#
# To install the server as a Windows service manually, execute this in a
# command line shell, e.g.
# mysqld --install MySQLXY --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini"
#
# And then execute this in a command line shell to start the server, e.g.
# net start MySQLXY
#
#
# Guildlines for editing this file
# ----------------------------------------------------------------------
#
# In this file, you can use all long options that the program supports.
# If you want to know the options a program supports, start the program
# with the "--help" option.
#
# More detailed information about the individual options can also be
# found in the manual.
#
#
# CLIENT SECTION
# ----------------------------------------------------------------------
#
# The following options will be read by MySQL client applications.
# Note that only client applications shipped by MySQL are guaranteed
# to read this section. If you want your own MySQL client program to
# honor these values, you need to specify it as an option during the
# MySQL client library initialization.
#
[client]

port=3306

[mysql]

default-character-set=utf8

# SERVER SECTION
# ----------------------------------------------------------------------
#
# The following options will be read by the MySQL Server. Make sure that
# you have installed the server correctly (see above) so it reads this
# file.
#
[mysqld]

# The TCP/IP Port the MySQL Server will listen on
port=3306

#Path to installation directory. All paths are usually resolved relative to this.
basedir="C:/Program Files/MySQL/MySQL Server 5.5/"

#Path to the database root
datadir="C:/ProgramData/MySQL/MySQL Server 5.5/Data/"

# The default character set that will be used when a new schema or table is
# created and no character set is defined
character-set-server=utf8

# The default storage engine that will be used when create new tables when
default-storage-engine=INNODB

# The maximum amount of concurrent sessions the MySQL server will
# allow. One of these connections will be reserved for a user with
# SUPER privileges to allow the administrator to login even if the
# connection limit has been reached.
max_connections=100

# Query cache is used to cache SELECT results and later return them
# without actual executing the same query once again. Having the query
# cache enabled may result in significant speed improvements, if your
# have a lot of identical queries and rarely changing tables. See the
# "Qcache_lowmem_prunes" status variable to check if the current value
# is high enough for your load.
# Note: In case your tables change very often or if your queries are
# textually different every time, the query cache may result in a
# slowdown instead of a performance improvement.
query_cache_size=0

# The number of open tables for all threads. Increasing this value
# increases the number of file descriptors that mysqld requires.
# Therefore you have to make sure to set the amount of open files
# allowed to at least 4096 in the variable "open-files-limit" in
# section [mysqld_safe]
table_cache=256

# Maximum size for internal (in-memory) temporary tables. If a table
# grows larger than this value, it is automatically converted to disk
# based table This limitation is for a single table. There can be many
# of them.
tmp_table_size=369M

# How many threads we should keep in a cache for reuse. When a client
# disconnects, the client's threads are put in the cache if there aren't
# more than thread_cache_size threads from before. This greatly reduces
# the amount of thread creations needed if you have a lot of new
# connections. (Normally this doesn't give a notable performance
# improvement if you have a good thread implementation.)
thread_cache_size=8

#*** MyISAM Specific options

# The maximum size of the temporary file MySQL is allowed to use while
# recreating the index (during REPAIR, ALTER TABLE or LOAD DATA INFILE.
# If the file-size would be bigger than this, the index will be created
# through the key cache (which is slower).
myisam_max_sort_file_size=100G

# If the temporary file used for fast index creation would be bigger
# than using the key cache by the amount specified here, then prefer the
# key cache method. This is mainly used to force long character keys in
# large tables to use the slower key cache method to create the index.
myisam_sort_buffer_size=738M

# Size of the Key Buffer, used to cache index blocks for MyISAM tables.
# Do not set it larger than 30% of your available memory, as some memory
# is also required by the OS to cache rows. Even if you're not using
# MyISAM tables, you should still set it to 8-64M as it will also be
# used for internal temporary disk tables.
key_buffer_size=2G

# Size of the buffer used for doing full table scans of MyISAM tables.
# Allocated per thread, if a full scan is needed.
read_buffer_size=64K
read_rnd_buffer_size=256K

# This buffer is allocated when MySQL needs to rebuild the index in
# REPAIR, OPTIMZE, ALTER table statements as well as in LOAD DATA INFILE
# into an empty table. It is allocated per thread so be careful with
# large settings.
sort_buffer_size=256K

#*** INNODB Specific options ***
innodb_data_home_dir="C:/Data/"

# Use this option if you have a MySQL server with InnoDB support enabled
# but you do not plan to use it. This will save memory and disk space
# and speed up some things.
#skip-innodb

# Additional memory pool that is used by InnoDB to store metadata
# information. If InnoDB requires more memory for this purpose it will
# start to allocate it from the OS. As this is fast enough on most
# recent operating systems, you normally do not need to change this
# value. SHOW INNODB STATUS will display the current amount used.
innodb_additional_mem_pool_size=32M

# If set to 1, InnoDB will flush (fsync) the transaction logs to the
# disk at each commit, which offers full ACID behavior. If you are
# willing to compromise this safety, and you are running small
# transactions, you may set this to 0 or 2 to reduce disk I/O to the
# logs. Value 0 means that the log is only written to the log file and
# the log file flushed to disk approximately once per second. Value 2
# means the log is written to the log file at each commit, but the log
# file is only flushed to disk approximately once per second.
innodb_flush_log_at_trx_commit=1

# The size of the buffer InnoDB uses for buffering log data. As soon as
# it is full, InnoDB will have to flush it to disk. As it is flushed
# once per second anyway, it does not make sense to have it very large
# (even with long transactions).
innodb_log_buffer_size=16M

# InnoDB, unlike MyISAM, uses a buffer pool to cache both indexes and
# row data. The bigger you set this the less disk I/O is needed to
# access data in tables. On a dedicated database server you may set this
# parameter up to 80% of the machine physical memory size. Do not set it
# too large, though, because competition of the physical memory may
# cause paging in the operating system. Note that on 32bit systems you
# might be limited to 2-3.5G of user level memory per process, so do not
# set it too high.
innodb_buffer_pool_size=2385M

# Size of each log file in a log group. You should set the combined size
# of log files to about 25%-100% of your buffer pool size to avoid
# unneeded buffer pool flush activity on log file overwrite. However,
# note that a larger logfile size will increase the time needed for the
# recovery process.
innodb_log_file_size=1G

# Number of threads allowed inside the InnoDB kernel. The optimal value
# depends highly on the application, hardware as well as the OS
# scheduler properties. A too high value may lead to thread thrashing.
innodb_thread_concurrency=14

# Added by RCW, 2012.07.13
bulk_insert_buffer_size=4G
myisam_sort_buffer_size=256M
key_buffer_size=2G
innodb_additional_mem_pool_size=2G
innodb_buffer_pool_size=10G
innodb_file_per_table=1
join_buffer_size=2G
tmp_table_size=2G
max_heap_table_size=2G
table_open_cache=2G
innodb_thread_concurrency=0
preload_buffer_size=128MB
read_buffer_size=128MB
read_rnd_buffer_size=128MB
sort_buffer_size=1024MB
query_cache_size=2G
query_cache_limit=64MB
thread_cache_size=32
max_allowed_packet=256M
group_concat_max_len=8M
innodb_table_locks=0
innodb_doublewrite=0
innodb_buffer_pool_instances=1
innodb_adaptive_flushing=1
innodb_flush_log_at_trx_commit=2
innodb_io_capacity=20000
innodb_use_sys_malloc=0
Your bulk insert buffer is 4G. That is great... for MyISAM!
InnoDB does not use the bulk insert buffer.
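You may need to have sqlalchemy split the LOAD DATA INFILE calls across multiple transactions. You may also want to limit innodb_change_buffering by setting it to inserts. Unfortunately, you may not be able to do SET GLOBAL innodb_change_buffering = 'inserts'; at runtime. In that case you will need to set it in my.cnf and restart mysql.
UPDATE 2012-07-13 16:53 EDT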
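I just noticed that you have two values for innodb_buffer_pool_size in my.cnf. The first is 2385M and the last is 14G. If MySQL for Windows accepts 14G and you only have 12G of RAM, your server must be doing some good old-fashioned swapping. You can verify the buffer pool size with: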
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
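Repeated settings like the two innodb_buffer_pool_size lines are easy to miss in a long option file. The following is a minimal sketch of a checker, not a full option-file parser: the function name find_duplicate_options and the sample text are made up for illustration, and it assumes simple name=value lines. When an option appears more than once, MySQL uses the last occurrence.

```python
from collections import defaultdict


def find_duplicate_options(ini_text):
    """Return {(section, option): [values...]} for options set more than once."""
    seen = defaultdict(list)
    section = None
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        name, sep, value = line.partition("=")
        # mysqld treats dashes and underscores in option names as equivalent
        key = (section, name.strip().replace("-", "_"))
        seen[key].append(value.strip() if sep else None)
    return {key: values for key, values in seen.items() if len(values) > 1}


# Hypothetical excerpt using the two values mentioned above
sample = """
[mysqld]
innodb_buffer_pool_size=2385M
key_buffer_size=2G
innodb_buffer_pool_size=14G
"""

for (section, name), values in find_duplicate_options(sample).items():
    print(f"[{section}] {name} is set {len(values)} times: {values}")
```

Running it against the real my.ini (read with open(...).read()) would list every option that is set more than once, in the order the values appear in the file.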
UPDATE 2012-07-13 16:58 EDT
You may also want to check how full the buffer pool is:
SELECT FORMAT(A.num * 100.0 / B.num,2) BufferPoolFullPct FROM
(SELECT variable_value num FROM information_schema.global_status
WHERE variable_name = 'Innodb_buffer_pool_pages_data') A,
(SELECT variable_value num FROM information_schema.global_status
WHERE variable_name = 'Innodb_buffer_pool_pages_total') B;
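The earlier suggestion to spread the LOAD DATA INFILE calls over several transactions could look roughly like the sketch below. It is an illustration under assumptions, not the asker's actual program: batched, load_in_batches, jobs and files_per_txn are hypothetical names, the table names are assumed to be trusted identifiers, the CSV files are assumed to be comma separated, and passing the file name as a bound parameter relies on the MySQL driver substituting it as a quoted string on the client side.

```python
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_in_batches(engine, jobs, files_per_txn=10):
    """Run LOAD DATA INFILE for each (csv_path, table) pair, one commit per batch.

    `engine` is a sqlalchemy Engine; `jobs` is an iterable of (path, table) pairs.
    """
    # Imported here so that `batched` can be used without sqlalchemy installed
    from sqlalchemy import text

    for chunk in batched(jobs, files_per_txn):
        # engine.begin() opens a transaction and commits when the block exits
        with engine.begin() as conn:
            for csv_path, table in chunk:
                conn.execute(
                    text(
                        f"LOAD DATA INFILE :path INTO TABLE `{table}` "
                        "FIELDS TERMINATED BY ','"
                    ),
                    {"path": csv_path},
                )
```

A smaller files_per_txn keeps fewer uncommitted changes in the buffer pool at once, at the cost of more commits, so the right value would have to be found by measurement.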