MySQL

MySQL load from infile stuck waiting on the hard drive

  • September 4, 2017

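I have a Windows 7 64-bit machine that I use for some load testing of a MySQL database. My program uses sqlalchemy to connect and run several LOAD DATA INFILE statements against that database. These bulk loads all happen in a single transaction, all keys are disabled beforehand, and each CSV file is only a few megabytes in size.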

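The problem I'm running into is that the test machine is I/O bound. It has enough available memory (12G) to hold the entire transaction in memory and do a single flush at the end. As I understand the manual, InnoDB tables shouldn't touch the hard drive until the dirty pages are flushed when the transaction completes.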

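The total data to be loaded is about 1G, spread across different tables. It ends up taking 37 minutes to load everything. This is with my current load-test settings. If necessary, I'm also happy to report results from SHOW ENGINE INNODB STATUS or similar queries.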

To recap, I need to know whether 37 minutes is a fast insert time for this amount of data, and what I can do to improve insert speed.

Edit:

Oops! I forgot some important information.

MySQL version 5.5
Server has 12G total RAM
Total rows inserted ~2,597,240
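
For scale, the figures above can be turned into an average throughput. This is only a rough sketch: treating "about 1G" as 1024 MB is an assumption, and the variable names are illustrative.

```python
# Back-of-the-envelope throughput from the figures quoted in the question.
total_mb = 1024            # "about 1G" of CSV data, assumed to be 1024 MB
total_rows = 2_597_240     # total rows inserted
elapsed_s = 37 * 60        # 37 minutes, in seconds

mb_per_s = total_mb / elapsed_s
rows_per_s = total_rows / elapsed_s

print(f"{mb_per_s:.2f} MB/s, {rows_per_s:.0f} rows/s")
# prints "0.46 MB/s, 1170 rows/s"
```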

Guys,

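Sorry I couldn't get back here sooner; things ended up being very busy for a while. I appreciate everyone's help. Part of the problem seems to have been that some options don't accept OFF|ON as their values.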

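Beyond that, just setting my innodb_io_capacity value higher seemed to do the trick. I also had to make innodb_buffer_pool_size large enough, otherwise MySQL didn't have enough room to do the bulk insert in, since I think only about 25% of the buffer size is allowed for it.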

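With these settings, it holds off on flushing until the entire bulk insert has finished, and then flushes to the hard drive. While it waits for the next insert, I get write speeds of about 20-25MB instead of the constant 1MB write speed I used to have.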

The only improvement left is to add enough memory to the box to support the workload.

If anyone is interested, here are my settings:

# MySQL Server Instance Configuration File
# ----------------------------------------------------------------------
# Generated by the MySQL Server Instance Configuration Wizard
#
#
# Installation Instructions
# ----------------------------------------------------------------------
#
# On Linux you can copy this file to /etc/my.cnf to set global options,
# mysql-data-dir/my.cnf to set server-specific options
# (@localstatedir@ for this installation) or to
# ~/.my.cnf to set user-specific options.
#
# On Windows you should keep this file in the installation directory 
# of your server (e.g. C:\Program Files\MySQL\MySQL Server X.Y). To
# make sure the server reads the config file use the startup option 
# "--defaults-file". 
#
# To run the server from the command line, execute this in a 
# command line shell, e.g.
# mysqld --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini"
#
# To install the server as a Windows service manually, execute this in a 
# command line shell, e.g.
# mysqld --install MySQLXY --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini"
#
# And then execute this in a command line shell to start the server, e.g.
# net start MySQLXY
#
#
# Guidelines for editing this file
# ----------------------------------------------------------------------
#
# In this file, you can use all long options that the program supports.
# If you want to know the options a program supports, start the program
# with the "--help" option.
#
# More detailed information about the individual options can also be
# found in the manual.
#
#
# CLIENT SECTION
# ----------------------------------------------------------------------
#
# The following options will be read by MySQL client applications.
# Note that only client applications shipped by MySQL are guaranteed
# to read this section. If you want your own MySQL client program to
# honor these values, you need to specify it as an option during the
# MySQL client library initialization.
#
[client]

port=3306

[mysql]

default-character-set=utf8


# SERVER SECTION
# ----------------------------------------------------------------------
#
# The following options will be read by the MySQL Server. Make sure that
# you have installed the server correctly (see above) so it reads this 
# file.
#
[mysqld]

# The TCP/IP Port the MySQL Server will listen on
port=3306


#Path to installation directory. All paths are usually resolved relative to this.
basedir="C:/Program Files/MySQL/MySQL Server 5.5/"

#Path to the database root
datadir="C:/ProgramData/MySQL/MySQL Server 5.5/Data/"

# The default character set that will be used when a new schema or table is
# created and no character set is defined
character-set-server=utf8

# The default storage engine that will be used when creating new tables
default-storage-engine=INNODB


# The maximum amount of concurrent sessions the MySQL server will
# allow. One of these connections will be reserved for a user with
# SUPER privileges to allow the administrator to login even if the
# connection limit has been reached.
max_connections=100

# Query cache is used to cache SELECT results and later return them
# without actually executing the same query once again. Having the query
# cache enabled may result in significant speed improvements, if you
# have a lot of identical queries and rarely changing tables. See the
# "Qcache_lowmem_prunes" status variable to check if the current value
# is high enough for your load.
# Note: In case your tables change very often or if your queries are
# textually different every time, the query cache may result in a
# slowdown instead of a performance improvement.
query_cache_size=0

# The number of open tables for all threads. Increasing this value
# increases the number of file descriptors that mysqld requires.
# Therefore you have to make sure to set the amount of open files
# allowed to at least 4096 in the variable "open-files-limit" in
# section [mysqld_safe]
table_cache=256

# Maximum size for internal (in-memory) temporary tables. If a table
# grows larger than this value, it is automatically converted to disk
# based table. This limitation is for a single table. There can be many
# of them.
tmp_table_size=369M


# How many threads we should keep in a cache for reuse. When a client
# disconnects, the client's threads are put in the cache if there aren't
# more than thread_cache_size threads from before.  This greatly reduces
# the amount of thread creations needed if you have a lot of new
# connections. (Normally this doesn't give a notable performance
# improvement if you have a good thread implementation.)
thread_cache_size=8

#*** MyISAM Specific options

# The maximum size of the temporary file MySQL is allowed to use while
# recreating the index (during REPAIR, ALTER TABLE or LOAD DATA INFILE).
# If the file-size would be bigger than this, the index will be created
# through the key cache (which is slower).
myisam_max_sort_file_size=100G

# If the temporary file used for fast index creation would be bigger
# than using the key cache by the amount specified here, then prefer the
# key cache method.  This is mainly used to force long character keys in
# large tables to use the slower key cache method to create the index.
myisam_sort_buffer_size=738M

# Size of the Key Buffer, used to cache index blocks for MyISAM tables.
# Do not set it larger than 30% of your available memory, as some memory
# is also required by the OS to cache rows. Even if you're not using
# MyISAM tables, you should still set it to 8-64M as it will also be
# used for internal temporary disk tables.
key_buffer_size=2G

# Size of the buffer used for doing full table scans of MyISAM tables.
# Allocated per thread, if a full scan is needed.
read_buffer_size=64K
read_rnd_buffer_size=256K

# This buffer is allocated when MySQL needs to rebuild the index in
# REPAIR, OPTIMIZE, ALTER table statements as well as in LOAD DATA INFILE
# into an empty table. It is allocated per thread so be careful with
# large settings.
sort_buffer_size=256K


#*** INNODB Specific options ***
innodb_data_home_dir="C:/Data/"

# Use this option if you have a MySQL server with InnoDB support enabled
# but you do not plan to use it. This will save memory and disk space
# and speed up some things.
#skip-innodb

# Additional memory pool that is used by InnoDB to store metadata
# information.  If InnoDB requires more memory for this purpose it will
# start to allocate it from the OS.  As this is fast enough on most
# recent operating systems, you normally do not need to change this
# value. SHOW INNODB STATUS will display the current amount used.
innodb_additional_mem_pool_size=32M

# If set to 1, InnoDB will flush (fsync) the transaction logs to the
# disk at each commit, which offers full ACID behavior. If you are
# willing to compromise this safety, and you are running small
# transactions, you may set this to 0 or 2 to reduce disk I/O to the
# logs. Value 0 means that the log is only written to the log file and
# the log file flushed to disk approximately once per second. Value 2
# means the log is written to the log file at each commit, but the log
# file is only flushed to disk approximately once per second.
innodb_flush_log_at_trx_commit=1

# The size of the buffer InnoDB uses for buffering log data. As soon as
# it is full, InnoDB will have to flush it to disk. As it is flushed
# once per second anyway, it does not make sense to have it very large
# (even with long transactions).
innodb_log_buffer_size=16M

# InnoDB, unlike MyISAM, uses a buffer pool to cache both indexes and
# row data. The bigger you set this the less disk I/O is needed to
# access data in tables. On a dedicated database server you may set this
# parameter up to 80% of the machine physical memory size. Do not set it
# too large, though, because competition of the physical memory may
# cause paging in the operating system.  Note that on 32bit systems you
# might be limited to 2-3.5G of user level memory per process, so do not
# set it too high.
innodb_buffer_pool_size=2385M

# Size of each log file in a log group. You should set the combined size
# of log files to about 25%-100% of your buffer pool size to avoid
# unneeded buffer pool flush activity on log file overwrite. However,
# note that a larger logfile size will increase the time needed for the
# recovery process.
innodb_log_file_size=1G

# Number of threads allowed inside the InnoDB kernel. The optimal value
# depends highly on the application, hardware as well as the OS
# scheduler properties. A too high value may lead to thread thrashing.
innodb_thread_concurrency=14


# Added by RCW, 2012.07.13
bulk_insert_buffer_size=4G
myisam_sort_buffer_size=256M
key_buffer_size=2G
innodb_additional_mem_pool_size=2G
innodb_buffer_pool_size=10G
innodb_file_per_table=1
join_buffer_size=2G
tmp_table_size=2G
max_heap_table_size=2G
table_open_cache=2G
innodb_thread_concurrency=0
preload_buffer_size=128MB
read_buffer_size=128MB
read_rnd_buffer_size=128MB
sort_buffer_size=1024MB

query_cache_size=2G

query_cache_limit=64MB

thread_cache_size=32
max_allowed_packet=256M
group_concat_max_len=8M
innodb_table_locks=0

innodb_doublewrite=0
innodb_buffer_pool_instances=1
innodb_adaptive_flushing=1

innodb_flush_log_at_trx_commit=2
innodb_io_capacity=20000

innodb_use_sys_malloc=0

Your bulk insert buffer is 4G. That's great ... for MyISAM.

InnoDB does not use the bulk insert buffer.

You may need to get sqlalchemy to throttle the LOAD DATA INFILE calls into multiple transactions.
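
One way to split the loads into several transactions is to commit after every few files instead of once at the end. The sketch below is a minimal illustration, not the asker's actual program: batches, load_in_batches, load_one, commit and the batch size are all hypothetical names and values, and the database calls are passed in as plain callables so the batching logic can be read and tried without a server.

```python
from typing import Callable, Iterable, Iterator, List


def batches(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    group: List[str] = []
    for item in items:
        group.append(item)
        if len(group) == size:
            yield group
            group = []
    if group:
        # Emit the final, possibly shorter, group.
        yield group


def load_in_batches(
    csv_files: Iterable[str],
    load_one: Callable[[str], None],   # hypothetical: runs LOAD DATA INFILE for one file
    commit: Callable[[], None],        # hypothetical: commits the current transaction
    batch_size: int = 10,              # illustrative value, not a recommendation
) -> int:
    """Load every file, committing once per batch. Returns the number of commits."""
    commits = 0
    for group in batches(csv_files, batch_size):
        for path in group:
            load_one(path)
        commit()
        commits += 1
    return commits
```

With sqlalchemy, load_one could wrap a connection.execute(text(...)) call that issues the LOAD DATA INFILE statement, and commit could be the transaction's commit method; that wiring is left out because it depends on how the program manages its connections.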

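You may also want to restrict innodb_change_buffering by setting it to inserts. Unfortunately, you may not be able to do this with SET GLOBAL innodb_change_buffering = 'inserts';. In that case, you will need to set it in my.cnf and restart mysql.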

UPDATE 2012-07-13 16:53 EDT

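I just noticed that you have two values for innodb_buffer_pool_size in my.cnf. The first is 2385M and the last is 14G. If MySQL for Windows accepted 14G while you only have 12G of RAM, your server must be having a good old time swapping.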

You can verify the buffer pool size with

SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
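
Repeated settings like the two innodb_buffer_pool_size values can also be spotted before the server starts. The following is a minimal sketch, assuming a plain key=value option file; duplicate_options and the sample text are hypothetical, and the parser deliberately ignores details such as include directives.

```python
from collections import defaultdict
from typing import Dict, List


def duplicate_options(ini_text: str, section: str = "mysqld") -> Dict[str, List[str]]:
    """Return {option: [values in file order]} for options set more than once in `section`."""
    values = defaultdict(list)
    current = None
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip().lower()
            continue
        if current != section:
            continue
        key, _, value = line.partition("=")
        # MySQL treats dashes and underscores in option names as equivalent.
        values[key.strip().replace("-", "_")].append(value.strip())
    return {k: v for k, v in values.items() if len(v) > 1}


# Sample text modelled on the two values mentioned above.
sample = """
[mysqld]
innodb_buffer_pool_size=2385M
innodb_log_file_size=1G
innodb_buffer_pool_size=14G
"""
print(duplicate_options(sample))
# prints {'innodb_buffer_pool_size': ['2385M', '14G']}
```

The values are listed in file order, so the last entry in each list is the one to compare against what SHOW VARIABLES reports.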

UPDATE 2012-07-13 16:58 EDT

You may also want to check how full the buffer pool is

SELECT FORMAT(A.num * 100.0 / B.num,2) BufferPoolFullPct FROM
(SELECT variable_value num FROM information_schema.global_status
WHERE variable_name = 'Innodb_buffer_pool_pages_data') A,
(SELECT variable_value num FROM information_schema.global_status
WHERE variable_name = 'Innodb_buffer_pool_pages_total') B;

Source: https://dba.stackexchange.com/questions/20862