组提交性能问题

在MySQL5.6之前只支持redo日志组提交,如果在环境中开启了binlog,那binlog只能只能串行写入,无法进行组提交,导致数据库整体性能下降非常严重。

开启binglog

[ 10s ] thds: 100 tps: 1391.16 qps: 1391.16 (r/w/o: 0.00/1391.16/0.00) lat (ms,95%): 150.29 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 100 tps: 1302.81 qps: 1302.81 (r/w/o: 0.00/1302.81/0.00) lat (ms,95%): 167.44 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 100 tps: 1305.70 qps: 1305.70 (r/w/o: 0.00/1305.70/0.00) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 40s ] thds: 100 tps: 1354.09 qps: 1354.09 (r/w/o: 0.00/1354.09/0.00) lat (ms,95%): 158.63 err/s: 0.00 reconn/s: 0.00
[ 50s ] thds: 100 tps: 1354.20 qps: 1354.20 (r/w/o: 0.00/1354.20/0.00) lat (ms,95%): 158.63 err/s: 0.00 reconn/s: 0.00
[ 60s ] thds: 100 tps: 1188.90 qps: 1188.90 (r/w/o: 0.00/1188.90/0.00) lat (ms,95%): 196.89 err/s: 0.00 reconn/s: 0.00
[ 70s ] thds: 100 tps: 1060.10 qps: 1060.10 (r/w/o: 0.00/1060.10/0.00) lat (ms,95%): 235.74 err/s: 0.00 reconn/s: 0.00
[ 80s ] thds: 100 tps: 1163.70 qps: 1163.70 (r/w/o: 0.00/1163.70/0.00) lat (ms,95%): 200.47 err/s: 0.00 reconn/s: 0.00
[ 90s ] thds: 100 tps: 1216.50 qps: 1216.50 (r/w/o: 0.00/1216.50/0.00) lat (ms,95%): 189.93 err/s: 0.00 reconn/s: 0.00
[ 100s ] thds: 100 tps: 1250.09 qps: 1250.09 (r/w/o: 0.00/1250.09/0.00) lat (ms,95%): 183.21 err/s: 0.00 reconn/s: 0.00
SQL statistics:
queries performed:
read: 0
write: 125975
other: 0
total: 125975
transactions: 125975 (1257.66 per sec.)
queries: 125975 (1257.66 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)

General statistics:
total time: 100.1647s
total number of events: 125975

Latency (ms):
min: 1.17
avg: 79.43
max: 788.50
95th percentile: 179.94
sum: 10006517.26

Threads fairness:
events (avg/stddev): 1259.7500/20.46
execution time (avg/stddev): 100.0652/0.04

未开启binlog

[ 10s ] thds: 100 tps: 11771.70 qps: 11771.70 (r/w/o: 0.00/11771.70/0.00) lat (ms,95%): 24.83 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 100 tps: 10423.24 qps: 10423.34 (r/w/o: 0.00/10423.34/0.00) lat (ms,95%): 32.53 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 100 tps: 9948.20 qps: 9948.10 (r/w/o: 0.00/9948.10/0.00) lat (ms,95%): 35.59 err/s: 0.00 reconn/s: 0.00
[ 40s ] thds: 100 tps: 9280.40 qps: 9280.40 (r/w/o: 0.00/9280.40/0.00) lat (ms,95%): 41.85 err/s: 0.00 reconn/s: 0.00
[ 50s ] thds: 100 tps: 10392.88 qps: 10392.88 (r/w/o: 0.00/10392.88/0.00) lat (ms,95%): 34.95 err/s: 0.00 reconn/s: 0.00
[ 60s ] thds: 100 tps: 9799.44 qps: 9799.44 (r/w/o: 0.00/9799.44/0.00) lat (ms,95%): 42.61 err/s: 0.00 reconn/s: 0.00
[ 70s ] thds: 100 tps: 9832.77 qps: 9832.77 (r/w/o: 0.00/9832.77/0.00) lat (ms,95%): 42.61 err/s: 0.00 reconn/s: 0.00
[ 80s ] thds: 100 tps: 10348.78 qps: 10348.78 (r/w/o: 0.00/10348.78/0.00) lat (ms,95%): 34.33 err/s: 0.00 reconn/s: 0.00
[ 90s ] thds: 100 tps: 10284.11 qps: 10284.11 (r/w/o: 0.00/10284.11/0.00) lat (ms,95%): 36.24 err/s: 0.00 reconn/s: 0.00
[ 100s ] thds: 100 tps: 9858.21 qps: 9858.21 (r/w/o: 0.00/9858.21/0.00) lat (ms,95%): 41.85 err/s: 0.00 reconn/s: 0.00
SQL statistics:
queries performed:
read: 0
write: 1019515
other: 0
total: 1019515
transactions: 1019515 (10187.78 per sec.)
queries: 1019515 (10187.78 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)

General statistics:
total time: 100.0708s
total number of events: 1019515

Latency (ms):
min: 0.22
avg: 9.81
max: 1577.13
95th percentile: 36.24
sum: 9998817.18

Threads fairness:
events (avg/stddev): 10195.1500/131.28
execution time (avg/stddev): 99.9882/0.01

可以从sysbench的结果看出开启binlog和不开启binlog写入性能相差快10倍了,这是因为在MySQL5.6之前binlog不支持组提交,在开启binlog后会通过prepare_commit_mutex和sync_binlog保证二进制日志和存储引擎顺序一致,只能串行化写入。当sync_binlog=1时,prepare_commit_mutex的锁机制造成高并发提交事务的性能急剧下降,因此建议尽快升级到5.6及以上的版本。

复制同步问题

在MySQL5.6之前Slave不支持Crash Safe,经常会出现重复主键等同步问题,导致同步中断,只能设置参数跳过指定错误

IO-Thread

IO-Thread负责接收master传输过来的二进制日志内容并生成relay log了,IO-Thread接收的信息都保存在master.info文件中,当然这些信息并不是实时写入到磁盘上的,默认sync_master_info为10000,当Slave异常重启后这部分信息就会丢失,启动后会重复拉取相同操作记录。当然也可以将参数设置为1,每次都直接写入到文件中,但是频繁的fsync必然导致性能下降。MySQL5.6中通过设置master-info-repository=table将信息存到表中,这样就提供Crash Safe的功能。

SQL-Thread

SQL-Thread负责操作回放,SQL thread的执行状态信息保存在relay-log.info文件,也存在上述文件与实际不一致的情况。在MySQL5.6中通过参数relay-log-info-repository=table将信息保存到数据表中,利用事务的原子性来保证其操作一致性

从上面的改变解决了SQL thread记录执行状态可能导致不一致的风险,但是对于IO thread依然存在问题 。IO thread 从master上拷贝binlog写入relay log中,每个二进制日志由多个log event组成,所以每接收到一个log event就需要更新master-info.log而且该是写入操作系统缓存。从IO thread的工作原理来看,它没有办法通过数据表来保证两个文件的原子性,因此IO thread还是会存在一些影响,这里我们可以设置relay-log-recover=1,当slave发生crash重启后,slave不根据master info的信息进行重连,而是根据relay-info中SQL-Thread执行到master的位置信息重新开始拉master上的日志数据。