Troubleshooting of master-slave delay issues when upgrading MySQL 5.6 to 5.7

Troubleshooting of master-slave delay issues when upgrading MySQL 5.6 to 5.7

Recently, when upgrading the Zabbix database from MySQL 5.6 to 5.7, a master-slave delay problem occurred. This problem has troubled me for a long time and has not been solved. It was finally solved yesterday. I sorted out the entire troubleshooting process and shared it with everyone.

Environmental Description:

The MySQL master database is version 5.6, and there are four slave databases, three of which are version 5.6 and one is version 5.7. The library and table structures of all masters and slaves are consistent. The 5.7 slave database has a lot of delays, while the 5.6 slave database has no problem. The business is monitored by Zabbix, and basically all of them are insert batch operations. Each insert SQL inserts about 400-1000 rows of data.

question:

The slave database of MySQL5.7 has a large number of delays, the relaylog is written to the disk normally, and the application to the database is slow. There is no pressure on disk IO and CPU. There is no difference between sync_binlog being 20000 or 0, max_allowed_packet=128M, innodb_flush_log_at_trx_commit=0, bulk_insert_buffer_size = 128M, binlog_format=row, sync_relay_log=10000, parallel replication is not used, SSL is not enabled, GDID is not enabled, and semi-synchronization is not enabled.

Troubleshooting process:

1: Check each performance-related parameter and find no abnormality.

2: Checking the network card, hard disk, changing the server, and restarting the database server had no effect. The 5.7 delay still existed, so hardware problems were ruled out.

3: 5.7 synchronizes the binlog of the main database 5.6 to the relaylog quickly and normally, but the efficiency of replaying the relaylog in the 5.7 database is extremely low.

4: Compare the show engine innodb status results of 5.6 and 5.7 slaves:

=============5.6===============================
---BUFFER POOL 1
Buffer pool size 655359
Buffer pool size, bytes 10737401856
Free buffers 1019
Database pages 649599
Old database pages 239773
Modified db pages 119309
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 10777670, not young 181119246
13.90 youngs/s, 157.51 non-youngs/s
Pages read 8853516, created 135760152, written 784514803
20.96 reads/s, 58.17 creates/s, 507.02 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 2 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 649599, unzip_LRU len: 0
I/O sum[209618]:cur[2], unzip sum[0]:cur[0]
=============5.7==============================
---BUFFER POOL 1
Buffer pool size 819100
Buffer pool size, bytes 13420134400
Free buffers 1018
Database pages 722328
Old database pages 266620
Modified db pages 99073
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 37153, not young 795
0.00 youngs/s, 0.00 non-youngs/s
Pages read 149632, created 572696, written 2706369
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 722328, unzip_LRU len: 453903
I/O sum[98685]:cur[0], unzip sum[882]:cur[6]
+++++++++++++++++++++++

By comparison, we found that unzip has a value in 5.7, but not in 5.6. We initially suspected that the cause of the delay is related to compression and decompression.

5: Use perf top -p pidof mysqld to view the 5.7 slave library

It is found that libz.so.1.2.7[.]cc32 accounts for a higher proportion than mysqld, at about 6%. This library is related to compression and decompression.

6: Change the innodb_compression_level to 0 (that is, do not enable compression, the default is 6, the range is 0-9), and observe that there is no effect and the delay still exists. only

The share of libz has gone down, but the share of libc-2.17.so has gone up, higher than mysqld, at around 9%. Use pstack to view the waiting issues of the decompression in the research institute.

7: Check the history tables of zabbix. In order to save disk space, these tables were compressed:

CREATE TABLE trends (
itemid bigint(20) unsigned NOT NULL,
clock int(11) NOT NULL DEFAULT '0',
num int(11) NOT NULL DEFAULT '0',
value_min double(16,4) NOT NULL DEFAULT '0.0000',
value_avg double(16,4) NOT NULL DEFAULT '0.0000',
value_max double(16,4) NOT NULL DEFAULT '0.0000',
PRIMARY KEY (itemid,clock),
KEY clock (clock)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8

I suspect it is related to the compression parameter ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8.

8: Rebuild all historical tables, remove ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8, resynchronize, gradually reduce the delay, and recover.

Question: Why does the same table structure cause master-slave delay in 5.7 but not in 5.6? This may be caused by the backward compatibility issues of compression and decompression in MySQL 5.7. I didn't investigate it further, but I reported a bug to the official and asked them to check the source code: http://bugs.mysql.com/100702.

Please use ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8 with caution in production. After communicating with several experts in the industry, they said that the compression of MySQL versions before 8.0 is not very reliable, and using ZSTD for 8.0 is better.

This is the end of this article about the process of troubleshooting the master-slave delay problem when upgrading MySQL 5.6 to 5.7. For more information about the master-slave delay when upgrading MySQL 5.6 to 5.7, please search for previous articles on 123WORDPRESS.COM or continue to browse the related articles below. I hope everyone will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • MySQL master-slave replication delay causes and solutions
  • Causes and solutions for MySQL master-slave synchronization delay
  • Detailed analysis of MySQL master-slave delay phenomenon and principle
  • Analysis and solution of MYSQL master-slave asynchrony delay principle
  • Detailed explanation of reducing MySQL master-slave data synchronization delay
  • In-depth explanation of MySQL master-slave replication delay problem
  • Solution to MySQL master-slave delay problem

<<:  Vue implements star rating with decimal points

>>:  js development plug-in to achieve tab effect

Recommend

Detailed explanation of the fish school algorithm in CocosCreator game

Preface I recently wanted to learn CocosCreator, ...

How MySQL handles implicit default values

Some students said that they encountered the prob...

Example of how to implement MySQL cascading replication

The so-called cascading replication is that the m...

About React Native unable to link to the simulator

React Native can develop iOS and Android native a...

Example analysis of the use of GROUP_CONCAT in MySQL

This article uses an example to describe how to u...

A super detailed Vue-Router step-by-step tutorial

Table of contents 1. router-view 2. router-link 3...

How to reset MySQL root password

Table of contents 1. Forgot the root password and...

An article to deal with Mysql date and time functions

Table of contents Preface 1. Get the current time...

Problems and pitfalls of installing Mysql5.7.23 in Win10 environment

I read many tutorials, but found that I could nev...

js array fill() filling method

Table of contents 1. fill() syntax 2. Use of fill...

Analysis and solution of abnormal problem of loading jar in tomcat

Description of the phenomenon: The project uses s...

Detailed process of SpringBoot integrating Docker

Table of contents 1. Demo Project 1.1 Interface P...

Complete steps to achieve high availability with nginx combined with keepalived

Preface In order to meet the high availability of...

Detailed example of changing Linux account password

Change personal account password If ordinary user...

Win10 configuration tomcat environment variables tutorial diagram

Before configuration, we need to do the following...