A simple explanation of MySQL parallel replication

A simple explanation of MySQL parallel replication

1. Background of Parallel Replication

First of all, why is there this concept of parallel replication?

1. DBAs should know that MySQL replication is based on binlog.

2. MySQL replication consists of two parts, IO thread and SQL thread.

3. The IO thread is mainly used to pull the binlog passed by the receiving Master and write it to the relay log

4. The SQL thread is mainly responsible for parsing the relay log and applying it to the slave

5. In any case, the IO and SQL threads are single-threaded, while the master is multi-threaded, so there will inevitably be delays. In order to solve this problem, multi-threading came into being.

6. IO multithreading?

6.1 There is no need for multithreading of IO, because IO threads are not the bottleneck.

7. SQL multithreading?

7.1 Yes, the latest 5.6, 5.7, and 8.0 all implement multi-threading on the SQL thread to improve the concurrency of the slave.

Next, let’s take a look at MySQL’s efforts and achievements in parallel replication.

II. Key Points

Whether it can be done in parallel depends on whether there are lock conflicts between multiple transactions. This is the key. The following parallel replication principle is to see how to avoid lock conflicts

MySQL 5.6 schema-based parallel replication

slave-parallel-type=DATABASE (transactions in different databases, no lock conflicts)

As mentioned before, the purpose of parallel replication is to make the slave run as multi-threaded as possible. Of course, multi-threading based on the library level is also a way (transactions in different libraries, no lock conflicts)

Let's talk about the advantages first: It is relatively simple to implement and easy to use for users. Then let's talk about the disadvantages: Because it is based on the library, the granularity of parallelism is very coarse. Now the architecture of many companies is one library and one instance. For such an architecture, 5.6 parallel replication is powerless. Of course, there is also the order of master and slave transactions, which is also a big problem for 5.6

Without further ado, here are some pictures

4. MySQL 5.7 Parallel replication based on group commit

slave-parallel-type=LOGICAL_CLOCK : Commit-Parent-Based mode (transactions in the same group [same last-commit], no lock conflicts. In the same group, there must be no conflicts, otherwise they cannot be in the same group)
slave-parallel-type=LOGICAL_CLOCK : Lock-Based mode (even if the transactions are not in the same group, as long as there is no lock conflict [prepare phase] between the transactions, they can be concurrent. If they are not in the same group, as long as the prepare phases of N transactions can overlap, there is no lock conflict)

Group commit is described in detail in previous articles and will not be explained here. When MySQL5.7 commits a group, it also marks the transactions of each group. Now I think it is for the convenience of MTS.

Let's look at a set of binlogs first

last_committed=0 sequence_number=1
last_committed=1 sequence_number=2
last_committed=2 sequence_number=3
last_committed=3 sequence_number=4
last_committed=4 sequence_number=5
last_committed=4 sequence_number=6
last_committed=4 sequence_number=7
last_committed=6 sequence_number=8
last_committed=6 sequence_number=9
last_committed=9 sequence_number=10

4.1 Commit-Parent-Based Mode

4.2 Lock-Based Mode


5. MySQL 8.0 parallel replication based on write-set

Conflict detection based on primary key (binlog_transaction_depandency_tracking = COMMIT_ORDERE|WRITESET|WRITESET_SESSION, parallelization is possible if the primary key or non-empty unique key of the modified row does not conflict)
5.7.22 also supports the write-set mechanism

Transaction dependency: binlog_transaction_depandency_tracking = COMMIT_ORDERE|WRITESET|WRITESET_SESSION

COMMIT_ORDERE: Continue with group-based commit
WRITESET: Determine transaction dependencies based on write sets
WRITESET_SESSION: Based on the write set, but transactions in the same session will not have the same last_committed

Transaction detection algorithm: transaction_write_set_extraction = OFF| XXHASH64 | MURMUR32

MySQL will have a variable to store the HASH value of the submitted transaction. The values ​​of the primary key (or unique key) modified by all submitted transactions will be compared with the set of that variable after hashing to determine whether the modified row conflicts with it, and thus determine the dependency relationship.

The variable mentioned here can be set in size by: binlog_transaction_dependency_history_size

This granularity is at the row level. At this time, the parallel granularity is finer and the parallel speed is faster. In some cases, it is not an exaggeration to say that the parallelism of the slave exceeds that of the master (the master is a single-threaded write, and the slave can also play back in parallel)

6. How to make the slave's parallel replication and the master's transaction execution order consistent?

After 5.7.19, you can set slave_preserve_commit_order = 1

Official explanation:

For multithreaded slaves, enabling this variable ensures that transactions are externalized on the slave in the same order as they appear in the slave's relay log.
Setting this variable has no effect on slaves for which multithreading is not enabled.
All replication threads (for all replication channels if you are using multiple replication channels) must be stopped before changing this variable.
--log-bin and --log-slave-updates must be enabled on the slave.
In addition --slave-parallel-type must be set to LOGICAL_CLOCK.
Once a multithreaded slave has been started, transactions can begin to execute in parallel.
With slave_preserve_commit_order enabled, the executing thread waits until all previous transactions are committed before committing.
While the slave thread is waiting for other workers to commit their transactions it reports its status as Waiting for preceding transaction to commit.

The general implementation principle is: the excecution phase can be executed in parallel, and the binlog flush is performed in sequence. When the engine layer commits, it is also completed in the queue order according to binlog_order_commit

In other words, if this parameter is set, the slave will parallelize as the master does.

Summarize

The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. If you have any questions, you can leave a message to communicate. Thank you for your support for 123WORDPRESS.COM.

You may also be interested in:
  • A brief analysis of MySQL's WriteSet parallel replication
  • A brief analysis of MySQL parallel replication
  • MySQL5.7 parallel replication principle and implementation

<<:  30 minutes to give you a comprehensive understanding of React Hooks

>>:  Tomcat components illustrate the architectural evolution of a web server

Recommend

jQuery implements employee management registration page

This article example shares the specific code of ...

Database query optimization: subquery optimization

1. Case Take all employees who are not the head o...

Calling the search engine in the page takes Baidu as an example

Today, it suddenly occurred to me that it would be...

Implementation of CSS text shadow gradually blurring effect

text-shadow Add a shadow to the text. You can add...

In-depth study of how to use positioning in CSS (summary)

Introduction to Positioning in CSS position attri...

Gallery function implemented by native Js

Table of contents The first The second Native Js ...

How to use type enhancement without typingscript

Preface Due to the weak typing of JS, loose writi...

Vue template compilation details

Table of contents 1. parse 1.1 Rules for intercep...

Importance of background color declaration when writing styles

As the title says, otherwise when the page is revi...

Use personalized search engines to find the personalized information you need

Many people now live on the Internet, and searchin...

Detailed explanation of Kubernetes pod orchestration and lifecycle

Table of contents K8S Master Basic Architecture P...

Example of disabling browser cache configuration in Vue project

When releasing a project, you will often encounte...

Solution to MySQL unable to read table error (MySQL 1018 error)

1. Error reproduction I can access the MySQL data...

Three ways to delete a table in MySQL (summary)

drop table Drop directly deletes table informatio...