How to add a column to a large MySQL table

The question comes from https://www.zhihu.com/question/440231149: a MySQL table holds 300 million rows and has not been split into multiple tables. A column needs to be added to this large table. The database cannot be stopped, and inserts, deletes, and updates continue throughout. How can this be done? The answer below is original.

In older versions of MySQL, a column is added with:

ALTER TABLE your_table ADD COLUMN new_column CHAR(128);

This locks the table. The process is roughly as follows:

  • Create a new table, Table2, with the same structure as Table1
  • Add a write lock to Table1
  • Execute ALTER TABLE Table2 ADD COLUMN new_column CHAR(128) on the new table
  • Copy the data in Table1 to Table2
  • Rename Table2 to Table1, remove the old Table1, and release all related locks
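The steps above can be sketched by hand in SQL. This is only an illustration of what the server does internally, not something to run against a live table. The names table1, table2, table1_old and the columns id and name are hypothetical, and the sketch assumes that writes to table1 are blocked while the copy runs.

```sql
-- Hypothetical names: table1 is the existing table, table2 the rebuilt copy.
-- Assumption: no session writes to table1 between the copy and the rename.

-- 1. Create an empty table with the same structure.
CREATE TABLE table2 LIKE table1;

-- 2. Add the column on the empty copy, which is cheap.
ALTER TABLE table2 ADD COLUMN new_column CHAR(128);

-- 3. Copy every row; this is the step that takes a long time on a large table.
INSERT INTO table2 (id, name)
SELECT id, name FROM table1;

-- 4. Swap the names in one atomic statement, then drop the old table.
RENAME TABLE table1 TO table1_old, table2 TO table1;
DROP TABLE table1_old;
```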

If the table is extremely large, the lock is held for a long time. All updates to the table are blocked during that time, and online business cannot run normally.

For versions earlier than MySQL 5.6 (5.6 itself not included), triggers can be used to replay every change made to the original table onto a new table that already has the new column, keeping the two in sync. When synchronization is complete, the new table is renamed to take the original table's name and goes live, so the business is never suspended. The trigger setup is similar to:

CREATE TRIGGER person_trigger_update AFTER UPDATE ON original_table FOR EACH ROW
BEGIN
  SET @x = 'trigger UPDATE';
  -- Copy the updated row; list the columns explicitly if new_table already has the extra column
  REPLACE INTO new_table SELECT * FROM original_table WHERE original_table.id = NEW.id;
END;
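An UPDATE trigger alone does not cover inserts, deletes, or rows that are never touched. A minimal sketch of the remaining pieces follows, assuming hypothetical table names original_table and new_table, a primary key id, and one other column name. The trigger names and the chunk range are illustrative as well.

```sql
-- Hypothetical schema: original_table(id, name), new_table(id, name, new_column).

-- Mirror new rows into the new table.
CREATE TRIGGER person_trigger_insert AFTER INSERT ON original_table FOR EACH ROW
  REPLACE INTO new_table (id, name) VALUES (NEW.id, NEW.name);

-- Mirror deletions.
CREATE TRIGGER person_trigger_delete AFTER DELETE ON original_table FOR EACH ROW
  DELETE FROM new_table WHERE id = OLD.id;

-- Backfill existing rows in small chunks; IGNORE keeps rows the triggers
-- have already written, since those are newer than the backfilled copy.
INSERT IGNORE INTO new_table (id, name)
SELECT id, name FROM original_table WHERE id BETWEEN 1 AND 10000;

-- Once every chunk is copied, swap the names atomically.
RENAME TABLE original_table TO original_table_old, new_table TO original_table;
```

The triggers stay attached to the renamed old table, so they should be dropped together with it once the switch is verified.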

MySQL 5.6 and later versions provide the online DDL feature:

ALTER TABLE your_table <alter operation>, ALGORITHM [=] {DEFAULT | INSTANT | INPLACE | COPY}, LOCK [=] {DEFAULT | NONE | SHARED | EXCLUSIVE};

The parameters are:

ALGORITHM:

  • DEFAULT : The default mode. In MySQL 8.0, if ALGORITHM is not specified, INSTANT is tried first. If the operation does not support it, INPLACE is used. If INPLACE is not supported either, COPY is used.
  • INSTANT : A new algorithm added in MySQL 8.0 (8.0.12). The statement that adds the column returns immediately. It is not available for tables that use ROW_FORMAT=COMPRESSED. The principle is simple: when the column is created, the existing rows in the table are not changed. Only the column and its default value are recorded in the data dictionary. For the default Dynamic row format (which is a variant of Compact), if the column is later updated in a row, the original record is marked as deleted and the updated record is appended at the end. Because no space is reserved for the new column in advance, subsequent updates may often change the size of the row record. For most businesses only the most recent records are modified, so this is not a big problem.
  • INPLACE : Modify the original table directly without copying it to a temporary table. Records are modified one by one, without generating a large amount of undo log and redo log and without occupying a lot of buffer pool. This avoids the IO and CPU cost of a table copy and keeps performance and concurrency good while the operation runs.
  • COPY : Copy the data to a new temporary table and modify it there. Copying the records generates a large amount of undo log and redo log and occupies a lot of buffer pool, which affects business performance.

LOCK:

  • DEFAULT : Similar to ALGORITHM's DEFAULT: the least restrictive lock level that the operation supports is used
  • NONE : No locks, allowing concurrent reading and updating of the table
  • SHARED : Shared lock, which allows reading but not updating
  • EXCLUSIVE : Reading and updating are not allowed

Comparison of the online DDL algorithms supported by each version:

[Image: per-version comparison table; see the reference documents below]

Reference Documents:

MySQL 5.6: https://dev.mysql.com/doc/refman/5.6/en/innodb-online-ddl-operations.html

MySQL 5.7: https://dev.mysql.com/doc/refman/5.7/en/innodb-online-ddl-operations.html

MySQL 8.0: https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-operations.html

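In MySQL 8.0.12 and later, the column can be added online with:

ALTER TABLE your_table ADD COLUMN new_column CHAR(128), ALGORITHM=INSTANT;

Only LOCK=DEFAULT is permitted together with ALGORITHM=INSTANT, so the LOCK clause is omitted here. In MySQL 5.6 and 5.7, where INSTANT is not available, use ALGORITHM=INPLACE, LOCK=NONE instead. It is best to specify ALGORITHM (and LOCK where applicable) explicitly. If the server cannot honor the request, the statement fails instead of silently falling back to a more disruptive method, so you know in advance how much impact the DDL will have on online business.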
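One reason to spell out ALGORITHM and LOCK is that the server rejects the statement when it cannot honor them. The sketch below illustrates this with a column data type change, which only the COPY algorithm supports. The names your_table and new_column are placeholders, and the target type TEXT is chosen only for illustration.

```sql
-- Placeholder names: your_table, new_column.

-- Rejected with an error before any data is touched:
-- a data type change cannot be done in place.
ALTER TABLE your_table MODIFY COLUMN new_column TEXT, ALGORITHM=INPLACE, LOCK=NONE;

-- Accepted, but it copies the whole table and blocks writes while it runs,
-- so it should be a deliberate choice rather than a silent fallback.
ALTER TABLE your_table MODIFY COLUMN new_column TEXT, ALGORITHM=COPY, LOCK=SHARED;
```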

The process of executing online DDL is roughly as follows:

[Image: online DDL execution flow]

It can be seen that a metadata lock is required in the initial stage. Metadata locks were introduced in MySQL 5.5. Earlier versions had similar mechanisms for protecting metadata, but without an explicit metadata lock concept. There is a significant difference in how metadata is protected before 5.5 (for example, in 5.1) and from 5.5 onward: 5.1 protects metadata at the statement level, while 5.5 protects it at the transaction level. Statement level means that once a statement finishes, other sessions can change the table structure, whether or not the enclosing transaction has committed or rolled back. Transaction level means that the metadata lock is released only when the transaction ends.

Metadata locks mainly solve two problems. The first is transaction isolation. For example, under the repeatable read isolation level, if the table structure is modified between two queries that session A runs in the same transaction, the two results will be inconsistent, which violates repeatable read. The second is replication. For example, if session A executes several update statements and session B changes the table structure and commits first, the replica replays the ALTER before the updates, which can cause replication errors.
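The transaction-level behavior can be observed with two client sessions on MySQL 5.5 or later. The timeline below is a sketch, assuming a placeholder table your_table with a key column id. The statements are run in the order shown, each in the session named in the comment.

```sql
-- Session A
START TRANSACTION;
SELECT * FROM your_table WHERE id = 1;   -- takes a shared metadata lock on your_table

-- Session B, while session A's transaction is still open
ALTER TABLE your_table ADD COLUMN new_column CHAR(128);   -- waits for the metadata lock

-- Session A
SELECT * FROM your_table WHERE id = 1;   -- still sees the old table structure
COMMIT;                                  -- releases the metadata lock; session B proceeds
```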

If many transactions are running and some of them contain large queries, such as:

START TRANSACTION;
SELECT COUNT(*) FROM your_table;

then the long-running transaction holds its metadata lock on the table until it ends. The DDL has to wait for that lock, and statements that reach the table after the DDL are blocked behind it as well.
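Before starting the DDL, it can help to check whether such transactions are open. A minimal sketch using the standard information_schema.innodb_trx table follows. The LIMIT of 10 is arbitrary.

```sql
-- Oldest open InnoDB transactions first; trx_mysql_thread_id matches the Id
-- column of SHOW PROCESSLIST.
SELECT trx_mysql_thread_id, trx_started, trx_state, trx_query
FROM information_schema.innodb_trx
ORDER BY trx_started
LIMIT 10;

-- Sessions stuck behind a DDL show the state "Waiting for table metadata lock".
SHOW FULL PROCESSLIST;
```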

So, in principle:

  • Avoid large transactions
  • Make table structure changes during off-peak business hours
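As a further safeguard, the time the DDL may wait for a metadata lock can be bounded with the lock_wait_timeout system variable, so that a blocked ALTER gives up instead of queueing other statements behind it. This is a sketch. The 5-second value and the names your_table and new_column are assumptions for illustration.

```sql
-- Give up after 5 seconds if the metadata lock cannot be acquired
-- (the default timeout is one year).
SET SESSION lock_wait_timeout = 5;

-- Fails with a lock wait timeout error if a long transaction still holds
-- the lock; the statement can simply be retried later.
ALTER TABLE your_table ADD COLUMN new_column CHAR(128);
```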
