Reasons and optimization solutions for slow MySQL limit paging with large offsets

In MySQL, we usually implement paging with limit, but once the table grows large, the further you page, the slower the response becomes.

This article examines why limit paging is slow with large offsets and how to optimize it. To simulate the situation, we first introduce the table structure and the SQL to be executed.

Scenario simulation

Create table statement

The structure of the user table is simple: an id, a sex column, and name columns. To make the change in SQL execution time more obvious, the table has nine name columns.

CREATE TABLE `user` (
 `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
 `sex` tinyint(4) NULL DEFAULT NULL COMMENT 'Gender 0-Male 1-Female',
 `name1` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name2` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name3` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name4` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name5` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name6` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name7` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name8` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name9` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 PRIMARY KEY (`id`) USING BTREE,
 INDEX `sex`(`sex`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 9000001 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;

Data filling

A stored procedure is used to fill the table with 9,000,000 records. After the procedure finishes, a separate SQL statement updates the gender field.

Note: the procedure takes a long time to run; on my machine it took 617.284 seconds.

-- Use a custom delimiter so the procedure body can contain ';'
DELIMITER $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `data`()
begin
 declare i int;
 set i = 1;
 -- Insert 9,000,000 rows; every name column simply reuses the counter value
 while (i <= 9000000) do
  insert into user values (i, 0, i, i, i, i, i, i, i, i, i);
  set i = i + 1;
 end while;
end$$
DELIMITER ;
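
The procedure is then invoked once to populate the table; this single call is the long-running step mentioned above:

-- Populate the table with 9,000,000 rows (the long-running step)
call data();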

-- Set the gender of users with even ids to 1 (female)
update user set sex = 1 where id % 2 = 0;
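
As a quick sanity check, an aggregate should show the two sex values splitting the 9,000,000 rows exactly in half (4,500,000 each, since every second id is even):

-- Expect two rows: sex = 0 and sex = 1, with 4,500,000 rows each
select sex, count(*) from user group by sex;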

SQL and execution time

SQL                                                      Execution time
select * from user where sex = 1 limit 100, 10;          0.005000s
select * from user where sex = 1 limit 1000, 10;         0.007000s
select * from user where sex = 1 limit 10000, 10;        0.016000s
select * from user where sex = 1 limit 100000, 10;       0.169000s
select * from user where sex = 1 limit 1000000, 10;      5.892000s
select * from user where sex = 1 limit 10000000, 10;     33.465000s

As you can see, the larger the limit offset, the longer the execution time.

Cause Analysis

First, let's walk through how this SQL executes, taking the first row of the table above (limit 100, 10) as an example.

Since sex is an indexed column, MySQL walks the sex index tree and finds the entries with sex = 1.

Because a secondary (non-clustered) index stores only the primary key id in its leaf nodes, while the query asks for all columns, a table return occurs: for each entry with sex = 1 found in the sex index tree, the primary key id from its leaf node is used to look up the remaining columns of that row (the name columns) in the primary key index tree, and the row is then added to the result set. This is how the first row of data is produced.

Finally, the statement specifies limit 100, 10, which asks for rows 101 to 110. MySQL, however, reads the first 110 matching rows and then discards the first 100; only rows 101 to 110 remain in the result set, and execution ends.

To summarize, in the execution process above, a limit with a large offset takes longer for two reasons:

  • Querying all columns forces a table return for every row scanned
  • limit a, b reads the first a+b rows and then discards the first a rows

Putting these together: MySQL performs a table return for every one of the a+b rows it scans, yet the first a of those lookups never contribute to the result set. This wasted work is what makes the query slower and slower as the offset grows.
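
You can see this cost in the query plan. A quick sketch (the exact output varies with MySQL version and statistics):

-- Inspect the plan for a large-offset query
EXPLAIN select * from user where sex = 1 limit 1000000, 10;
-- Typically shows key = sex with a very large estimated "rows" value:
-- MySQL walks the sex index and performs a table return for every row it
-- scans, even though all but the final 10 rows are discarded.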

Optimization plan

Covering Index

Since wasted table returns are the main cause of the slowdown, the optimization is to reduce their number. For limit a, b, if we first fetch only the ids of rows a+1 to a+b and then go back to the table for the other columns, only b table returns are needed instead of a+b, which is dramatically faster.

This relies on a covering index: when a query can be satisfied entirely from a secondary (non-clustered) index, no table return to the primary key index is needed, which significantly improves performance.
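
For example, in InnoDB every secondary-index entry also stores the primary key, so an id-only query can be answered from the sex index alone; EXPLAIN should show "Using index" in the Extra column:

-- Covered by the sex index: no table return is needed
select id from user where sex = 1 limit 1000000, 10;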

Based on this idea, the optimized query first fetches the primary key ids and then retrieves the other columns by id. The optimized SQL and execution times are shown in the table below.

Optimized SQL                                                                                    Execution time
select * from user a join (select id from user where sex = 1 limit 100, 10) b on a.id=b.id;      0.000000s
select * from user a join (select id from user where sex = 1 limit 1000, 10) b on a.id=b.id;     0.000000s
select * from user a join (select id from user where sex = 1 limit 10000, 10) b on a.id=b.id;    0.002000s
select * from user a join (select id from user where sex = 1 limit 100000, 10) b on a.id=b.id;   0.015000s
select * from user a join (select id from user where sex = 1 limit 1000000, 10) b on a.id=b.id;  0.151000s
select * from user a join (select id from user where sex = 1 limit 10000000, 10) b on a.id=b.id; 1.161000s

Sure enough, the execution efficiency has been significantly improved.
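
To confirm where the saving comes from, you can EXPLAIN the optimized query (a sketch; details vary by version):

EXPLAIN select * from user a join (select id from user where sex = 1 limit 1000000, 10) b on a.id = b.id;
-- The derived table b should be covered by the sex index ("Using index"),
-- while the outer table a is accessed via eq_ref on PRIMARY: only the 10
-- final rows incur a table return.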

Conditional filtering

There is also a more limited approach: conditional filtering based on ordering.

For example, to page through the user table above and get matching rows 1,000,001 to 1,000,010, the SQL can be written as follows. The subquery skips 999,999 rows and returns the id of the 1,000,000th matching row, and the outer query takes the 10 rows after it:

select * from user where sex = 1 and id > (select id from user where sex = 1 limit 999999, 1) limit 10;

However, this optimization only works under certain conditions: the primary key id must be ordered consistently with the paging order. Given such an ordering, another field such as a creation time can be used instead of the primary key, provided that field is indexed.
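
When pages are visited strictly in sequence (for example, a "load more" button), the same idea can drop the subquery entirely: remember the last id of the previous page and seek past it. A sketch, assuming hypothetically that the previous page ended at id = 2000000:

-- Keyset (seek) pagination: no offset to skip, so each page costs the same
select * from user where sex = 1 and id > 2000000 order by id limit 10;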

In short, there are many limitations to using conditional filtering to optimize limit. It is generally recommended to use covering indexes for optimization.

Summary

This article analyzed why limit paging with large offsets is slow and proposed corresponding optimizations, recommending the covering-index approach for reducing the execution time of large-offset limit queries.

I hope this helps everyone.
