A case study to thoroughly understand how to correctly use MySQL InnoDB joint indexes

A business feature queries the 5 most recently audited rows:

SELECT `id`, `title`
FROM `th_content`
WHERE `audit_time` < 1541984478
 AND `status` = 'ONLINE'
ORDER BY `audit_time` DESC, `id` DESC
LIMIT 5;

Monitoring at the time showed CPU usage above 100%, and SHOW PROCESSLIST showed many similar queries in the Creating sort index state.

Here is the structure of the table:

CREATE TABLE `th_content` (
 `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
 `title` varchar(500) CHARACTER SET utf8 NOT NULL DEFAULT '' COMMENT 'Content title',
 `content` mediumtext CHARACTER SET utf8 NOT NULL COMMENT 'Main content',
 `audit_time` int(11) unsigned NOT NULL DEFAULT '0' COMMENT 'Audit time',
 `last_edit_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Last edit time',
 `status` enum('CREATED','CHECKING','IGNORED','ONLINE','OFFLINE') CHARACTER SET utf8 NOT NULL DEFAULT 'CREATED' COMMENT 'Information status',
 PRIMARY KEY (`id`),
 KEY `idx_at_let` (`audit_time`,`last_edit_time`)
)ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

The table has a joint index whose leftmost column is audit_time, and no index on status.

Let's analyze how the SQL above is executed:

1. From the joint index, find the primary key IDs of all rows whose audit_time is less than the given timestamp. (If 1 million rows were audited before this timestamp, 1 million primary key IDs are read from the joint index.)
2. Sort those primary key IDs. This is an optimization that already exists: in the next step, the table lookup, primary keys that are close together may be fetched in a single disk I/O.
3. Look up the rows in the table one by one, 1 million rows in total, and keep those with status = 'ONLINE'.
4. Finally, sort the remaining rows (if 500,000 of them are ONLINE, all 500,000 are sorted).

So although only 5 rows are returned, the extreme example above reads 1 million rows from the table and then sorts 500,000 of them in memory.

This is very inefficient.
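The four steps above can be sketched as a small simulation. This is only an illustration under stated assumptions, not how InnoDB is implemented: the toy rows, the `original_plan` function, and the in-memory lists standing in for the index and the table are all hypothetical, and last_edit_time is left out of the index for brevity.

```python
import bisect

# Hypothetical toy table: (id, audit_time, status). Not the author's data.
rows = [
    (1, 100, "ONLINE"), (2, 110, "OFFLINE"), (3, 120, "ONLINE"),
    (4, 120, "ONLINE"), (5, 130, "CHECKING"), (6, 140, "ONLINE"),
    (7, 150, "ONLINE"), (8, 160, "ONLINE"), (9, 170, "OFFLINE"),
    (10, 200, "ONLINE"),
]
table = {r[0]: r for r in rows}  # stands in for the clustered index: id -> row

# Stands in for the secondary index: entries sorted by (audit_time, id).
idx_audit = sorted((audit_time, pk) for pk, audit_time, _ in rows)


def original_plan(cutoff, limit=5):
    # Step 1: range scan, collect every entry with audit_time < cutoff.
    end = bisect.bisect_left(idx_audit, (cutoff,))
    candidate_ids = [pk for _, pk in idx_audit[:end]]
    # Step 2: sort the primary keys so neighbouring rows are read together.
    candidate_ids.sort()
    # Step 3: one table lookup per candidate, then filter on status.
    fetched = [table[pk] for pk in candidate_ids]
    online = [r for r in fetched if r[2] == "ONLINE"]
    # Step 4: sort everything that survived, then keep only `limit` rows.
    online.sort(key=lambda r: (r[1], r[0]), reverse=True)
    return online[:limit], len(fetched), len(online)


result, lookups, sorted_rows = original_plan(cutoff=180)
print([r[0] for r in result], lookups, sorted_rows)
```

Even in this toy case, the number of table lookups and the number of rows sorted depend on how many rows fall below the cutoff, not on the LIMIT.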

The schematic diagram illustrates the index lookup in the first step. The pink part marks the rows that eventually have to be fetched from the table.

I made up the data in the diagram according to the index storage rules. If anything is wrong, please leave a comment to point it out. I hope the diagram shows how a joint index is stored and how it is searched.

Improvement idea 1

A range condition makes it hard to use an index well. What would improve if we added a joint index on audit_time and status?

ALTER TABLE `th_content` ADD INDEX `idx_audit_status` (`audit_time`, `status`);

mysql> explain select `id`, `title` from `th_content` where `audit_time` < 1541984478 and `status` = 'ONLINE' order by `audit_time` desc, `id` desc limit 5;
+----+-------------+------------+-------+-----------------------------------+------------------+---------+------+--------+-------------+
| id | select_type | table      | type  | possible_keys                     | key              | key_len | ref  | rows   | Extra       |
+----+-------------+------------+-------+-----------------------------------+------------------+---------+------+--------+-------------+
|  1 | SIMPLE      | th_content | range | idx_at_ft_pt_let,idx_audit_status | idx_audit_status | 4       | NULL | 209754 | Using where |
+----+-------------+------------+-------+-----------------------------------+------------------+---------+------+--------+-------------+

Detail: because audit_time is a range condition, the second index column cannot be used to narrow the search. Only audit_time is used, so key_len is 4. In idea 2 below, both columns are used, so key_len is 5.
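The key_len arithmetic can be checked with a few lines. This is a minimal sketch: the `key_len` helper and the `COLUMN_BYTES` table are hypothetical names, and the byte sizes assume an INT column (4 bytes) and an ENUM with at most 255 members (1 byte), both declared NOT NULL so no extra NULL-flag byte is counted.

```python
# Bytes each column contributes to key_len (assumptions stated above).
COLUMN_BYTES = {"audit_time": 4, "status": 1}


def key_len(used_columns):
    """Sum the sizes of the index columns the optimizer actually uses."""
    return sum(COLUMN_BYTES[c] for c in used_columns)


print(key_len(["audit_time"]))            # idea 1: only the range column is used
print(key_len(["status", "audit_time"]))  # idea 2: both columns are used
```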

Let's analyze the execution after adding this index:

1. In the joint index, locate the entry with the largest audit_time that is less than the given timestamp.
2. Scan from there one entry at a time. Because the condition on audit_time is a range, the values of the second index column are scattered, so each entry must be checked against status = 'ONLINE' in turn until the 5th match is found.
3. Go back to the table to fetch the data required for those rows.

In the diagram above, pink marks the entries that satisfy the condition on the first column. Scanning them one by one finds three matching records on this leaf node. The scan then continues to the left in the previous leaf node until 5 matching rows are found, and finally goes back to the table.

Improvements

Because the status value is in the index, filtering on status = 'ONLINE' does not require a table lookup. Only 5 rows are fetched from the table, which greatly reduces IOPS.

Disadvantages of this index

If the first five entries scanned in idx_audit_status all have status ONLINE, only five entries need to be scanned.

If only 4 of the first 1 million entries scanned in idx_audit_status have status ONLINE, at least 1,000,001 entries must be scanned to get the required 5 rows. The number of index entries scanned is unpredictable.
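The best and worst cases can be reproduced with a toy model of the (audit_time, status) index. This is a sketch under assumptions: `scan_audit_status` is a hypothetical function, the index is a sorted Python list of (audit_time, status, id) tuples, and 1,000 entries stand in for the 1 million in the text.

```python
import bisect


def scan_audit_status(index, cutoff, limit=5):
    """Walk backward from the cutoff, checking status on every entry.

    `index` is a list of (audit_time, status, id) tuples sorted ascending.
    Returns the matching ids and the number of index entries scanned.
    """
    pos = bisect.bisect_left(index, (cutoff,)) - 1
    scanned, hits = 0, []
    while pos >= 0 and len(hits) < limit:
        _, status, pk = index[pos]
        scanned += 1
        if status == "ONLINE":
            hits.append(pk)
        pos -= 1
    return hits, scanned


# Best case: every entry is ONLINE, so the scan stops after 5 entries.
best = [(t, "ONLINE", t) for t in range(1, 1001)]
# Worst case: only the 5 oldest entries are ONLINE, so the whole range is scanned.
worst = [(t, "ONLINE" if t <= 5 else "OFFLINE", t) for t in range(1, 1001)]

print(scan_audit_status(best, cutoff=1001))
print(scan_audit_status(worst, cutoff=1001))
```

The same query costs 5 entries in one data distribution and 1,000 in the other, which is the unpredictability described above.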

Improvement idea 2

ALTER TABLE `th_content` DROP INDEX `idx_audit_status`;
ALTER TABLE `th_content` ADD INDEX `idx_status_audit` (`status`, `audit_time`);

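With status as the leading column, status = 'ONLINE' becomes an equality match, and within that value the index entries are already ordered by audit_time and then by the primary key id. MySQL can seek to the given timestamp and read 5 entries backward, so neither the sort nor the table lookup is a burden.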
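A toy model of the (status, audit_time) index shows why the number of entries read no longer depends on the data distribution. This is a sketch under assumptions: `scan_status_audit` is a hypothetical function, the index is a sorted Python list of (status, audit_time, id) tuples, and status values are compared as plain strings rather than as enum positions, which does not affect the result here.

```python
import bisect


def scan_status_audit(index, status, cutoff, limit=5):
    """Seek to (status, cutoff) and read `limit` entries backward.

    `index` is a list of (status, audit_time, id) tuples sorted ascending.
    Returns the matching ids and the number of index entries scanned.
    """
    pos = bisect.bisect_left(index, (status, cutoff)) - 1
    hits = []
    while pos >= 0 and len(hits) < limit and index[pos][0] == status:
        hits.append(index[pos][2])
        pos -= 1
    return hits, len(hits)


# Same skewed data as a worst case: only the 5 oldest rows are ONLINE.
index = sorted(
    ("ONLINE" if t <= 5 else "OFFLINE", t, t) for t in range(1, 1001)
)

print(scan_status_audit(index, "ONLINE", cutoff=1001))
```

Every entry read is a match, already in the requested order, so only 5 entries are scanned and no separate sort is needed.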
