Optimized implementation of count() for large MySQL tables

The following is my judgment based on the data structure of the B+ tree and my speculation on the experimental results. If there are any errors, please correct me!

Today I experimented with the optimization of MySQL count() operation. The following discussion is based on MySQL 5.7 InnoDB storage engine and x86 Windows operating system.

The structure of the created table is as follows (the amount of data is 1 million):

表結構

First, the question is which is faster among MySQL's count(*), count(PK), and count(1).
The results are as follows:

這里寫圖片描述

There is no difference! After adding the WHERE clause, the time for the three queries is also the same, so I won’t post the pictures.

When I was working in the company before, I wrote a SQL statement of select count(*) from table , which was very slow when there was a lot of data. So how to optimize it?

This starts with the InnoDB index. The InnoDB index is B+Tree.

For the primary key index: it only stores data on leaf nodes, its key is the primary key , and its value is the entire data .
For auxiliary indexes: key is the column for creating the index, and value is the primary key.

This gives us two pieces of information:
1. The entire data will be found according to the primary key
2. Only the primary key can be found based on the secondary index, and then the remaining information must be found through the primary key.

So if we want to optimize the count(*) operation, we need to find a short column and create a secondary index for it.
In my case it is status , although its "severelity" is almost 0.

Create an index first: ALTER TABLE test1 ADD INDEX ( status );
Then query as shown below:

這里寫圖片描述

It can be seen that the query time dropped from 3.35s to 0.26s, and the query speed increased by nearly 13 times .

If the index is the str column, what will the result be?
Create an index first: alter table test1 add index (str)
The results are as follows:

這里寫圖片描述

As you can see, the time is 0.422s, which is also very fast, but it is still about 1.5 times slower than status column.

To be more daring, I will do an experiment. I will delete the index of the status column, create a joint index of status and left(omdb,200) (this column has an average of 1000 characters), and then check the query time.
Create an index: alter table test1 add index ( status ,omdb(200))
The results are as follows:

這里寫圖片描述

The time is 1.172s

alter table test1 add index (status,imdbid);

Replenish! !
Be careful about index failures!
After the index is created, it looks like this:

這里寫圖片描述

You can see that key_len is 6, and Extra's description is using index.

If the index fails:

這里寫圖片描述

There are many situations in which indexes become invalid, such as using functions, != operations, etc. For details, please refer to the official documentation.

I haven't done a deep study on MySQL, and the above is based on my judgment based on the B+ tree data structure and speculation on experimental results. If there are any mistakes, please correct me!

This is the end of this article about the optimized implementation of count() for large MySQL tables. For more relevant content on count() optimization for large MySQL tables, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:

Usage and performance optimization techniques of aggregate function count in MySQL
Sharing on count optimization issues in innodb in mysql
Let's talk about the performance of MySQL's COUNT(*)
Detailed explanation of the correct use of the count function in MySQL
A brief discussion on MySQL count of rows
Summary of methods to improve mysql count
Detailed explanation of count without filter conditions in MySQL
Summary of the differences between count(*), count(1) and count(col) in MySQL
Detailed explanation of mySQL count data examples in multiple tables
Use and optimization of MySQL COUNT function

<<: Common HTML tag writing errors

>>: Web development tutorial cross-domain solution detailed explanation

How to change mysql password under Centos

Optimized implementation of count() for large MySQL tables

How to change mysql password under Centos

JS generates unique ID methods: UUID and NanoID

CSS isolation issue in Blazor

The complete process of iptables rules in Docker being lost after iptables restart

Solution to the problem of the entire page not being centered when using margin:0 auto in HTML

Teach you how to implement Vue3 Reactivity

Methods to enhance access control security in Linux kernel

Create a code example of zabbix monitoring system based on Dockerfile

Implementation steps for Docker deployment of SpringBoot applications

JavaScript implements draggable progress bar

Recommend

MySQL database operation and maintenance data recovery method

MySQL joint table query basic operation left-join common pitfalls

Detailed explanation of lazy loading and preloading of webpack

Mysql query database capacity method steps

How to enable MySQL remote connection in Linux server

Detailed explanation of MySQL syntax, special symbols and regular expressions

MySQL index principle and query optimization detailed explanation

Using radial gradient in CSS to achieve card effect

Detailed steps to install Nginx on Linux

Summary of solutions for MySQL not supporting group by

Linux type version memory disk query command introduction

How to limit the number of records in a table in MySQL

Let's talk in detail about whether setState in React is a macro task or a micro task

Detailed tutorial on installing qt5.12.8 and environment configuration on ubuntu18.04

Zabbix monitoring docker application configuration