Let's talk about the performance of MySQL's COUNT(*)

Let's talk about the performance of MySQL's COUNT(*)

Preface

Basically, programmers in the workplace use count(*), count(1) or count(primary key) to count the number of rows in a database table. Do you understand the differences and performance between them?

In fact, during the development process, it is a very time-consuming operation for programmers to count the total number of rows in a large table. So which method should we use to count faster?

Next, let’s talk about the method and performance of counting the total number of rows in MySQL.

Which is faster: count(*), count(1), or count(primary key)?

1. Create a table and insert 10 million records for experimental testing:

# Create a test table CREATE TABLE `t6` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `name` varchar(50) NOT NULL,
 `status` tinyint(4) NOT NULL,
 PRIMARY KEY (`id`),
 KEY `idx_status` (`status`)
)ENGINE=InnoDB DEFAULT CHARSET=utf8;

# Create a stored procedure to insert 1000w data CREATE PROCEDURE insert_1000w()
BEGIN
  DECLARE i INT;
  SET i=1;
  WHILE i<=10000000 DO
    INSERT INTO t6(name,status) VALUES('god-jiang-666',1);
    SET i=i+1;
  END WHILE;
END;

#Call the stored procedure to insert 10 million rows of data call insert_1000w();

2. Analyze the experimental results

# It took 0.572 seconds select count(*) from t6; 

insert image description here

# It took 0.572 seconds select count(1) from t6; 

insert image description here

# It took 0.580 seconds select count(id) from t6; 

insert image description here

# It took 0.620 seconds select count(*) from t6 force index (primary); 

insert image description here

From the above experiment, we can conclude that count(*) and count(1) are the fastest, followed by count(id), and the slowest is when count uses a mandatory primary key .

Let's continue to test their respective execution plans:

explain select count(*) from t6;
show warnings; 

insert image description here

insert image description here

explain select count(1) from t6;
show warnings; 

insert image description here

insert image description here

explain select count(id) from t6;
show warnings; 

insert image description here

insert image description here

explain select count(*) from t6 force index (primary);
show warnings; 

insert image description here

insert image description here

From the above experiment, we can draw the following three points:

  1. The MySQL query optimizer rewrites count(*) to count(0) and selects the idx_status index.
  2. Both count(1) and count(id) select the idx_statux index
  3. After adding force index (primary), the forced index is used.

This idx_status is equivalent to a secondary auxiliary index tree. Its purpose is to illustrate that when InnoDB processes count(*), if there is an auxiliary index tree, it will give priority to the auxiliary index tree to count the total number of rows.

To verify the conclusion that count(*) will give priority to the auxiliary index tree, let's continue with the following experiment:

# Delete the idx_status index and continue to execute count(*)
alter table t6 drop index idx_status;

explain select count(*) from t6; 

insert image description here

From the above experiment, we can conclude that if the auxiliary index tree idx_status is deleted, count(*) will choose to use the primary key index. So the conclusion is: count(*) will give priority to the auxiliary index. If there is no auxiliary index, the primary key index will be used.

Why does count(*) prefer secondary indexes?

Prior to MySQL 5.7.18, InnoDB processed count(*) statements by scanning the clustered index.

Starting with MySQL 5.7.18, InnoDB handles COUNT(*) statements by traversing the smallest available secondary index. If no secondary index exists, the clustered index is scanned.

Why does the new version use secondary indexes to process count(*)?

Because the leaf nodes of the InnoDB secondary index tree store the primary key, and the leaf nodes of the primary key index tree store the entire row of data, the secondary index tree is smaller than the primary key index tree. Therefore, the query optimizer gives priority to secondary indexes based on cost considerations. So index count(*) is faster than count(primary key).

Summarize

The conclusion of this article is count(*)=count(1)>count(id) .

Why is count(id) slower without the primary key index? Because count(id) needs to retrieve the primary key, then check whether it is empty, and then accumulate it, which is more expensive.

Count(*) will count all NOT NULL and NULL fields, while count(id) will not count NULL fields, so we should try to use NOT NULL when building tables and give it a default of empty.

Finally, when you want to total the number of rows in a database table in the future, you can boldly use count(*) or count(1).

References

  • Chapter 6 of "High Performance MySQL" (3rd Edition) Optimizing COUNT() Queries
  • "MySQL Practice 45 Lectures" by Lin Xiaobin

This is the end of this article about the performance of MySQL COUNT(*). For more information about MySQL COUNT(*), please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Optimized implementation of count() for large MySQL tables
  • Usage and performance optimization techniques of aggregate function count in MySQL
  • Sharing on count optimization issues in innodb in mysql
  • Detailed explanation of the correct use of the count function in MySQL
  • A brief discussion on MySQL count of rows
  • Summary of methods to improve mysql count
  • Detailed explanation of count without filter conditions in MySQL
  • Summary of the differences between count(*), count(1) and count(col) in MySQL
  • Detailed explanation of mySQL count data examples in multiple tables
  • Use and optimization of MySQL COUNT function

<<:  Limit HTML text box input to only numbers and decimal points

>>:  Use non-root users to execute script operations in docker containers

Recommend

web.config (IIS) and .htaccess (Apache) configuration

xml <?xml version="1.0" encoding=&qu...

The difference between absolute path and relative path in web page creation

1. Absolute path First of all, on the local compu...

Install mysql 5.6 from yum source in centos7.4 system

System environment: centos7.4 1. Check whether th...

Detailed explanation of CSS pre-compiled languages ​​and their differences

1. What is As a markup language, CSS has a relati...

In-depth understanding of uid and gid in docker containers

By default, processes in the container run with r...

How to create a flame effect using CSS

The main text starts below. 123WORDPRESS.COM Down...

Comprehensive understanding of html.css overflow

Comprehensive understanding of html.css overflow ...

WeChat applet implements fixed header and list table components

Table of contents need: Function Points Rendering...

HTML6 implements folding menu and accordion menu example code

The main part of the page: <body> <ul id...

CSS style to center the HTML tag in the browser

CSS style: Copy code The code is as follows: <s...

Mysql5.6.36 script compilation, installation and initialization tutorial

Overview This article is a script for automatical...

HTML cellpadding and cellspacing attributes explained in pictures

Cell -- the content of the table Cell margin (tabl...

Zabbix monitoring docker application configuration

The application of containers is becoming more an...