Use and optimization of MySQL COUNT function

What does the COUNT function do?
The “magic” of MyISAM
Simple COUNT optimization
Use approximations
More complex optimization

What does the COUNT function do?

COUNT is a specialized function that is usually used in two different ways: counting values and rows of data. Value refers to a non-NULL expression (NULL means the value is missing). If we specify a column name or other expression in the COUNT parameter, the COUNT function counts the number of times the expression has a value. This confuses many people, in large part because the concepts of value and NULL are ambiguous.

Another form of COUNT is to simply count the number of rows in the result set. This is how MySQL calculates the COUNT function when it knows that the expression given as an argument cannot possibly be NULL. The most typical example is COUNT(*), which you might think is an alternative to expanding all the columns of a data table. In fact, it ignores entire columns and only counts the number of rows of data.

A common mistake is that we specify the column name in the COUNT parameter and think that we are counting the rows. If you want to get the number of rows in the result, you should always use COUNT(*), which makes your query more explicit and avoids performance issues.

The “magic” of MyISAM

A common misconception is that MyISAM will be very fast for COUNT queries. MyISAM's COUNT query is indeed fast, but the scenario in which this speed is fast is very limited: this effect can be achieved only when the COUNT() query is performed and there is no WHERE condition, and in reality this scenario is rare. The reason MySQL can optimize this statement is that the storage engine always knows the exact number of rows in the data table. If MySQL knows that a column col cannot be NULL, it will also convert COUNT(col) to COUNT() for optimization.

There is nothing "magic" about MyISAM when there are WHERE conditions in COUNT queries, or other methods of counting values. Might be faster or slower than other storage engines, depending on many other factors.

Simple COUNT optimization

When you want to count the number of all rows when the index coverage of the data rows is not high, you can use the COUNT(*) of the MyISAM engine for optimization. The following example uses the standard world database to demonstrate the optimization when looking up the number of cities with an ID greater than 5. The SQL statement you write might look like this:

SELECT COUNT(*) FROM world.City WHERE ID > 5;

If you check the query using SHOW STATUS you will see that 4079 rows were scanned. If you use a negative condition query and subtract the number of cities whose ID is less than or equal to 5, you will find that you can reduce the scan results to 5 rows.

SELECT (SELECT COUNT(*) FROM world.City) - COUNT(*) FROM world.City WHERE ID <= 5;

This query reads fewer rows because the query is converted to a constant during the query optimization phase, which can be seen using EXPLAIN:

id	select_type	table	rows	Extra
1	PRIMARY	City	6	Using where; Using index
2	SUBQUERY	NULL	NULL	Select tables optimized way

A common problem is how to complete the query of the number of distinct values of the same column in one query statement. For example, you want to find out the number of different colors through a query statement. You can't use a query like SELECT COUNT(color = 'blue' OR color='red') FROM items because that wouldn't tell the difference between the corresponding counts of the different colors. You also cannot put color into the WHERE condition, such as SELECT COUNT(*) FROM items WHERE color = 'blue' AND color = 'red'. Since colors are mutually exclusive, you can solve this problem as follows:

SELECT SUM(IF(color = 'blue', 1, 0)) AS blue, 
SUM(IF(color = 'red', 1, 0)) as red FROM items;

Another alternative is to use COUNT instead of SUM, which only ensures that the judgment expression of the expression without value is false:

SELECT COUNT(color = 'blue' OR NULL) as blue,
COUNT(color = 'red' OR NULL) as red FROM items;

Use approximations

Sometimes exact numbers are not necessary, and approximations can be used. The estimated number of rows given by the EXPLAIN optimizer usually satisfies this scenario, and EXPLAIN can be used instead of the actual query.

In many cases, an exact quantity is much less efficient than an approximation. A client once asked to count the number of active users on their website. The number of users is cached and updated every 30 minutes. This is inherently inaccurate, so using an estimate is acceptable. This query uses multiple WHERE conditions to ensure that inactive users or default users (those with a special ID) are not counted. Removing these conditions and slightly modifying the count operation can make it more efficient. A further optimization is to remove unnecessary DISTINCT operations, thereby removing a filesort operation. The optimized query is faster and returns almost accurate results.

More complex optimization

Generally speaking, COUNT queries are difficult to optimize because they usually need to count many rows (access a lot of data). Another alternative in MySQL is to use covering indexes. If that is not enough, the entire system application architecture may need to be adjusted. For example, consider statistical data tables, or use an external caching system (such as Memcached). We often face a similar dilemma: fast, accurate and simple - you can only choose two!

The above is the detailed content of the use and optimization of MySQL COUNT function. For more information about the use and optimization of MySQL COUNT, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:

Optimized implementation of count() for large MySQL tables
Usage and performance optimization techniques of aggregate function count in MySQL
Sharing on count optimization issues in innodb in mysql
Let's talk about the performance of MySQL's COUNT(*)
Detailed explanation of the correct use of the count function in MySQL
A brief discussion on MySQL count of rows
Summary of methods to improve mysql count
Detailed explanation of count without filter conditions in MySQL
Summary of the differences between count(*), count(1) and count(col) in MySQL
Detailed explanation of mySQL count data examples in multiple tables

<<: Vue implements image drag and drop function

>>: How to deploy MySQL master and slave in Docker