MySQL database basics - Join operation principle

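MySQL executes Join with the Nested-Loop Join algorithm, which comes in three variants: Simple Nested-Loop Join, Block Nested-Loop Join, and Index Nested-Loop Join. The following query is used as the example throughout.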

select * from t1 join t2 on t1.a = t2.a;
-- t1 has 100 rows, t2 has 1000 rows

Simple Nested-Loop Join

t1 is the driving table and is scanned in full. For each row in t1, the entire t2 table is scanned to look for matching rows, so the join performs 100 * 1000 comparisons.

These full scans of t2 are not guaranteed to be served from memory: the pages may already have been evicted from the Buffer Pool, in which case they have to be read from disk again.
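The scan pattern described above can be sketched in a few lines of Python. This is only an illustration, not MySQL's implementation: the tables are modelled as in-memory lists of dicts, and the function name and sample data are assumptions made up for the example.

```python
def simple_nested_loop_join(t1, t2, key="a"):
    """Join two lists of row dicts by comparing every pair of rows."""
    comparisons = 0
    result = []
    for r1 in t1:          # driving table: scanned once
        for r2 in t2:      # driven table: scanned in full for every row of t1
            comparisons += 1
            if r1[key] == r2[key]:
                result.append((r1, r2))
    return result, comparisons


# Hypothetical data: t1 has 100 rows, t2 has 1000 rows.
t1 = [{"id": i, "a": i} for i in range(1, 101)]
t2 = [{"id": i, "a": i} for i in range(1, 1001)]

snlj_rows, snlj_comparisons = simple_nested_loop_join(t1, t2)
print(len(snlj_rows), snlj_comparisons)  # 100 100000
```

The inner loop is the expensive part: in a real database every pass over t2 is a full table scan that may touch the disk.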

Block Nested-Loop Join (used by MySQL when the join on the driven table cannot use an index)

MySQL scans all of t1 and loads its rows into join_buffer, then scans all of t2 and compares each row of t2 against the t1 rows cached in join_buffer.

Rows scanned in t1 (one full table scan) = 100

Rows scanned in t2 (one full table scan) = 1000

Total rows scanned = 1100

Comparisons in join_buffer = 100 * 1000

The number of comparisons is the same as in Simple Nested-Loop Join, but the comparisons are made in memory against join_buffer and t2 is scanned only once, so Block Nested-Loop Join performs much better.

join_buffer has a limited size (set by join_buffer_size). If the rows read from t1 do not fit, only part of t1 is loaded first. After that part has been compared against a full scan of t2, join_buffer is cleared and the next part of t1 is loaded. This is repeated until all of t1 has been processed.

t1 is still scanned only once in total, just as when everything fits into join_buffer in a single pass, but t2 is scanned once per segment, so the number of t2 scans is multiplied by the number of segments.
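The segmented behaviour can be made concrete with a small Python sketch. It is a simplified model under stated assumptions: the capacity of join_buffer is expressed as a number of rows (`buffer_rows`, a hypothetical parameter), whereas the real buffer is sized in bytes, and the tables are plain lists of dicts.

```python
def block_nested_loop_join(t1, t2, buffer_rows, key="a"):
    """Join row dicts, buffering the driving table in chunks of buffer_rows."""
    stats = {"t1_rows_scanned": 0, "t2_scans": 0,
             "t2_rows_scanned": 0, "comparisons": 0}
    result = []
    for start in range(0, len(t1), buffer_rows):
        join_buffer = t1[start:start + buffer_rows]  # load one segment of t1
        stats["t1_rows_scanned"] += len(join_buffer)
        stats["t2_scans"] += 1
        for r2 in t2:                                # one full scan of t2 per segment
            stats["t2_rows_scanned"] += 1
            for r1 in join_buffer:                   # in-memory comparisons
                stats["comparisons"] += 1
                if r1[key] == r2[key]:
                    result.append((r1, r2))
        # the buffer is discarded here and refilled with the next segment
    return result, stats


# Hypothetical data: t1 has 100 rows, t2 has 1000 rows.
t1 = [{"id": i, "a": i} for i in range(1, 101)]
t2 = [{"id": i, "a": i} for i in range(1, 1001)]

rows_fit, stats_fit = block_nested_loop_join(t1, t2, buffer_rows=100)
rows_seg, stats_seg = block_nested_loop_join(t1, t2, buffer_rows=40)
print(stats_fit["t2_scans"], stats_fit["t2_rows_scanned"], stats_fit["comparisons"])  # 1 1000 100000
print(stats_seg["t2_scans"], stats_seg["t2_rows_scanned"], stats_seg["comparisons"])  # 3 3000 100000
```

With a buffer of 40 rows, t1 is split into three segments (40, 40, and 20 rows), so t2 is scanned three times while the number of comparisons stays at 100 * 1000.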

Assume the driving table has N rows and has to be split into K segments to complete the algorithm, and the driven table has M rows.

K = λ * N, where 0 < λ <= 1

Rows scanned in the driven table = K * M = λ * N * M

Total rows scanned = N + λ * N * M

λ depends on the size of join_buffer: the larger join_buffer is, the smaller λ and K are. When join_buffer is large enough to hold the whole driving table (K = 1), it makes no difference whether the large table or the small table drives the join.

When segmentation is required, the smaller table should be the driving table: it needs fewer segments, so the driven table is scanned fewer times.
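A quick calculation shows why the choice of driving table only matters once segmentation kicks in. The helper below is a hypothetical sketch: it assumes join_buffer can hold a fixed number of rows (`buffer_rows`), so that K is the row count of the driving table divided by that capacity, rounded up.

```python
import math


def bnl_rows_scanned(n_driving, m_driven, buffer_rows):
    """Total rows scanned by Block Nested-Loop Join: N + K * M."""
    k = math.ceil(n_driving / buffer_rows)  # number of segments of the driving table
    return n_driving + k * m_driven


# Buffer too small: the driving table must be split into segments.
small_drives = bnl_rows_scanned(100, 1000, buffer_rows=50)   # K = 2
large_drives = bnl_rows_scanned(1000, 100, buffer_rows=50)   # K = 20
print(small_drives, large_drives)  # 2100 3000

# Buffer large enough for either table: K = 1 both ways.
print(bnl_rows_scanned(100, 1000, buffer_rows=1000),
      bnl_rows_scanned(1000, 100, buffer_rows=1000))  # 1100 1100
```

With the small table driving, fewer segments are needed and fewer rows are scanned overall. Once the buffer holds the whole driving table, both orders scan N + M rows.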

Index Nested-Loop Join (used by MySQL when the join on the driven table can use an index)

Take the SQL above as an example again, this time with an index on column a of t2.

t1 is scanned in full, and for each row in t1 the index on t2.a is searched. Once the primary key ID is found there, the row is fetched from t2 by primary key (a back-to-table lookup). If the join column is the primary key of t2, this second lookup is not needed.
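The two-step lookup can be imitated in Python with sorted lists and binary search standing in for the B+ trees. This is a rough model, not how InnoDB stores data: `secondary` plays the role of the index on t2.a, `clustered` plays the role of the primary key index, and all names and data are assumptions for illustration.

```python
from bisect import bisect_left


def index_nested_loop_join(t1, t2, key="a"):
    """Join row dicts by searching an index on t2[key] for every row of t1."""
    # Stand-in for the secondary index on t2.a: sorted (a, id) pairs.
    secondary = sorted((row[key], row["id"]) for row in t2)
    # Stand-in for the primary key index: rows sorted by id.
    clustered = sorted(t2, key=lambda row: row["id"])
    clustered_ids = [row["id"] for row in clustered]

    result = []
    index_searches = 0
    for r1 in t1:                                        # full scan of the driving table
        index_searches += 1
        pos = bisect_left(secondary, (r1[key],))         # search the index on t2.a
        while pos < len(secondary) and secondary[pos][0] == r1[key]:
            row_id = secondary[pos][1]
            r2 = clustered[bisect_left(clustered_ids, row_id)]  # fetch the row by primary key
            result.append((r1, r2))
            pos += 1
    return result, index_searches


# Hypothetical data: t1 has 100 rows, t2 has 1000 rows.
t1 = [{"id": i, "a": i} for i in range(1, 101)]
t2 = [{"id": i, "a": i} for i in range(1, 1001)]

inlj_rows, inlj_searches = index_nested_loop_join(t1, t2)
print(len(inlj_rows), inlj_searches)  # 100 100
```

Unlike the two previous variants, t2 is never scanned in full: each row of t1 triggers one search of the index and one primary key lookup per match.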

Rows scanned in t1 (full table scan) = 100

Index search in t2, per row of t1 ≈ log2(1000)

Primary key lookup in t2, per row of t1 ≈ log2(1000)

Assume that the number of data rows in the driving table is N, and the number of data rows in the driven table is M.

Total cost ≈ N + N * 2 * log2(M)

As the formula shows, N affects the cost far more than M does: the more rows the driving table has, the more lookups are made, so the small table should be used as the driving table.
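Plugging the example sizes into the formula makes the difference visible. The function below is just the approximate cost model from this section written out in Python; the function name is an assumption, and the result is a relative estimate rather than a measured cost.

```python
import math


def inlj_cost(n_driving, m_driven):
    """Approximate cost of Index Nested-Loop Join: N + N * 2 * log2(M)."""
    return n_driving + n_driving * 2 * math.log2(m_driven)


small_table_drives = inlj_cost(100, 1000)   # t1 (100 rows) drives t2 (1000 rows)
large_table_drives = inlj_cost(1000, 100)   # t2 (1000 rows) drives t1 (100 rows)
print(round(small_table_drives), round(large_table_drives))  # 2093 14288
```

Swapping the roles of the two tables makes the estimate several times larger, because N multiplies the whole lookup term while M only appears inside the logarithm.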

This article is based on "MySQL Practical 45 Lectures--Lecture 34".
