Details of the underlying data structure of MySQL indexes

1. Index Type

1. B+ Tree
2. What are the differences between MyISAM and InnoDB's B+ tree index implementations (clustered index and non-clustered index)?
3. Non-clustered index
4. Advantages and disadvantages of clustered index
5. Hash Index
6. Adaptive Hash Index

1. Index Type

1. B+ Tree

Why B+ tree instead of B tree?

First, let's look at the structural differences between B-tree and B+ tree.

B-tree structure:

B+ Tree:

You can see:

The B-tree has satellite data (a row of data in the data table) on each node, while the B+ tree only has satellite data on leaf nodes. This means that for disk sectors of the same size, the B+ tree can store more leaf nodes and require fewer disk IO times; it also means that the search efficiency of the B+ tree is more stable, and the fastest time complexity of B-tree data query is O(1).
Each node of a B-tree appears only once, and all nodes of a B+ tree appear in leaf nodes. All leaf nodes of the B+ tree form an ascending linked list, which is suitable for interval range search, while the B tree is not suitable.

2. What are the differences between MyISAM and InnoDB's B+ tree index implementations (clustered index and non-clustered index)?

First you need to understand clustered indexes and non-clustered indexes.

Clustered index:

In a clustered index, the leaf pages contain all the data for the row, and the node pages contain the index columns. InnoDB clusters data by primary key. If no primary key is defined, a unique non-empty index column is selected instead. If there is no such index, InnoDB implicitly defines a primary key as the clustered index.

Data distribution of clustered index:

In a clustered index, in addition to the primary key index, there is also a secondary index. The leaf nodes in the secondary index do not store "row pointers" but primary key values, which are used as "pointers" to the rows. This means that when searching for a row through a secondary index, the storage engine needs to find the leaf node of the secondary index to obtain the corresponding primary key value, and then search for the corresponding row in the clustered index based on this value, which is also called "back to the table". Of course, you can avoid table repetition by using covering indexes or InnoDB 's adaptive indexes to reduce such repetitive work.

Note : Each leaf node in a clustered index contains not only the complete data row, but also the transaction ID, rollback pointer for transactions and MVCC.

3. Non-clustered index

The primary key index and secondary index of a non-clustered index are no different in structure, both of which store "row pointers" pointing to the physical address of the data on the leaf nodes.

Primary key index and secondary index of clustered index:

Primary key index and secondary index of non-clustered index:

4. Advantages and disadvantages of clustered index

advantage:

Store related data together (for example, group all the user's emails together by user ID), otherwise each data read may cause a disk IO
Faster data access. Store indexes and data in the same B+ tree. It is usually faster to retrieve data from a clustered index than from a non-clustered index. Using a covering query, you can directly use the primary key value in the page node.

shortcoming:

If all data can be stored in memory, sequential access is no longer necessary, and clustered indexes have no advantage. Insertion speed depends on the insertion order. Random insertion can cause page splits and holes. Use OPTIMIZE TABLE to rebuild the table. Every insertion, update, and deletion requires maintenance of index changes, which is very expensive. Secondary indexes may be larger than expected because the primary key columns of the referenced rows are included in the node.

5. Hash Index

Hash indexes are implemented based on hash tables. Only queries that exactly match all columns of the index are valid, which means that hash indexes are suitable for equal value queries.

Specific implementation: For each row of data, the storage engine calculates a hash code for all index columns. The hash index stores all hash codes in the index and saves a pointer to each data row in the hash table.

In MySQL, only the Memory engine explicitly supports hash indexes, although Memory engine also supports B-tree indexes.

Note: The Memory engine supports non-unique hash indexes. The way to resolve conflicts is to store multiple record pointers with the same hash value in the form of a linked list.

6. Adaptive Hash Index

When InnoDB notices that certain index values are used very frequently, it creates a hash index based on the B+ tree index in memory, so that the B+ tree index also has some advantages of the hash index, such as fast hash search.

This is the end of this article about the details of the underlying data structure of MySQL indexes. For more information about the underlying data structure of MySQL indexes, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:

How to construct a table index in MySQL
How to maintain MySQL indexes and data tables
Detailed introduction to MySQL database index
Detailed explanation of MySQL database index
MySQL Data Optimization - Multi-layer Index
MySQL Database Indexes and Transactions
Detailed explanation of the principles of indexing MySQL tables

<<: How to solve the problem of not getting form value after submitting html form input using disabled

>>: Solution for using Baidu share on Https page

Details of the underlying data structure of MySQL indexes

Table of contents

1. Index Type

1. B+ Tree

2. What are the differences between MyISAM and InnoDB's B+ tree index implementations (clustered index and non-clustered index)?

3. Non-clustered index

4. Advantages and disadvantages of clustered index

5. Hash Index

6. Adaptive Hash Index

Detailed Linux installation tutorial

Sample code using scss in uni-app

How to mount a disk in Linux and set it to automatically mount on boot

Detailed steps to use Arthas in a Docker container

Methods of adaptive web design (good access experience on mobile phones)

Javascript basics about built-in objects

Free tool to verify that HTML, CSS and RSS feeds are correct

JavaScript Regular Expressions Explained

A graphic tutorial on how to install redhat 8.0 system (a must-have for beginners)

mysql8.0 forgotten password modification and net command service name invalid problem

Recommend

Windows Server 2016 Quick Start Guide to Deploy Remote Desktop Services

Several ways to submit HTML forms_PowerNode Java Academy

How to pass the value of the select drop-down box to the id to implement the code

About Docker security Docker-TLS encrypted communication issues

MySql Group By implements grouping of multiple fields

Detailed explanation of how to configure secondary domain name on Apache server under Linux environment

Solve the cross-domain problem of get and post requests of vue $http

Web design experience: Make the navigation system thin

The most complete package.json analysis

Use nginx to dynamically convert image sizes to generate thumbnails

How to use environment variables in nginx configuration file

Parsing Apache Avro Data in One Article

JavaScript implementation of the back to top button example

Detailed explanation of Tomcat configuration and optimization solutions

Share some key interview questions about MySQL index