MySQL's method of dealing with duplicate data (preventing and deleting)

MySQL's method of dealing with duplicate data (preventing and deleting)

Some MySQL tables may contain duplicate records. In some cases, we allow duplicate data to exist, but sometimes we also need to delete the duplicate data.

In this chapter, we will introduce how to prevent duplicate data from appearing in data tables and how to delete duplicate data in data tables.

Preventing duplicate data from appearing in the table

You can set a specified field in a MySQL table as a PRIMARY KEY or UNIQUE index to ensure the uniqueness of the data.
Let's try an example: There is no index or primary key in the following table, so multiple duplicate records are allowed in the table.

CREATE TABLE person_tbl
(
 first_name CHAR(20),
 last_name CHAR(20),
 sex CHAR(10)
);

If you want to set the data of the first_name and last_name fields in the table to not be repeated, you can set a dual primary key mode to set the uniqueness of the data. If you set a dual primary key, the default value of that key cannot be NULL and can be set to NOT NULL. As shown below:

CREATE TABLE person_tbl
(
 first_name CHAR(20) NOT NULL,
 last_name CHAR(20) NOT NULL,
 sex CHAR(10),
 PRIMARY KEY (last_name, first_name)
);

If we set a unique index, the SQL statement will not be executed successfully when inserting duplicate data and an error will be thrown.

The difference between INSERT IGNORE INTO and INSERT INTO is that INSERT IGNORE will ignore the data that already exists in the database. If there is no data in the database, new data will be inserted. If there is data, the data will be skipped. In this way, the existing data in the database can be retained, achieving the purpose of inserting data in the gap.

The following example uses INSERT IGNORE INTO. No error occurs after execution, and no duplicate data is inserted into the data table:

mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
 -> VALUES('Jay', 'Thomas');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
 -> VALUES('Jay', 'Thomas');
Query OK, 0 rows affected (0.00 sec)

INSERT IGNORE INTO When inserting data, after setting the uniqueness of the record, if duplicate data is inserted, no error will be returned, only a warning will be returned. If REPLACE INTO has a record with the same primary or unique, it will be deleted first. Insert new records again.

Another way to set uniqueness on your data is to add a UNIQUE index, as shown below:

CREATE TABLE person_tbl
(
 first_name CHAR(20) NOT NULL,
 last_name CHAR(20) NOT NULL,
 sex CHAR(10),
 UNIQUE (last_name, first_name)
);

Counting duplicate data

Below we will count the number of duplicate records of first_name and last_name in the table:

mysql> SELECT COUNT(*) as repetitions, last_name, first_name
 -> FROM person_tbl
 -> GROUP BY last_name, first_name
 -> HAVING repetitions > 1;

The above query will return the number of duplicate records in the person_tbl table. In general, to query for duplicate values, do the following:

  • Determines which column contains values ​​that may be repeated.
  • Use COUNT(*) in the column selection list to list those columns.
  • The columns listed in the GROUP BY clause.
  • The HAVING clause sets the number of repetitions to greater than 1.

Filtering Duplicate Data

If you need to read non-duplicate data, you can use the DISTINCT keyword in the SELECT statement to filter out duplicate data.

mysql> SELECT DISTINCT last_name, first_name
 -> FROM person_tbl;

You can also use GROUP BY to read unique data in the table:

mysql> SELECT last_name, first_name
 -> FROM person_tbl
 -> GROUP BY (last_name, first_name);

Deduplication

If you want to delete duplicate data in a table, you can use the following SQL statement:

mysql> CREATE TABLE tmp SELECT last_name, first_name, sex FROM person_tbl GROUP BY (last_name, first_name, sex);
mysql> DROP TABLE person_tbl;
mysql> ALTER TABLE tmp RENAME TO person_tbl;

Of course, you can also add INDEX (index) and PRIMAY KEY (primary key) in the data table to delete duplicate records in the table. Here’s how:

mysql> ALTER IGNORE TABLE person_tbl
 -> ADD PRIMARY KEY (last_name, first_name);

The above is the details of MySQL's method of handling duplicate data (prevention and deletion). For more information about MySQL's handling of duplicate data, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • A practical record of checking and processing duplicate MySQL records on site
  • MySQL study notes on handling duplicate data
  • How to handle concurrent updates of MySQL data
  • Detailed explanation of MySQL execution principle, logical layering, and changing database processing engine
  • Some methods to optimize query speed when MySQL processes massive data
  • MySQL data processing sorting and explaining the operations of adding, deleting and modifying

<<:  How to View All Running Processes in Linux

>>:  How to use the yum command

Recommend

JavaScript setTimeout and setTimeinterval use cases explained

Both methods can be used to execute a piece of ja...

How to mount the CD to find the rpm package under Linux

Written in front Sometimes you need to install so...

Detailed explanation of MySql installation and login

Check if MySQL is already installed in Linux sudo...

Specific method of viewing user authorization information in mysql

Specific method: 1. Open Command Prompt 2. Enter ...

Element-ui's built-in two remote search (fuzzy query) usage explanation

Problem Description There is a type of query call...

MySQL Server 8.0.13.0 Installation Tutorial with Pictures and Text

Install 8.0.13 based on MySQL 6.1.3. MySQL 8.0.13...

Detailed explanation of the spacing problem between img tags

IMG tag basic analysis In HTML5, the img tag has ...

JavaScript implementation of classic snake game

This article shares the specific code of JavaScri...

MySQL 8.0.14 installation and configuration method graphic tutorial

This article records the installation and configu...

Complete steps to achieve high availability with nginx combined with keepalived

Preface In order to meet the high availability of...

SQL Aggregation, Grouping, and Sorting

Table of contents 1. Aggregate Query 1. COUNT fun...

Explore JavaScript prototype data sharing and method sharing implementation

Data Sharing What kind of data needs to be writte...

A quick solution to the first login failure in mysql5.7.20

First, we will introduce how (1) MySQL 5.7 has a ...

JavaScript canvas to achieve code rain effect

This article shares the specific code for canvas ...

Detailed explanation of how to use CMD command to operate MySql database

First: Start and stop the mysql service net stop ...