Detailed example of removing duplicate data in MySQL

Detailed example of removing duplicate data in MySQL

Detailed example of removing duplicate data in MySQL

There are two meanings of duplicate records: one is a completely duplicate record, that is, all fields are repeated; the other is a record with some fields repeated. The first type of duplication is relatively easy to solve. You only need to use the distinct keyword in the query statement to remove duplicates. Almost all database systems support the distinct operation. The main reason for this duplication is poor table design, which can be avoided by adding a primary key or unique index column to the table.

select distinct * from t;

For the second type of duplicate problem, it is usually required to query any one of the duplicate records. Assume that table t has three fields: id, name, and address. id is the primary key. The repeated fields are name and address. It is required to get a unique result set for these two fields.

-- Oracle, MySQL, using correlated subqueries

select * from t t1

 where t1.id =

 (select min(t2.id)

  from t t2

  where t1.name = t2.name and t1.address = t2.address);

-- Hive only supports subqueries in the FROM clause. Subqueries must have names and columns must be unique.

select t1.*

 from t t1,

   (select name, address, min(id) id from t group by name, address) t2

 where t1.id = t2.id;

 

-- You can also use hive's row_number() analysis function

select t.id, t.name, t.address

 from (select id, name, address,

row_number() over (distribute by name, address sort by id) as rn 

     from t) t 

 where t.rn=1;

Thank you for reading, I hope it can help you, thank you for your support of this site!

You may also be interested in:
  • Summary of methods for finding and deleting duplicate data in MySQL tables
  • MySQL query duplicate data (delete duplicate data and keep the one with the smallest id as the only data)
  • MySQL SQL statement to find duplicate data based on one or more fields
  • MySQL study notes on handling duplicate data

<<:  Sample code for implementing login and registration template in Vue

>>:  Detailed steps for quick installation of openshift

Recommend

Attributes and usage of ins and del tags

ins and del were introduced in HTML 4.0 to help au...

Record a slow query event caused by a misjudgment of the online MySQL optimizer

Preface: I received crazy slow query and request ...

A brief discussion on the implementation principle of Webpack4 plugins

Table of contents Preface know Practice makes per...

Docker installs redis 5.0.7 and mounts external configuration and data issues

Redis is an open source NoSQL database written in...

IE6/7 is going to be a mess: empty text node height issue

Preface: Use debugbar to view document code in iet...

A quick review of CSS3 pseudo-class selectors

Preface If CSS is the basic skill of front-end de...

CentOS 6 uses Docker to deploy redis master-slave database operation example

This article describes how to use docker to deplo...

What does the "a" in rgba mean? CSS RGBA Color Guide

RGBA is a CSS color that can set color value and ...

Analysis of the Principle and Method of Implementing Linux Disk Partition

remember: IDE disk: the first disk is hda, the se...

CSS implements Google Material Design text input box style (recommended)

Hello everyone, today I want to share with you ho...

How to connect JDBC to MySQL 5.7

1. First prepare the MySQL and Eclipse environmen...

WeChat applet development chapter: pitfall record

Recently, I participated in the development of th...

mysql8.0.11 winx64 manual installation and configuration tutorial

First of all, let me talk to you about my daily l...

Implementation of Linux command line wildcards and escape characters

If we want to perform batch operations on a type ...