How to find identical files in Linux

As a computer is used, redundant files accumulate on it. A typical case is the same file saved in several different locations, which takes up disk space that could be put to better use.

So if your computer is running out of space, it is worth tracking such files down. On Linux, we can find files that are really one file under several names by comparing their inode values.

An inode is a data structure that records the metadata of a file (owner, permissions, size, timestamps, and the location of its data blocks) but not its name. If two or more names on the same filesystem have the same inode value, then even though the names and locations differ, they share the same contents, owner, and permissions, and we can regard them as the same file.

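Names that share an inode are called "hard links". Hard links have the same inode value but different file names. A soft (symbolic) link, by contrast, works like a shortcut: it points to the target file by path and has its own inode value. Because hard links share a single copy of the data, the disk space is released only when the last name is removed.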

$ ls -l my*
-rw-r--r-- 4 liangxu liangxu 228 Apr 12 19:37 myfile
lrwxrwxrwx 1 liangxu liangxu 6 Apr 15 11:18 myref -> myfile
-rw-r--r-- 4 liangxu liangxu 228 Apr 12 19:37 mytwin
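
The difference between the two kinds of link can be checked directly. The following is a minimal sketch, assuming GNU coreutils (for `stat -c`); the file names original, twin, and ref are made up for the demo and are not part of the listing above.

```shell
#!/bin/bash
# Sketch: compare the inode numbers of a file, a hard link, and a symbolic link.
# The names original, twin, and ref are invented for this demo.

demo_dir=$(mktemp -d) || exit 1          # scratch directory so nothing real is touched

echo "hello" > "$demo_dir/original"
ln "$demo_dir/original" "$demo_dir/twin" # hard link: a second name for the same inode
ln -s original "$demo_dir/ref"           # symbolic link: a new inode that stores a path

# stat -c is GNU coreutils syntax: %i = inode number, %h = hard link count.
# stat does not follow symbolic links unless -L is given.
inode_original=$(stat -c %i "$demo_dir/original")
inode_twin=$(stat -c %i "$demo_dir/twin")
inode_ref=$(stat -c %i "$demo_dir/ref")
links_original=$(stat -c %h "$demo_dir/original")

echo "original: inode $inode_original, links $links_original"
echo "twin:     inode $inode_twin"
echo "ref:      inode $inode_ref"

rm -rf "$demo_dir"                       # clean up the scratch directory
```

The first two inode numbers should match and the link count should be 2, while the symbolic link reports an inode of its own.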

A plain ls -l listing does not tell us which files in a directory share an inode value, but they are not difficult to identify: run ls -i and sort the output by inode value.

$ ls -i | sort -n | more
 ...
 786692 never leave home angry
 788000 myfile <==
 788000 mytwin <==
 800247 nmap-notes
 801865 Name_Labels.pdf
 920242 NFCU_Docs

The first column of this output is the inode value, so we can see at a glance which files share one.
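
In a large directory, scanning the sorted list by eye gets tedious. One possible shortcut, sketched here with standard tools, is to keep only the inode column and let uniq -d report the values that occur more than once. The function name dup_inodes is hypothetical, not a standard command.

```shell
#!/bin/bash
# Sketch: print each inode number that appears more than once in a directory.

# dup_inodes is a hypothetical helper name; it takes a directory (default: .)
dup_inodes() {
  local dir=${1:-.}
  # ls -i prints "inode name" on each line when its output is piped.
  # Keep the inode column, sort it, and let uniq -d report repeated values.
  ls -i "$dir" | awk '{print $1}' | sort -n | uniq -d
}

dup_inodes .
```

Each number printed can then be passed to find with -inum, or looked up in the ls -i listing, to see which names it belongs to.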

If you just want to find the other hard links to one particular file, use the find command with the -samefile test.

$ find . -samefile myfile
./myfile
./save/mycopy
./mytwin

These files all have the same inode value. To confirm this, add -ls to display more information:

$ find . -samefile myfile -ls
 788000 4 -rw-r--r-- 4 liangxu liangxu 228 Apr 12 19:37 ./myfile
 788000 4 -rw-r--r-- 4 liangxu liangxu 228 Apr 12 19:37 ./save/mycopy
 788000 4 -rw-r--r-- 4 liangxu liangxu 228 Apr 12 19:37 ./mytwin

We can see that, except for the file names, the information for these files is exactly the same. Careful readers may notice that the link count (the fourth column) is 4, yet we found only 3 files. A fourth name shares this inode, but it is not under the directory we searched, so this command did not find it.
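
The same comparison can be automated. The sketch below, assuming GNU stat and GNU find, subtracts the number of names found under a directory from the link count stored in the inode. The helper name missing_links is hypothetical.

```shell
#!/bin/bash
# Sketch: how many hard links to a file lie outside the directory being searched?

# missing_links is a hypothetical helper: file to check, directory to search (default: .)
missing_links() {
  local file=$1 dir=${2:-.}
  local total found
  total=$(stat -c %h "$file")            # link count stored in the inode (GNU stat)
  # Print one dot per matching name and count the dots, which stays
  # correct even for file names that contain newlines (GNU find -printf).
  found=$(find "$dir" -samefile "$file" -printf '.' 2>/dev/null | wc -c)
  echo $(( total - found ))
}
```

For example, `missing_links myfile .` would print the number of names that still have to be looked for elsewhere on the same filesystem.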

Typing these commands every time is tedious, so we can use a script to find the files that share an inode in the current directory.

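#!/bin/bash

# searches the current directory for files sharing inodes

prev=""
printed=""

# list files by inode in a private temporary file
tmpfile=$(mktemp) || exit 1
ls -i | sort -n > "$tmpfile"

# search through the sorted list for duplicate inode numbers
while read -r inode name
do
  if [ "$inode" = "$prev" ] && [ "$inode" != "$printed" ]; then
    # print every entry whose first field is exactly this inode, once per group
    awk -v i="$inode" '$1 == i' "$tmpfile"
    printed=$inode
  fi
  prev=$inode
done < "$tmpfile"

# clean up
rm -f "$tmpfile"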

Running the script gives:

$ ./findHardLinks
 788000 myfile
 788000 mytwin
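
The script above only looks at the current directory. A recursive variant can be sketched with find alone, assuming GNU find (for -printf). The function name find_hard_links is hypothetical.

```shell
#!/bin/bash
# Sketch: list hard-linked regular files under a directory tree, grouped by inode.

# find_hard_links is a hypothetical helper name; it takes a directory (default: .)
find_hard_links() {
  local dir=${1:-.}
  # -xdev       stay on one filesystem, where inode numbers are unique
  # -type f     regular files only
  # -links +1   files that have more than one name
  # -printf     print "inode path" for each match (GNU find)
  find "$dir" -xdev -type f -links +1 -printf '%i %p\n' 2>/dev/null | sort -n
}
```

Sorting numerically puts names that share an inode on adjacent lines. A file whose other names all lie outside the tree still appears, on a line by itself.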

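Of course, you can also use the find command to search the whole system for every name that has a given inode value. Keep in mind that inode numbers are unique only within one filesystem, so a match on another mounted filesystem would be an unrelated file.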

$ find / -inum 788000 -ls 2> /dev/null
 788000 4 -rw-r--r-- 4 liangxu liangxu 228 Apr 12 19:37 /tmp/mycopy
 788000 4 -rw-r--r-- 4 liangxu liangxu 228 Apr 12 19:37 /home/liangxu/myfile
 788000 4 -rw-r--r-- 4 liangxu liangxu 228 Apr 12 19:37 /home/liangxu/save/mycopy
 788000 4 -rw-r--r-- 4 liangxu liangxu 228 Apr 12 19:37 /home/liangxu/mytwin

In this command, we redirect error messages to the special file /dev/null so that the screen is not filled with "Permission denied" messages for paths we are not allowed to read.
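
To rule out matches from other mounted filesystems, the lookup can be limited with -xdev. The sketch below assumes GNU stat (%i for the inode number, %m for the mount point). The helper name names_by_inode is hypothetical, and any start directory passed to it is assumed to be on the same filesystem as the file.

```shell
#!/bin/bash
# Sketch: list every name of a file by inode number without leaving its filesystem.

# names_by_inode is a hypothetical helper: file to look up, optional start directory
names_by_inode() {
  local file=$1
  # Default start: the mount point of the filesystem that holds the file (GNU stat %m)
  local start=${2:-$(stat -c %m "$file")}
  local inode
  inode=$(stat -c %i "$file")
  # -xdev keeps find on the filesystem of the start directory
  find "$start" -xdev -inum "$inode" 2>/dev/null
}
```

For example, `names_by_inode myfile` would search the filesystem that holds myfile, starting at its mount point.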
