Create a new configuration file in the conf directory under the nginx installation directory (for example, agent_deny.conf) with the following contents:

#forbid crawling by tools such as Scrapy
if ($http_user_agent ~* (Scrapy|Curl|HttpClient)) {
    return 403;
}

#forbid access with the listed user agents or an empty user agent
if ($http_user_agent ~ "Bytespider|FeedDemon|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|Feedly|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|YisouSpider|HttpClient|MJ12bot|heritrix|EasouSpider|Ezooms|^$") {
    return 403;
}

#forbid request methods other than GET|HEAD|POST
if ($request_method !~ ^(GET|HEAD|POST)$) {
    return 403;
}

Then insert the following line into the server section of the website configuration (a full server block sketch is shown after the curl tests below):

include agent_deny.conf;

Restart nginx:

/data/nginx/sbin/nginx -s reload

The configuration can be tested by using curl -A to simulate a crawler. Simulate YYSpider:

curl -I -A 'YYSpider' www.xxx.cn

YYSpider is in the blocked list, so the request is rejected with 403 Forbidden.
Simulate a crawl with an empty user agent:

curl -I -A '' www.xxx.cn

The empty user agent matches the ^$ pattern, so this request is also rejected with 403 Forbidden.
Simulate a crawl by the Baidu spider:

curl -I -A 'Baiduspider' www.xxx.cn

Baiduspider is not in the blocked list, so the request is allowed through.
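For context, this is roughly what a complete server block looks like once the include has been added. This is only a minimal sketch: the server_name, root, and location here are placeholders rather than values from the article, and it assumes agent_deny.conf sits in the same conf directory as nginx.conf so it can be included by its relative name.

server {
    listen 80;
    server_name www.xxx.cn;
    root /usr/share/nginx/html;

    # reject unwanted crawlers and disallowed methods before normal processing
    include agent_deny.conf;

    location / {
        index index.html;
    }
}

Before reloading, the syntax can be checked with /data/nginx/sbin/nginx -t.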
Reference table of user agent types:

UA | Type
FeedDemon | content collection
BOT/0.1 (BOT for JCE) | sql injection
CrawlDaddy | sql injection
Java | content collection
Jullo | content collection
Feedly | content collection
UniversalFeedParser | content collection
ApacheBench | cc attack
Swiftbot | useless crawler
YandexBot | useless crawler
AhrefsBot | useless crawler
YisouSpider | useless crawler (has been acquired by UC Shenma Search; this spider can be allowed)
jikeSpider | useless crawler
MJ12bot | useless crawler
ZmEu | phpmyadmin vulnerability scanning
WinHttp | content collection, cc attack
EasouSpider | useless crawler
HttpClient | tcp attack
Microsoft URL Control | scanning
YYSpider | useless crawler
jaunty | WordPress brute-force scanner
oBot | useless crawler
Python-urllib | content collection
Indy Library | scanning
FlightDeckReports Bot | useless crawler
Linguee Bot | useless crawler

Nginx anti-hotlink configuration

Background: To prevent third-party sites from linking directly to our images and consuming our server resources and network traffic, we can set up anti-hotlink restrictions on the server.

Referer-based anti-hotlinking

Working module: ngx_http_referer_module.
Variable: $invalid_referer, a global variable.
Configuration context: server, location.

Configuration:

server {
    listen 80;
    server_name www.imcati.com refer-test.imcati.com;
    root /usr/share/nginx/html;
    location ~* \.(gif|jpg|jpeg|png|bmp|swf)$ {
        valid_referers none blocked www.imcati.com;
        if ($invalid_referer) {
            return 403;
        }
    }
}

In valid_referers, "none" matches requests that carry no Referer header, "blocked" matches requests whose Referer header is present but has had its scheme stripped by a firewall or proxy, and www.imcati.com matches requests referred from our own site; anything else sets $invalid_referer and is answered with 403.
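The hotlink protection can be verified with curl's --referer (-e) option. A quick sketch, where /test.jpg is just a hypothetical image path on the server and other-site.example stands in for a third-party site:

# no Referer header: matches "none" in valid_referers, so the image is served
curl -I http://www.imcati.com/test.jpg

# Referer from our own site: matches www.imcati.com, so the image is served
curl -I -e "http://www.imcati.com/page.html" http://www.imcati.com/test.jpg

# Referer from another site: $invalid_referer is set and nginx returns 403
curl -I -e "http://other-site.example/page.html" http://www.imcati.com/test.jpg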
This is the end of this article about nginx anti-hotlink and anti-crawler configuration. For more related nginx anti-hotlink and anti-crawler content, please search for previous articles on 123WORDPRESS.COM. I hope everyone will support 123WORDPRESS.COM in the future!