Bots are a fact of life on the internet.

Some are helpful—like search engine crawlers.

Others scrape your data, spam your forms, or brute-force your login pages.

If you’re self-hosting with Nginx, you don’t need a pricey SaaS WAF to stop them.

Here's how to detect and destroy malicious bots using good ol’ Nginx, a few scripts, and some zip-bomb flavor.

1. Start with Logs — Always

Nginx logs tell the full story. Make sure you're capturing User-Agent, IP, and paths.

log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                  '$status $body_bytes_sent "$http_referer" '
                  '"$http_user_agent"';
access_log  /var/log/nginx/access.log  main;

Now dig through logs for patterns:

# Top IPs by request volume
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head

# Suspicious User-Agents
grep -iE 'curl|wget|python|scrapy|bot|crawler|headless' /var/log/nginx/access.log | less
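
If you would rather get both views in one pass, the same idea can be sketched in Python. This is a minimal sketch that assumes your log lines follow the `main` format shown above; the names `LINE_RE`, `SUSPICIOUS_RE`, and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Assumes the `main` log_format above:
# addr - user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)
# Same keyword list as the grep command above, case-insensitive.
SUSPICIOUS_RE = re.compile(r"curl|wget|python|scrapy|bot|crawler|headless", re.I)


def summarize(lines, top=10):
    """Return (top IPs by request count, most common suspicious User-Agents)."""
    ips, agents = Counter(), Counter()
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue  # skip lines that do not fit the expected format
        ips[m["ip"]] += 1
        if SUSPICIOUS_RE.search(m["agent"]):
            agents[m["agent"]] += 1
    return ips.most_common(top), agents.most_common(top)
```

To try it on a real log, pass an open file handle, for example `summarize(open("/var/log/nginx/access.log"))`.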

Want real-time views? Try GoAccess for a terminal dashboard.

2. Identify Suspicious Behavior

Things that scream “bot”:

  • Blank or obviously fake User-Agent headers
  • High request volume from a single IP
  • Frequent hits to /wp-login.php, /xmlrpc.php, /admin, or random paths
  • Unusual Referer headers or none at all
  • Crawlers hitting endpoints that no normal user would
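
These signals can be combined into a rough score per request. The sketch below is only an illustration: the weights, the `volume_threshold` default, and the function name `suspicion_score` are arbitrary assumptions, so tune them against your own traffic.

```python
# Paths from the list above that normal visitors rarely request.
PROBE_PATHS = ("/wp-login.php", "/xmlrpc.php", "/admin")


def suspicion_score(ip, path, user_agent, referer, hits_per_ip, volume_threshold=100):
    """Add up simple bot signals; a higher score means more bot-like."""
    score = 0
    if not user_agent or user_agent == "-":
        score += 2  # blank User-Agent (Nginx logs an empty header as "-")
    if path.startswith(PROBE_PATHS):
        score += 2  # probing well-known login or admin endpoints
    if not referer or referer == "-":
        score += 1  # a missing Referer is weak evidence on its own
    if hits_per_ip.get(ip, 0) > volume_threshold:
        score += 2  # unusually high request volume from one IP
    return score
```

A missing Referer gets a lower weight here because plenty of legitimate visits (bookmarks, typed URLs) have none either.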

Bonus: check your logs against public bot signature lists like MitchellKrogza’s bad bot list.

3. Block the Obvious Stuff with Nginx

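Create a quick and dirty User-Agent filter. The map block goes in the http context. Note that the broad bot pattern also matches helpful crawlers such as Googlebot, so carve out exceptions for any you want to keep: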

map $http_user_agent $bad_bot {
    default 0;
    ~*(curl|wget|python|scrapy|bot|Go-http-client) 1;
}

server {
    if ($bad_bot) {
        return 403;
    }
}
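
Before deploying a User-Agent pattern, it is worth checking what it actually catches. Here is a small Python sketch that approximates the case-insensitive `~*` match with `re.IGNORECASE`; the helper name `is_bad_bot` and the sample strings are assumptions for illustration.

```python
import re

# Same alternation as the map block above; ~* in Nginx is case-insensitive.
BAD_BOT_RE = re.compile(r"(curl|wget|python|scrapy|bot|Go-http-client)", re.IGNORECASE)


def is_bad_bot(user_agent):
    """Return True if the pattern matches anywhere in the User-Agent string."""
    return BAD_BOT_RE.search(user_agent) is not None


samples = [
    "curl/8.5.0",
    "python-requests/2.31.0",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
]
for ua in samples:
    print(is_bad_bot(ua), ua)
```

The Googlebot sample is flagged because its name contains `bot`, which is worth keeping in mind if search traffic matters to you.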

And rate limit abusive IPs (limit_req_zone goes in the http context):

limit_req_zone $binary_remote_addr zone=abusers:10m rate=5r/s;

server {
    location / {
        limit_req zone=abusers burst=10 nodelay;
        ...
    }
}
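
If the `burst` and `nodelay` knobs feel abstract, here is a rough Python model of the leaky-bucket accounting behind `limit_req`. It is a simplification for intuition, not Nginx's actual code; the class and method names are made up for this sketch.

```python
class LeakyBucket:
    """Simplified model of limit_req with nodelay for a single client key."""

    def __init__(self, rate, burst):
        self.rate = rate    # requests drained per second (rate=5r/s -> 5)
        self.burst = burst  # extra requests tolerated above the rate
        self.excess = 0.0   # how far this client is currently over the rate
        self.last = None    # timestamp of the last accepted request

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) passes."""
        if self.last is None:
            self.last = now  # the first request from a client always passes
            return True
        drained = self.rate * (now - self.last)
        excess = max(self.excess - drained + 1.0, 0.0)
        if excess > self.burst:
            return False  # over the burst allowance: rejected (503 by default)
        self.excess = excess
        self.last = now
        return True


bucket = LeakyBucket(rate=5, burst=10)
first_wave = sum(bucket.allow(0.0) for _ in range(15))   # 15 requests at once
second_wave = sum(bucket.allow(1.0) for _ in range(15))  # 15 more, one second later
print(first_wave, second_wave)
```

In this model, 11 of the first 15 simultaneous requests pass (one plus the burst of 10), and one second later only 5 more get through, because that is how much the bucket drained at 5r/s.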

Also check out the Nginx rate limiting docs.

4. Use Fail2Ban to Auto-Ban IPs

Install Fail2Ban and wire it to your Nginx logs:

Jail config (/etc/fail2ban/jail.local):

[nginx-badbots]
enabled  = true
filter   = nginx-badbots
logpath  = /var/log/nginx/access.log
maxretry = 5
findtime = 600
bantime  = 3600

Filter (/etc/fail2ban/filter.d/nginx-badbots.conf):

[Definition]
failregex = ^<HOST> -.*"(GET|POST).*HTTP.*"(curl|wget|python|scrapy|bot|Go-http-client)
ignoreregex =

Once this is running, any IP that matches the filter 5 times (maxretry) within 10 minutes (findtime = 600) is banned for an hour (bantime = 3600).
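
Before trusting a filter with bans, it helps to check what it matches. Fail2Ban ships a `fail2ban-regex` command for testing a filter file against a log. As a rough offline approximation, the Python sketch below stands in a simple named group for the `<HOST>` tag that a failregex uses to mark the client address (the real tag expands to a stricter address pattern). The helper name `offending_host` and the sample lines are assumptions.

```python
import re

# Approximation of the filter: <HOST> replaced with a loose named group.
FAILREGEX = re.compile(
    r'^(?P<host>\S+) -.*"(GET|POST).*HTTP.*"'
    r'(curl|wget|python|scrapy|bot|Go-http-client)'
)


def offending_host(log_line):
    """Return the client address if the line matches the filter, else None."""
    m = FAILREGEX.search(log_line)
    return m.group("host") if m else None


bad = ('203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] '
       '"GET /wp-login.php HTTP/1.1" 404 153 "-" "curl/8.5.0"')
good = ('198.51.100.2 - - [10/Oct/2024:13:55:38 +0000] '
        '"GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (X11; Linux x86_64)"')
print(offending_host(bad), offending_host(good))
```

Note that this pattern is case-sensitive and only fires when a quoted field starts with one of the keywords, so it is narrower than the Nginx map from step 3.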

5. Use Better Tools for Smarter Bots

If you're seeing more sophisticated attacks, try:

  • CrowdSec: Open-source tool that shares a dynamic IP reputation list and applies bans
  • ModSecurity: Full WAF, works with Nginx
  • OpenResty: Extend Nginx with Lua scripting (e.g., custom captcha, behavior analysis)

If you’re open to a proxy layer:

  • Cloudflare free tier: Blocks a lot of trash automatically
  • Fastly Bot Protection: Advanced but paid

Bonus: Serve Zip Bombs to Dumb Bots (⚠️ Handle with care)

This blog post by Idiallo shows how he turned bot detection into punishment.

The method? Serve them a compressed zip bomb.

To generate one:

dd if=/dev/zero bs=1G count=10 | gzip -c > 10GB.gz

This creates a ~10MB file that decompresses to 10GB of zeros.

A bot that blindly decompresses the response has to inflate all 10GB, which typically exhausts its memory or crashes it.
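
To see why this works, you can measure how well a run of zeros compresses. The sketch below uses Python's standard `gzip` module on a deliberately small 10 MiB input so it is safe to run anywhere; the function name `make_zero_gzip` is made up, and this is not how the file above was produced.

```python
import gzip
import io


def make_zero_gzip(total_bytes, chunk_size=1024 * 1024):
    """Gzip `total_bytes` of zeros in memory, writing one chunk at a time."""
    buf = io.BytesIO()
    chunk = b"\0" * chunk_size
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        remaining = total_bytes
        while remaining > 0:
            n = min(remaining, chunk_size)
            gz.write(chunk[:n])
            remaining -= n
    return buf.getvalue()


raw_size = 10 * 1024 * 1024  # 10 MiB of zeros, far smaller than the real thing
blob = make_zero_gzip(raw_size)
print(f"{raw_size} bytes of zeros -> {len(blob)} bytes gzipped")
```

The compressed output is a tiny fraction of the input, which is the same asymmetry the 10GB file exploits at a larger scale.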

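Then detect and serve it. The snippet is PHP; ipIsBlackListed() and isMalicious() stand in for your own detection checks, and ZIP_BOMB_FILE_10G is the path to the file generated above:

if (ipIsBlackListed() || isMalicious()) {
    // Tell the client the body is compressed so it tries to decompress it
    header("Content-Encoding: deflate, gzip");
    // Advertise only the small on-disk (compressed) size
    header("Content-Length: " . filesize(ZIP_BOMB_FILE_10G));
    // Stream the pre-built file as the response body, then stop
    readfile(ZIP_BOMB_FILE_10G);
    exit;
}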
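
If your backend is not PHP, the same flow can be sketched as a tiny Python WSGI app. Everything here is an assumption for illustration: the `BLOCKLIST` set, the address in it, and the small 1 MiB stand-in payload built in memory. The sketch declares a single `gzip` coding because that is how its payload is produced.

```python
import gzip

BLOCKLIST = {"203.0.113.7"}  # hypothetical: fill from your own detection logic
PAYLOAD = gzip.compress(b"\0" * (1024 * 1024))  # small stand-in, not a real bomb


def app(environ, start_response):
    """Serve the pre-compressed payload to blocklisted IPs, a normal page otherwise."""
    ip = environ.get("REMOTE_ADDR", "")
    if ip in BLOCKLIST:
        start_response("200 OK", [
            ("Content-Type", "text/html"),
            ("Content-Encoding", "gzip"),            # client will try to inflate it
            ("Content-Length", str(len(PAYLOAD))),   # compressed size only
        ])
        return [PAYLOAD]
    body = b"hello\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Any WSGI server can run `app`; for a quick local test, `wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever()` from the standard library is enough.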

He explains that when traffic spikes, he swaps in a 1MB variant.

It’s a great deterrent for low-effort bots.

Heuristics like repeated scanning and double-visits from spam IPs helped him fine-tune this method.

📎 Also check out this Hacker News discussion for community input on his approach.

Final Thoughts

You don’t need an enterprise WAF to defend your site.

With a bit of log inspection, some config hacks, and creative trolling like zip bombs, you can knock out the majority of disruptive bots.

