How to Install and Automate Referrer Spam Protection on an NGINX Server via the CLI
Introduction
Referrer spam is a common nuisance that can pollute your web analytics and clog up your server logs. In this tutorial, I’ll demonstrate how to block referrer spam on your NGINX server by leveraging a community-maintained referrer spam list from Matomo. I’ll guide you through cloning the Matomo Referrer Spam List, processing it into NGINX configuration rules, and automating the update process every 14 days using systemd timers.
Audience & Environment
Important: This tutorial is intended for users with root-level access to their web server. It is specifically tailored for a Debian-based server running NGINX. If you are not comfortable with using the command line as the root user, or if your server does not use Debian or NGINX, you may need to adapt these instructions to suit your environment.
Prerequisites
Before you begin, ensure your server meets the following requirements:
- Operating System: Debian 12 (or another Debian-based distribution)
- Web Server: NGINX
- User Privileges: Root access or sudo privileges
- Required Packages:
  - nginx – The web server software.
  - git – For cloning the referrer spam list repository.
  - rsync – For synchronizing files safely.
  - systemd – For managing services and timers (comes pre-installed on Debian 12).
  - bash – The shell in which the script runs (typically installed by default).
To install any missing packages on Debian-based systems, you can run:
sudo apt-get update
sudo apt-get install git rsync
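Before installing anything, it can help to see which of the required commands are already present. The following is a minimal sketch, not part of the original setup; the function name missing_commands is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print each given command name that is not on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        # command -v exits non-zero when the command cannot be found
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Check the tools this tutorial relies on; no output means nothing is missing.
missing_commands nginx git rsync systemctl bash
```

Run it as root, since nginx is typically installed in a directory that is not on an unprivileged user's PATH.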
Directory Structure and Permissions
For security, the referrer spam list will reside in /etc/nginx/referrer-spam-list. This directory is owned by root, ensuring that only privileged users can modify its contents. A typical listing of the directory might look like:
drwx------ 4 root root 4096 Mar 30 13:08 .
drwxr-xr-x 9 root root 4096 Mar 12 09:50 ..
drwxr-xr-x 8 root root 4096 Mar 30 13:08 .git
drwxr-xr-x 2 root root 4096 Mar 30 13:08 .github
-rw-r--r-- 1 root root 865 Mar 30 13:08 CONTRIBUTING.md
-rw-r--r-- 1 root root 4291 Mar 30 13:08 README.md
-rw-r--r-- 1 root root 254 Mar 30 13:08 composer.json
-rw-r--r-- 1 root root 37045 Mar 30 13:08 spammers.txt
While these permissions are typically sufficient (especially since /etc/nginx is not web-accessible), you might consider further restricting sensitive directories like .git and .github if desired.
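If you do want to tighten those directories, one option is to strip group and other permissions from them. This is a sketch only: restrict_repo_metadata is a hypothetical helper name, and it assumes the directory layout shown in the listing above.

```shell
#!/bin/bash
# Hypothetical helper: remove group/other access from the repository's
# metadata directories (.git and .github) under the given base directory.
restrict_repo_metadata() {
    local base="$1" sub
    for sub in .git .github; do
        # Only touch directories that actually exist
        if [ -d "$base/$sub" ]; then
            chmod -R go-rwx "$base/$sub"
        fi
    done
}
```

As root, it would be called as restrict_repo_metadata /etc/nginx/referrer-spam-list. Because rsync -a copies permissions from the source, a later sync from a fresh clone is likely to reset them, so the call would need to be repeated after each update.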
The Update Script
We will create a Bash script named makespammer.sh that:
- Downloads the latest referrer spam list from GitHub.
- Processes the list to generate NGINX configuration rules.
- Tests and reloads the NGINX configuration if the test passes.
Handling Repository Updates Safely
Instead of deleting the current referrer spam list (which might leave your server unprotected if the clone fails), we’ll clone the repository into a temporary directory and use rsync to update the active directory only if the clone is successful.
The Script Code
Below is the complete script with inline comments explaining each section:
#!/bin/bash
# makespammer.sh
# This script updates the referrer spam list and configures NGINX to block spam referrers.
# Define the target directory for the referrer spam list
TARGET_DIR="/etc/nginx/referrer-spam-list"
# Create a temporary directory for safely cloning the repository
tmpdir=$(mktemp -d)
echo "Cloning the referrer spam list from GitHub..."
# Attempt to clone the repository into the temporary directory
if git clone https://github.com/matomo-org/referrer-spam-list.git "$tmpdir"; then
    echo "Repository cloned successfully."
    # Use rsync to synchronize the temporary directory with the target directory
    # The --delete flag removes files that no longer exist in the source, keeping the list in sync
    rsync -a --delete "$tmpdir/" "$TARGET_DIR/"
else
    echo "Failed to update the referrer spam list from remote repository."
    echo "Keeping the existing list to ensure protection."
fi
# Clean up the temporary directory
rm -rf "$tmpdir"
# Define the NGINX snippet file path
NGINX_SNIPPET="/etc/nginx/snippets/referer_spam.conf"
echo "Updating NGINX configuration snippet..."
# Clear the existing configuration snippet file to avoid duplicates
> "$NGINX_SNIPPET"
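# Process the spammers list and append rules to the configuration snippet
# sed writes a doubled backslash before each dot; read (without -r) collapses it
# to a single one, so NGINX sees \. and matches a literal dot
sort "$TARGET_DIR/spammers.txt" | uniq | sed 's/\./\\\\./g' | while read host; do
    # Skip blank lines: an empty pattern must never become a rule
    [ -n "$host" ] || continue
    echo "if (\$http_referer ~ '$host') { return 403; }" >> "$NGINX_SNIPPET"
done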
echo "Testing NGINX configuration..."
# Test the updated NGINX configuration and reload if the test passes
if nginx -t; then
    echo "NGINX configuration test passed. Reloading NGINX..."
    systemctl reload nginx
else
    echo "NGINX configuration test failed. Please review the changes."
    # Exit non-zero so that a failed run is visible to whatever invoked the script
    exit 1
fi
Key Points:
- Safe Update: The script clones the repository into a temporary directory and then uses rsync to update the active directory. This avoids removing the current spam list if the clone fails.
- NGINX Snippet Generation: It processes spammers.txt to generate NGINX configuration rules, escaping periods (.) for proper regex matching.
- NGINX Reload: The script tests the configuration with nginx -t before reloading NGINX, ensuring that no errors are introduced.
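To make the escaping step concrete, here is a small sketch that runs the same sort, uniq, sed and read pipeline over a throwaway sample file and prints the rules instead of writing them to the snippet. The function name generate_rules and the sample hostnames are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print one NGINX rule per unique host in the given file.
generate_rules() {
    # sed emits two backslashes before each dot; read (without -r) turns
    # them back into one, so each rule ends up containing "\."
    sort "$1" | uniq | sed 's/\./\\\\./g' | while read host; do
        printf "if (\$http_referer ~ '%s') { return 403; }\n" "$host"
    done
}

# Made-up sample input containing one duplicate entry
sample=$(mktemp)
printf '%s\n' 'spam-example.com' 'bad.example.org' 'spam-example.com' > "$sample"
generate_rules "$sample"
rm -f "$sample"
```

With this sample the sketch prints two rules, one matching bad\.example\.org and one matching spam-example\.com, with the duplicate entry removed.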
How the referer_spam.conf Snippet Is Included in NGINX
To activate the referrer spam protection, you need to include the referer_spam.conf snippet inside each server block (or in a common config if you’re using a shared structure like default.conf).
Here’s the line you should add inside your NGINX server block(s):
include snippets/referer_spam.conf;
Example:
server {
listen 80;
server_name example.com;
include snippets/referer_spam.conf;
location / {
try_files $uri $uri/ =404;
}
}
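Once the snippet is included and NGINX has been reloaded, one way to check the block is to send a request with a spoofed Referer header and look at the status code. The sketch below assumes curl is installed (it is not among this tutorial's prerequisites); referer_status is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the HTTP status code returned for a URL when
# the request carries the given Referer header.
referer_status() {
    local url="$1" referer="$2"
    # -s: silent, -o /dev/null: discard the body,
    # -e: set the Referer header, -w: print only the status code
    curl -s -o /dev/null -e "$referer" -w '%{http_code}\n' "$url"
}
```

For example, referer_status http://example.com/ https://listed-spammer.example/ should report 403 when the referrer's domain is one that actually appears in spammers.txt (the name used here is only a placeholder), while an unlisted referrer should return the site's normal status.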
Why This Is Useful
Using a snippet like referer_spam.conf is a modular and reusable approach that provides several benefits, especially when hosting multiple sites (virtual hosts) on a single server:
- Centralized Management: You only need to maintain and update the spam-blocking logic in one place.
- Consistency Across Sites: Ensures every domain or subdomain is equally protected from spam referrers.
- Reduced Errors: Keeps your server blocks clean and easier to read, minimizing the chance of copy-paste mistakes.
- Flexible Automation: Our script overwrites the snippet directly—so your changes propagate to all server blocks on reload.
In short, by using include snippets/referer_spam.conf; in your server blocks, you maintain clean, maintainable configs while benefiting from powerful and centralized spam protection logic.
Automating the Script with Systemd Timers
Manual updates can be error-prone, so we’ll automate the update process using systemd timers. This ensures that your spam protection is kept up-to-date every 14 days.
Step 1: Create the Service Unit
Create a file at /etc/systemd/system/referrer-spam-update.service with the following content:
[Unit]
Description=Update NGINX Referrer Spam List
[Service]
Type=oneshot
ExecStart=/usr/local/bin/makespammer.sh
Make sure to place your makespammer.sh script in /usr/local/bin/ and mark it as executable.
Step 2: Create the Timer Unit
Next, create the timer file at /etc/systemd/system/referrer-spam-update.timer:
[Unit]
Description=Run referrer spam update script every 14 days
[Timer]
OnBootSec=5min
OnUnitActiveSec=14d
Persistent=true
[Install]
WantedBy=timers.target
Timer Explanation:
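- OnBootSec=5min: The timer waits 5 minutes after boot before running the script.
- OnUnitActiveSec=14d: The script is scheduled to run again 14 days after the service was last activated.
- Persistent=true: Intended to catch up on a run that was missed (e.g., because the server was offline). Note that, according to the systemd.timer documentation, this setting only affects OnCalendar= timers; with the monotonic timers used here it has no effect, and it is OnBootSec that triggers a run shortly after every boot.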
Step 3: Enable and Start the Timer
Reload systemd and enable the timer with:
sudo systemctl daemon-reload
sudo systemctl enable --now referrer-spam-update.timer
You can check the status and next run time by listing timers:
sudo systemctl list-timers --all
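To confirm that the automation actually works, you can trigger the service once by hand and then look at its log. The sketch below only uses the unit names defined above; show_update_status is a hypothetical wrapper name, and it has to be run as root.

```shell
#!/bin/bash
# Hypothetical wrapper: run the update once and show what happened.
show_update_status() {
    # Start the oneshot service immediately instead of waiting for the timer
    systemctl start referrer-spam-update.service
    # Show when the timer last fired and when it fires next
    systemctl list-timers referrer-spam-update.timer
    # Print the most recent log lines written by the script
    journalctl -u referrer-spam-update.service -n 20 --no-pager
}
```

The echo messages from makespammer.sh end up in the journal, so the last command is where a failed clone or a failed nginx -t would show up.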
Conclusion
By following these steps, you now have a robust solution to protect your NGINX server from referrer spam. This tutorial has guided you through:
- Downloading and integrating the Matomo Referrer Spam List into your server configuration.
- Creating an update script that keeps the existing spam list in place if the remote repository is unavailable.
- Automating the update process using systemd timers to run the script every 14 days.
This setup is designed for administrators with root-level access on Debian-based servers running NGINX. With the required packages (nginx, git, rsync, and systemd) installed, your server will remain safeguarded against referrer spam, keeping your logs and analytics clean.