This is a Python script that checks the HTTP status of every <a> element on a website, using its sitemap.xml to find the pages.
It produces a report which is stored on your server and, if enabled, emailed to you!
Simply add it to your crontab to run it on a regular basis and stay aware of any dead links that need fixing!
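To make the idea concrete, here is a minimal sketch of the approach described above, using only the standard library. It is not the script's actual code: the names `sitemap_pages`, `AnchorCollector` and `http_status` are hypothetical, and using a HEAD request for the status check is an assumption.

```python
from html.parser import HTMLParser
from typing import List, Optional
from urllib.parse import urljoin
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_pages(sitemap_xml: str) -> List[str]:
    """Return the page URLs listed in the <loc> elements of a sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


class AnchorCollector(HTMLParser):
    """Collect the href of every <a> element in one HTML page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page they were found on.
        absolute = urljoin(self.base_url, href)
        # Skip non-HTTP schemes such as tel: or mailto:.
        if absolute.startswith(("http://", "https://")):
            self.links.append(absolute)


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(url, method="HEAD")  # HEAD is an assumption
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses still carry a status code worth reporting.
        return error.code
    except (urllib.error.URLError, TimeoutError):
        return None
```

A full run would fetch each page returned by `sitemap_pages`, feed its HTML to an `AnchorCollector`, and call `http_status` on every collected link. Some servers reject HEAD requests, so a real checker may need to fall back to GET.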
Sound good?
- Clone the repo
cd ~/
git clone [email protected]:Finnito/link-checker.git
cd link-checker
- Install pipenv
sudo apt install pipenv
- Install the project
pipenv install
- Configure some settings
cp config.example config
nano config
[default]
sitemap_url = http://localhost:1313/sitemap.xml # The script only does XML sitemaps
email_log = yes # Set to "no" to disable email logs
to_email = [email protected] # Who's getting the email reports?
from_email = [email protected] # Ensure the domain meets your mail server requirements
email_subject = Link Checker reports # Customise the subject in case you don't like it!
- Test the script!
python3 -m pipenv run link-checker
- Set up a crontab. I prefer mine to run at 6am each Monday, like so:
crontab -e
# Paste the following
0 6 * * 1 cd ~/link-checker/ && python3 -m pipenv run link-checker
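If you are curious how the settings from the configuration step might be consumed, here is a rough sketch using Python's `configparser` and `email.message`. It is an illustration, not the script's own code: `load_settings` and `build_report_email` are hypothetical names and the example.com addresses are placeholders. Note that `configparser` only strips trailing `#` comments when `inline_comment_prefixes` is set, so the sketch sets it explicitly.

```python
import configparser
from email.message import EmailMessage
from typing import Optional

# Same keys as the config shown above; the addresses are placeholders.
EXAMPLE_CONFIG = """
[default]
sitemap_url = http://localhost:1313/sitemap.xml # The script only does XML sitemaps
email_log = yes # Set to "no" to disable email logs
to_email = you@example.com
from_email = link-checker@example.com
email_subject = Link Checker reports
"""


def load_settings(config_text: str) -> configparser.SectionProxy:
    """Parse the config text and return its [default] section."""
    # Without inline_comment_prefixes, "yes # ..." would be read as the whole value.
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(config_text)
    return parser["default"]


def build_report_email(settings: configparser.SectionProxy, report: str) -> Optional[EmailMessage]:
    """Build the report email, or return None when email_log is disabled."""
    if not settings.getboolean("email_log"):
        return None
    message = EmailMessage()
    message["To"] = settings["to_email"]
    message["From"] = settings["from_email"]
    message["Subject"] = settings["email_subject"]
    message.set_content(report)
    return message
```

The returned message could then be handed to `smtplib.SMTP.send_message`. How the mail actually gets sent depends on your mail server.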
You can alternatively pass the sitemap URL as an argument to the script. This will override the sitemap URL in the config
file, allowing you to check multiple sites with the one script. It might look like this in your crontab:
0 6 * * 1 cd ~/link-checker/ && python3 -m pipenv run link-checker https://finn.lesueur.nz/sitemap.xml
5 6 * * 1 cd ~/link-checker/ && python3 -m pipenv run link-checker https://science.lesueur.nz/sitemap.xml
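The override behaviour can be pictured with a small sketch like the one below. It assumes the URL arrives as the first positional argument, as in the crontab lines above. `resolve_sitemap_url` is a hypothetical helper, not a function from the script.

```python
from typing import Sequence


def resolve_sitemap_url(argv: Sequence[str], configured_url: str) -> str:
    """Prefer a sitemap URL given on the command line over the configured one."""
    # argv[0] is the program name, so the first real argument is argv[1].
    if len(argv) > 1 and argv[1].strip():
        return argv[1].strip()
    return configured_url
```

Called as `resolve_sitemap_url(sys.argv, settings["sitemap_url"])`, this returns the config value when no argument is given and the command-line URL otherwise, which is what lets one checkout serve several sites.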