Finding broken links in a website
A website needs to be checked for broken links. Doing this manually is only feasible for very small sites, but the task is easy to automate with command-line HTTP tools.
Getting ready
We can use lynx and curl to collect the links and identify the broken ones. lynx's -traversal option recursively visits the pages of the website and builds a list of all the hyperlinks it encounters. curl is then used to verify whether each link actually responds.
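To see what each tool contributes, the two commands can be tried by hand before running the script; the URLs below are only placeholders:

# Crawl the site; lynx writes traverse.dat, reject.dat, and a few
# other files into the current working directory.
lynx -traversal http://example.com > /dev/null

# Fetch only the headers of a single link; a status line without
# a 200/OK response suggests the link may be broken.
curl -I -s http://example.com/somepage | head -n 1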
How to do it...
This script uses lynx and curl to find the broken links on a website:
#!/bin/bash
#Filename: find_broken.sh
#Desc: Find broken links in a website
if [ $# -ne 1 ];
then
  echo -e "Usage: $0 URL\n"
  exit 1;
fi
echo Broken links:
mkdir /tmp/$$.lynx
cd /tmp/$$.lynx
lynx -traversal $1 > /dev/null
count=0;
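# reject.dat, written by the lynx traversal, holds the hyperlinks
# that were collected; sort -u removes the duplicates.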
sort -u reject.dat > links.txt
while read link;
do
  output=`curl -I $link -s \
  | grep -e "HTTP/.*OK" -e "HTTP/.*200"`
  if [[ -z $output ]];
  then
    # No 2xx response; a 301 means the page has moved, not vanished
    output=`curl -I $link -s | grep "HTTP/.*301"`
    if [[ -z $output ]];
    then
      echo $link
      let count++
    fi
  fi
done < links.txt

[ $count -eq 0 ] && echo No broken links found
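The script expects the URL of the site to check as its only argument; a typical run (the URL is again a placeholder) looks like this:

$ bash find_broken.sh http://example.com

The output starts with the Broken links: header, followed by one line per broken link; if none are found, the script prints No broken links found instead. lynx leaves its work files in /tmp/$$.lynx (where $$ is the PID of the shell running the script), and this directory can be removed once the results have been reviewed.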