WebsiteMaintenance

webcheck

The webcheck tool performs the following tasks:

  • generate site map
  • check for bad links
  • check pages for standards conformance
  • and more


Debian package

webcheck


References:

Homepage 
http://ch.tudelft.nl/~arthur/webcheck/
man page 
man webcheck


Usage examples:

  • webcheck http://www.herzbube.ch
    • checks everything below the specified URL
    • if a URL without the prefix www.herzbube.ch is encountered, it is treated as external: the URL itself is visited, but crawling does not continue beyond it
    • robots.txt is honored
  • webcheck --avoid-external http://www.herzbube.ch
    • does not check URLs outside of www.herzbube.ch
    • speeds up the whole run, because external URLs are skipped
  • webcheck --continue
    • resumes the work of a previous run
    • --yank options are ignored in this mode
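
Illustration of the internal/external rule:

The distinction described in the first usage example (URLs below the start URL are crawled, everything else counts as external) can be sketched in a few lines of Python. This is only an illustration, not webcheck's actual code. The function name is_internal and the exact matching rules (same scheme and host, path at or below the base path) are assumptions made for this example.

```python
from urllib.parse import urlsplit


def is_internal(url: str, base: str) -> bool:
    """Return True if url lies at or below base.

    Hypothetical helper: the matching rules are an assumption for
    illustration and may differ from what webcheck really does.
    """
    u, b = urlsplit(url), urlsplit(base)

    # A different scheme or host always means "external".
    if (u.scheme.lower(), u.netloc.lower()) != (b.scheme.lower(), b.netloc.lower()):
        return False

    # Compare paths on directory boundaries so that a base of /wiki
    # does not accidentally match /wikipedia.
    base_path = b.path.rstrip("/")
    url_path = u.path or "/"
    return url_path == base_path or url_path.startswith(base_path + "/")


if __name__ == "__main__":
    base = "http://www.herzbube.ch"
    for candidate in (
        "http://www.herzbube.ch/blog/index.html",
        "http://ch.tudelft.nl/~arthur/webcheck/",
    ):
        kind = "internal" if is_internal(candidate, base) else "external"
        print(kind, candidate)
```

Under these assumptions, a crawler would follow links found on internal pages, while an external page would only be fetched once to verify that the link works.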