LinkChecker is a free, open source command-line utility that checks websites and HTML documents for broken links.
LinkChecker features recursive checking, multithreading, output in colored or normal text, HTML, SQL, CSV or a sitemap graph in GML or XML.
Here are some key features of "LinkChecker":
· recursive and multithreaded checking
· output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats
· support for HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and local file links
· restriction of link checking with regular expression filters for URLs
· proxy support
· username/password authorization for HTTP, FTP and Telnet
· honors robots.txt exclusion protocol
· cookie support
· HTML and CSS syntax check
· antivirus check
· a command line interface
· a GUI client interface
· a (Fast)CGI web interface (requires HTTP server)
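Two of the features above, regular expression filters for URLs and the robots.txt exclusion protocol, can be illustrated with a short Python 3 sketch. It is not LinkChecker's own code: the names LinkExtractor and select_links are hypothetical, the example only looks at <a href> links, and it uses nothing beyond the Python standard library.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Collect href targets of <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def select_links(html, base_url, ignore_pattern, robots_lines, agent="*"):
    """Return links from html that pass the regex filter and robots.txt rules.

    ignore_pattern: regular expression; matching URLs are skipped.
    robots_lines: the lines of a robots.txt file, already downloaded.
    """
    extractor = LinkExtractor(base_url)
    extractor.feed(html)

    robots = RobotFileParser()
    robots.parse(robots_lines)

    ignore = re.compile(ignore_pattern)
    return [
        url
        for url in extractor.links
        if not ignore.search(url) and robots.can_fetch(agent, url)
    ]
```

A checker built this way would then request only the URLs returned by select_links, so excluded or filtered links are never fetched.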
Requirements:
· Python 2.5 or later
What's New in This Release:
Features:
· checking: Support URLs.
· logging: Sending a SIGUSR1 signal prints the stack traces of all currently running threads. This makes debugging deadlocks easier.
· gui: Support drag-and-drop of local files. If the local file is a LinkChecker project (.lcp) file, it is loaded; otherwise the check URL is set to the local file URL.
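The SIGUSR1 stack dump described above can be approximated in a few lines of Python 3. This is a minimal sketch of the general technique, not the code shipped with LinkChecker; the function names format_thread_stacks and install_stack_dump_handler are hypothetical. SIGUSR1 exists only on POSIX systems, so the handler is installed only where the signal is available.

```python
import signal
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text dump of the current stack of every running thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    parts = []
    for ident, frame in sys._current_frames().items():
        parts.append("Thread %s (%s):\n" % (names.get(ident, "unknown"), ident))
        parts.extend(traceback.format_stack(frame))
        parts.append("\n")
    return "".join(parts)


def _dump_stacks(signum, frame):
    """Signal handler: write all thread stacks to stderr."""
    sys.stderr.write(format_thread_stacks())


def install_stack_dump_handler():
    """Install the SIGUSR1 handler; return False where SIGUSR1 is missing."""
    if not hasattr(signal, "SIGUSR1"):
        return False
    signal.signal(signal.SIGUSR1, _dump_stacks)
    return True
```

After install_stack_dump_handler() has run in the main thread, sending the signal from a shell with kill -USR1 followed by the process ID prints the dump to stderr. On Python 3 the standard faulthandler module offers a comparable facility through faulthandler.register.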
Changes:
· checking: Increase per-host connection limits to speed up checking.
Fixes:
· checking: Fix a crash when closing a Word document after scanning failed. Closes: GH bug #369
· checking: Catch UnicodeError from idna.encode() fixing an internal error when trying to connect to certain invalid hostnames.
· checking: Always close HTTP connections without body content. See also http://bugs.python.org/issue16298. Closes: GH bug #376
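The idna.encode() fix above follows a common pattern: convert the hostname to its ASCII (IDNA) form before connecting, and treat a UnicodeError as "invalid hostname" instead of letting it surface as an internal error. The Python 3 sketch below shows that pattern with the standard "idna" codec; the helper name encode_hostname is hypothetical and the code is an illustration, not the actual patch.

```python
def encode_hostname(hostname):
    """Return the IDNA (ASCII) form of hostname as bytes, or None if invalid.

    The "idna" codec raises UnicodeError (or a subclass) for names it
    cannot encode, for example a label longer than 63 characters.
    """
    try:
        return hostname.encode("idna")
    except UnicodeError:
        return None
```

A caller can then report a None result as an invalid URL and continue with the next link.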