Dead link checker - frequently asked questions

Here are some common questions and their answers.
A description of the features of Dead Link Checker can be found here.
If you have a question that is not answered here, you can contact us at:
info@deadlinkchecker.com
 

Q: What are dead/broken links?

Dead or broken links are links in a web page which do not take the user to where the author intended. This may be because the link was incorrectly set up by the site's author, or because the destination web page no longer exists, or because there is some other problem with accessing the destination (for example, the server is not responding). When a dead link is clicked, an error code (e.g. 404 Not Found, or 500 Server Error) is normally returned by the server, or a timeout occurs if the server does not exist or returns no information within a given time span.
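
As an illustration of the kind of check involved, here is a minimal Python sketch: request a URL, and treat an HTTP error code or a timeout as a dead link. This is illustrative only, not Dead Link Checker's actual implementation, and the timeout value is an arbitrary choice.

    import socket
    import urllib.error
    import urllib.request

    def check_link(url, timeout=10):
        """Return a short status string: 'ok (...)' or 'dead (...)'."""
        request = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                return "ok ({})".format(response.status)   # final status (redirects are followed)
        except urllib.error.HTTPError as e:
            return "dead ({})".format(e.code)              # e.g. 404 Not Found, 500 Server Error
        except (urllib.error.URLError, socket.timeout):
            return "dead (no response or timeout)"         # server missing, unreachable or too slow

    print(check_link("http://www.example.com/"))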

Q: Why do I need to check my website for broken links? How can I check my website for broken links?

Broken links are a source of irritation to users, and will cause your site to rank lower in search engine results. Sites with lower rankings receive fewer visitors. Use deadlinkchecker.com to find the broken links on your site so that they can be fixed, restoring the site's ranking and attracting more web traffic.

Q: Is this service free?

The interactive version of this service is free. You can enter the URL of a website and it will be scanned for dead links. If you have signed in with a valid email address, you can request that a report be emailed to you when the scan is complete, and you can specify multiple websites to be scanned in a single operation. If you make a small monthly payment to subscribe to our automatic checker then you can set up a daily, weekly or monthly schedule for a number of sites to be scanned automatically with no further interaction required. Reports will be emailed to you containing details of any dead links found on your web pages.

Q: What is SEO?

SEO stands for Search Engine Optimisation. Search engines such as Google, Bing, etc. order their results using a number of criteria, including the perceived quality of the site being indexed. SEO is designed to ensure that a site is ranked as high as possible in a search engine's results. By removing dead links on your web pages you can signal to search engines that the site is an actively maintained and reliable source of information.

Q: What do the different server status codes mean?

Some common server status codes are:

  • 302 - found (page temporarily redirected to a new location)
  • 403 - forbidden
  • 404 - page not found
  • 500 - internal server error

More information can be found here.
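
To make the ranges concrete, here is a small illustrative Python helper (not part of Dead Link Checker) that maps a status code to the broad class it belongs to:

    def describe_status(code):
        if 200 <= code < 300:
            return "success"
        if 300 <= code < 400:
            return "redirection (e.g. 302 - temporarily moved)"
        if 400 <= code < 500:
            return "client error (e.g. 403 forbidden, 404 not found)"
        if 500 <= code < 600:
            return "server error (e.g. 500 internal server error)"
        return "unrecognised code"

    for code in (302, 403, 404, 500):
        print(code, describe_status(code))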

Q: How do I schedule a site scan?

Assuming you have subscribed to the auto-checker service and have logged in to DeadLinkChecker.com using your email address, you can set up a scheduled scan by clicking the 'auto check' menu option. Clicking the button 'Add new scan' will pop up a small box in which you can enter the URL of the website to scan, when the next scan should occur, how frequently the site is to be scanned, and the depth of the scan. Clicking the 'Now' button will schedule the scan to run as soon as possible after you click 'Create'.
 
When a scan has been scheduled, you will be able to check on its status, or edit the details of the scan. After the scan has run you will be emailed a report with details of any dead links found.

Q: How do I fix bad links?

To fix a bad link, you need to determine why the link is flagged as bad. If it is referring to a non-existent page (404 error) then it is likely that either the link has been mis-typed in the HTML source (in which case you need to edit it) or the destination page has been temporarily or permanently removed or renamed. If the destination no longer exists then you either need to link to a different page, or remove the link entirely. In any case, if you want to change the link then you will need to be able to edit the HTML of the website, or have access to someone who can do this for you.

Q: Can I check password protected pages?

No, unfortunately it is not possible to use Dead Link Checker to scan password protected pages. To do so would require our server to know and store username and password details for the site being scanned, which would represent a security risk.

Q: Can I restrict the scan to a subdirectory of my site?

Yes - if you start the scan from a page within a subdirectory, then all links found on that page will be checked for existence, but only pages within that subdirectory (or a descendant of it) will be scanned for further links.
 
For example, starting at http://www.example.com/news/index.html would verify links to pages such as http://www.example.com/weather.html, but that page would not be scanned for more links since it is not within the news/ subdirectory. However, the page http://www.example.com/news/current/story1.html would be scanned for links, since it is within the news/ subdirectory.
 
Note that this is a feature of the auto-checker only.
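
The scoping rule can be summarised in a few lines of Python. This sketch only illustrates the logic described above; the function name is ours, not part of the service:

    from urllib.parse import urlparse

    def should_crawl(link_url, start_url):
        """Crawl a linked page for further links only if it lies under the starting directory."""
        start = urlparse(start_url)
        link = urlparse(link_url)
        start_dir = start.path.rsplit("/", 1)[0] + "/"     # /news/index.html -> /news/
        return link.netloc == start.netloc and link.path.startswith(start_dir)

    start = "http://www.example.com/news/index.html"
    print(should_crawl("http://www.example.com/weather.html", start))              # False - verified only
    print(should_crawl("http://www.example.com/news/current/story1.html", start))  # True - scanned for links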

Q: What is a robots.txt file?

At the root of most websites there exists a file called robots.txt - for example, http://www.microsoft.com/robots.txt. This file is used to tell any system that scans the site (such as a link checker, or a search engine's site crawler) which pages or folders should not be scanned or indexed. The robots.txt file can also be used to indicate the location of a sitemap, and to reduce the rate at which automated systems request pages from the site (using the 'crawl-delay' directive). Note that there is no obligation for a scanner to follow the instructions in robots.txt - it is merely a request to behave in a certain way - but well-behaved scanners will attempt to obey the directives. Dead Link Checker follows the 'disallow' and 'crawl-delay' directives where it can. The user-agent used by Dead Link Checker is 'www.deadlinkchecker.com'. For more information on robots.txt, see http://www.robotstxt.org/.
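
As an aside for Python users, the standard library can read these directives, which shows how any well-behaved scanner might apply them. This is a generic sketch using the user-agent quoted above, not Dead Link Checker's own code:

    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("http://www.example.com/robots.txt")
    rp.read()                                          # fetch and parse the robots.txt file

    agent = "www.deadlinkchecker.com"
    print(rp.can_fetch(agent, "http://www.example.com/private/page.html"))   # honours 'disallow'
    print(rp.crawl_delay(agent))                       # seconds from 'crawl-delay', or None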

Q: Can I exclude certain links or subdirectories from the scan?

There are two ways to achieve this. If you are able to edit the robots.txt file on the website's server, you can add a directive which denies the link checker access to specific pages, or all pages within a certain subdirectory. For example, to exclude the shoppingbasket/ folder (and all its descendants), add the following to robots.txt:

    User-agent: www.deadlinkchecker.com
    Disallow: /shoppingbasket/

Alternatively, when using the auto-checker there is an 'advanced' section in the scheduled scan editor where you can specify a list of strings that will be used to prevent specific URLs from being checked. URLs containing any of the strings specified in this section will not be checked for errors.
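
The effect of the exclusion strings can be pictured as a simple substring test. The following Python sketch is illustrative only - the configured strings here are hypothetical examples:

    exclude_strings = ["/shoppingbasket/", "logout"]   # hypothetical exclusion list

    def is_excluded(url):
        return any(s in url for s in exclude_strings)

    print(is_excluded("http://www.example.com/shoppingbasket/item1.html"))   # True - not checked
    print(is_excluded("http://www.example.com/news/index.html"))             # False - checked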

Q: Not all pages on my website have been scanned, why?

Dead Link Checker checks the links on the initial page given to it; if any of those links point to pages on the same website, it also checks those pages for dead links, and so on. The depth of this scan is limited to 10 for a full scan, so any links which are only reachable by traversing more than that many steps will not be checked. In addition, pages which have no incoming links will not be discovered. A further consideration is that a change of subdomain is treated as a different website, so pages on other subdomains will not be scanned for further links (even if they link back to the original domain). For example, when scanning the site http://www.example.com, a link to http://test.example.com would be checked, but the contents of that page would not be further checked for links.
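
The crawl rules above can be sketched as a simple breadth-first traversal. This Python outline is our simplification, not the actual implementation, and extract_links() is a stand-in for real page fetching and HTML parsing:

    from collections import deque
    from urllib.parse import urlparse

    MAX_DEPTH = 10                                     # full-scan depth limit described above

    def extract_links(url):
        return []                                      # placeholder: fetch the page and parse its links

    def crawl(start_url):
        start_host = urlparse(start_url).netloc
        seen = {start_url}
        queue = deque([(start_url, 0)])
        while queue:
            url, depth = queue.popleft()
            for link in extract_links(url):            # every link found here would be verified
                if link in seen:
                    continue
                seen.add(link)
                if urlparse(link).netloc == start_host and depth + 1 < MAX_DEPTH:
                    queue.append((link, depth + 1))    # only same-host pages are crawled further
        return seen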

Q: Why does Dead Link Checker report links as bad although I can open them in my browser?

Under some circumstances Dead Link Checker might be unable to access a page which is accessible from your browser. Our server is currently located in the US, so if there is a problem accessing your site from there, or if your site serves different content depending on (for example) the geo-location of the requester or its user-agent, then you may see errors being reported.
 
Some web browsers will automatically correct URLs which are actually invalid. For example, URLs are not allowed to contain a backslash '\' character; Chrome and Internet Explorer seem to convert it silently to a forward slash '/', but other browsers do not. Dead Link Checker will flag such URLs as errors.
 
Sometimes pages are temporarily unavailable, perhaps due to server loading issues. Dead Link Checker will retry such links after a pause, but if it cannot access the page then it will be marked as a bad link even though it may be possible to reach the page at a later time.
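
The retry behaviour can be pictured as follows; the number of attempts and the pause length below are arbitrary illustrations, not Dead Link Checker's actual settings:

    import time

    def check_with_retry(check, url, attempts=3, pause_seconds=30):
        """check is a function like the check_link() sketch in the first answer."""
        for attempt in range(attempts):
            status = check(url)
            if not status.startswith("dead"):
                return status
            if attempt < attempts - 1:
                time.sleep(pause_seconds)              # pause before retrying a failing link
        return "dead (after retries)"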

Q: Will the tool work on an intranet?

Dead Link Checker is an online tool which will not work on an intranet, because the webpages need to be visible to our server which is external to your intranet.

Q: Will the tool work on dynamic (ASP/JSP/PHP/Rails) sites?

Dead Link Checker will work on dynamic pages. However, it only checks any given URL once in a scanning session, so if a page's content changes from one request to the next, only the links found on the first encounter will be processed.

Q: What is the difference between the Site Checker, Multi-Site Checker, Auto-Checker and the Auto-Checker Premium and Professional services?

Site Checker is a free tool which allows you to scan a single website for dead links. Multi-Site Checker is also free to use but requires an email address to be used as a login name. You can then scan multiple sites in one sitting, and have a report automatically emailed to you at the end of the scan. Auto-Checker is our entry-level subscription service. For a small monthly fee you can have up to five sites scanned automatically on a regular schedule, with no further interaction required on your part. A report will be emailed to you after each scan, and is also available online. Auto-Checker Premium and Professional allow you to check a larger number of sites, with more links on each site.

Q: How can I reduce the load on my server when it is being scanned?

Dead Link Checker has been optimised to scan websites as quickly as possible, whilst automatically adjusting its scan rate to reduce server errors. However some servers may struggle if pages are requested too quickly, or the requests may be interpreted as a Denial-Of-Service attack. You can slow down the rate at which pages are requested by modifying the robots.txt file on your server, to include a section:

    User-agent: www.deadlinkchecker.com
    Crawl-delay: 1

This will restrict the page requests to approximately one per second. The scan will be slowed considerably, but the server load will be reduced in proportion.
 
Alternatively when using the auto-check feature, you can access the 'advanced' settings in the scheduled scan editor and enter a value for the 'Interval' to specify a minimum duration between successive page requests on the website being scanned.
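
For illustration, a crawler honouring 'Crawl-delay: 1' (or a one-second 'Interval') might space its requests like this; the sketch below is generic Python, not Dead Link Checker's code:

    import time

    class RateLimiter:
        def __init__(self, min_interval_seconds):
            self.min_interval = min_interval_seconds
            self.last_request = 0.0

        def wait(self):
            elapsed = time.monotonic() - self.last_request
            if elapsed < self.min_interval:
                time.sleep(self.min_interval - elapsed)   # space the requests out
            self.last_request = time.monotonic()

    limiter = RateLimiter(1.0)                         # approximately one request per second
    for url in ("http://www.example.com/a.html", "http://www.example.com/b.html"):
        limiter.wait()
        # ... request url here ...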

Q: How does Dead Link Checker identify itself when scanning a web site?

Dead Link Checker uses the user-agent string 'www.deadlinkchecker.com' when requesting web pages. At present, the IP address is 23.250.3.10 but this may change in the future and should not be relied upon.
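
For reference, a request carrying a custom user-agent string looks like this in Python; the sketch is generic and simply uses the string quoted above:

    import urllib.request

    request = urllib.request.Request(
        "http://www.example.com/",
        headers={"User-Agent": "www.deadlinkchecker.com"},
    )
    # On the server side, this string can be matched in access logs
    # or in robots.txt user-agent rules.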

Q: How can I check an on-going auto-checker scan?

When using the auto-checker you can visit the Auto Check page to see a list of the scan schedules you have created. The column headed 'Status' indicates whether a scan is in progress, pending, or disabled. Clicking on this indicator for a site marked as 'scanning' pops up a box in which you can see statistics on the scan, the URL of a recently discovered link, and buttons which allow you to request an intermediate report (without ending the scan), end the scan early, or close the pop-up box.

Q: Can I reduce the number of emails I receive?

When using the auto-checker you can edit the 'advanced' scan settings and tick the checkbox marked 'suppress email if no errors' - this will stop Dead Link Checker from sending you an email when a scheduled scan detects no errors.

Q: Can I change the email address that reports are sent to?

For security reasons, Dead Link Checker will only email reports to the registered account holder. However, it should be possible to configure your email software to forward the email to a third party, based on the sender and/or subject line.

Q: Will Dead Link Checker's scan affect Google Analytics results?

You can configure Google Analytics to filter out all requests from Dead Link Checker's IP address (see 'How does Dead Link Checker identify itself?'). For instructions on configuring Google Analytics, see here.

Q: Do pages with different query strings count as different pages?

Server-generated web pages can alter their content depending on parameters passed to them, so Dead Link Checker regards pages with differing query strings as being distinct pages which are checked separately.
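
This can be seen directly with Python's URL parsing, as an illustration - two URLs that differ only in their query string compare as different pages:

    from urllib.parse import urlsplit

    a = urlsplit("http://www.example.com/item.php?id=1")
    b = urlsplit("http://www.example.com/item.php?id=2")
    print(a.path == b.path)    # True - same server-side script
    print(a == b)              # False - different query strings, so distinct pages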

Q: Can I import scan results into a spreadsheet such as Excel?

The reports generated by the Auto-Check tool can be viewed in a tab-delimited CSV format as well as HTML format. The CSV file can be saved to your computer and imported into Excel etc. for further analysis.
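
For example, a tab-delimited report can also be loaded programmatically with a few lines of Python; the filename and column layout here are assumptions for illustration:

    import csv

    with open("deadlinkchecker_report.csv", newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            print(row)         # each row is a list of the report's tab-separated fields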



