After I completed the registration process for Google Sitemaps, I checked my sitemap status and found an extra benefit of this whole rigamarole. From your “My Sitemaps” page, follow the stats
link next to one of your submitted URIs. If the page shows
URLs found during our regular crawl process
this indicates broken links pointing into your site, not broken entries in your sitemap. If you can find the offending link on your own site, go ahead and fix it. If it’s a link from another site, see if you can get the webmaster of that site to fix it (assuming you can find the source of the link). In either case, it’s probably a good idea to put a Permanent Redirect (301) on your own site to send anybody who follows the broken link to the right page.
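As a rough illustration of the Permanent Redirect idea, here is a minimal sketch using Python's standard-library http.server. The paths in REDIRECTS, the resolve_redirect helper, and the port are all made up for the example; on a real site you would more likely configure the redirect in your web server or blog software.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from broken inbound paths to their correct targets.
REDIRECTS = {
    "/old-page.html": "/new-page/",
    "/blog/2005/typo-in-slug": "/blog/2005/correct-slug",
}


def resolve_redirect(path):
    """Return the corrected location for a known broken path, or None."""
    # Ignore any query string when looking up the path.
    return REDIRECTS.get(path.split("?", 1)[0])


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = resolve_redirect(self.path)
        if target is None:
            self.send_error(404, "Not Found")
            return
        # 301 tells browsers and crawlers that the move is permanent.
        self.send_response(301)
        self.send_header("Location", target)
        self.end_headers()


def serve(port=8000):
    """Run the sketch locally; the port number is an arbitrary choice."""
    HTTPServer(("localhost", port), RedirectHandler).serve_forever()
```

Calling serve() and then requesting /old-page.html from localhost should produce a 301 response with a Location header pointing at /new-page/.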
Vanessa Fox from Google engineering discusses the new site statistics briefly, but more information is available in the google-sitemaps Google Group. I’ve already left a comment there saying that I think two important pieces of information are missing from the new site statistics:
- The time that the broken URL was last crawled.
- One or more URLs pointing to the source of the broken link.
Another very helpful feature would be some way to alert Google that a broken URL has been repaired, e.g. via a 301 redirect.