The last time I was validating my blog’s XHTML, the “tip of the day” from W3C had some great suggestions (as usual). It inspired me to write this post.
Minimum SOPs for web authors and bloggers should include:
- Validate your pages. Use the markup validation as well as the CSS validation services, both offered by W3C.
- Run the link checker against your site’s pages. Find and fix any broken links.
- If you provide syndicated feeds, validate them too.
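For a quick local pass between runs of the W3C link checker, the link-checking step can be sketched in a few lines of Python. This is only a rough illustration using the standard library, not how the W3C tool works; the names `extract_links`, `link_status` and `broken_links`, and the User-Agent string, are made up for the example.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urldefrag, urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) URLs linked from the page, fragments removed."""
    collector = LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")) and absolute not in links:
            links.append(absolute)
    return links


def link_status(url, timeout=10):
    """Return the HTTP status for url, or None if the request failed outright."""
    # Assumption: the server answers HEAD requests; some servers only answer GET.
    request = Request(url, method="HEAD", headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # e.g. 404 for a broken link
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


def broken_links(html, base_url):
    """Yield (url, status) for every link that does not answer with 2xx."""
    for url in extract_links(html, base_url):
        status = link_status(url)
        if status is None or not 200 <= status < 300:
            yield url, status
```

Feed `broken_links` the HTML of one of your pages along with that page's URL, and it yields each link that needs attention. Because `urlopen` follows redirects, a link that is already redirected to a working page is not reported.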
The W3C tips are recommended reading for all web authors. [Re-]read one of these every day and apply what you learn! Your visitors will thank you.
After I completed the registration process for Google sitemaps, I checked my sitemap status and found an extra benefit of this whole rigamarole. From your “My Sitemaps” page, follow the stats link next to one of your submitted URIs. If the page shows
URLs found during our regular crawl process
this indicates broken links pointing into your site, not errors in your sitemap. If you can find the offending link on your own site, go ahead and fix it. If it’s a link from another site, see if you can get the webmaster of that site to fix it (assuming you can find the source of the link). In either case, it’s probably a good idea to put a Permanent Redirect (301) on your own site so that anybody who follows the broken link still lands on the right page.
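How you set up the redirect depends on your server or blogging software, so take the following only as a sketch of the idea in Python's standard library. The `REDIRECTS` table, both paths in it, and the names `redirect_target`, `RedirectHandler` and `serve` are hypothetical; on a real site you would more likely add the same mapping to your web server's configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from broken paths to their correct locations.
REDIRECTS = {
    "/2005/11/old-post": "/2005/11/new-post/",
}


def redirect_target(path):
    """Return the corrected location for a known-broken path, else None."""
    # Ignore any query string when looking up the path.
    return REDIRECTS.get(path.split("?", 1)[0])


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404, "Not Found")
            return
        # 301 tells browsers and crawlers that the move is permanent.
        self.send_response(301)
        self.send_header("Location", target)
        self.end_headers()

    # Answer HEAD requests the same way, since link checkers often use them.
    do_HEAD = do_GET


def serve(port=8000):
    """Run the sketch locally; the port number is arbitrary."""
    HTTPServer(("localhost", port), RedirectHandler).serve_forever()
```

Calling `serve()` and requesting `/2005/11/old-post` should produce a 301 response whose `Location` header names the new path, while any path missing from the table gets a 404.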
Vanessa Fox from Google engineering discusses the new site statistics a tiny bit, but more information is available in the google-sitemaps Google group. I’ve already left a comment there that I think there are two important pieces of information missing from the new site statistics:
- The time that the broken URL was last crawled.
- One or more URLs pointing to the source of the broken link.
Another feature that would be very helpful would be some way to alert Google that a broken URL has been repaired, e.g. via a 301 redirect.