Let's get started.
As smart as the Google spider is, it can still miss pages on your site. Maybe you have an orphaned page that is no longer in your navigation. Or perhaps you moved a link to a piece of content so that it is no longer easy to reach. It's also possible that your site is so big that Google can't crawl all of it without consuming all of your server's resources. Not pretty!
The solution is a sitemap.
In 2005, Google started supporting XML sitemaps. Soon after, Yahoo came out with its own standard, and other search engines started to follow suit. Fortunately, in 2006, Google, Yahoo, Microsoft, and a handful of smaller players got together and agreed to support the same sitemap specification, which they published at http://sitemaps.org. That made it much easier for site owners to make sure every page of their website is crawled and added to the search engine index.
Shortly thereafter, the Drupal community stepped up and created a module called (surprise!) the XML sitemap module. This module automatically generates an XML sitemap containing every node and taxonomy term on your Drupal site. The module was written by Matthew Loar as part of the Google Summer of Code. The Drupal 6 version of the module was developed by Kiam LaLuno. Finally, in mid-2009, Dave Reid began working on version 2.0 of the module to address performance, scalability, and reliability issues. Thanks, guys!
According to www.sitemaps.org:
Sitemaps are an easy way for Webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.
Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata.
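To make the description above concrete, here is a minimal sketch of a sitemap file built with Python's standard library. The page list and the example.com URLs are placeholders, and build_sitemap is a hypothetical helper written for illustration. It is not how the Drupal module generates its output.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (as a string) for a list of page dicts."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # <loc> is the only required child of <url>.
        ET.SubElement(url, "loc").text = page["loc"]
        # The remaining tags are optional metadata about the URL.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages on a placeholder domain.
pages = [
    {"loc": "http://www.example.com/", "lastmod": "2009-06-01",
     "changefreq": "daily", "priority": 1.0},
    {"loc": "http://www.example.com/about", "priority": 0.5},
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Each url entry carries the page address plus the optional metadata named in the quote: when the page was last updated, how often it changes, and how important it is relative to the other URLs on the site.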
Using a sitemap does not guarantee that every page will be included in the search engines' indexes. Rather, it helps the search engine crawlers find more of your pages. In my experience, submitting an XML sitemap to Google greatly increases the number of pages returned when you do a site: search.
A site: keyword search shows you how many pages of your site are included in the search engine index, as shown in the following screenshot:
The XML Sitemap module creates a sitemap that conforms to the sitemaps.org specification.
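If you want a rough idea of what conforming to the specification involves, the following sketch checks a few of the protocol's rules: the root element and namespace, the required loc tag, the allowed changefreq values, and the 0.0 to 1.0 priority range. The check_sitemap function and the sample data are hypothetical, and this is far from a full validator.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
CHANGEFREQS = {"always", "hourly", "daily", "weekly",
               "monthly", "yearly", "never"}


def check_sitemap(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != NS + "urlset":
        problems.append("root element is not <urlset> in the sitemap namespace")
    for i, url in enumerate(root.findall(NS + "url"), start=1):
        loc = url.findtext(NS + "loc")
        if not loc or not loc.strip():
            problems.append(f"entry {i}: missing <loc>")
        freq = url.findtext(NS + "changefreq")
        if freq is not None and freq.strip() not in CHANGEFREQS:
            problems.append(f"entry {i}: unknown changefreq {freq!r}")
        priority = url.findtext(NS + "priority")
        if priority is not None:
            try:
                in_range = 0.0 <= float(priority) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append(
                    f"entry {i}: priority {priority!r} is not in 0.0-1.0")
    return problems


# Hypothetical sitemap with one out-of-range priority.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc><priority>0.8</priority></url>
  <url><loc>http://www.example.com/blog</loc><priority>1.5</priority></url>
</urlset>"""

problems = check_sitemap(sample)
for problem in problems:
    print(problem)
```

A sitemap produced by the module should pass checks like these without any effort on your part. A sketch of this kind is mainly useful for sitemaps that were written by hand.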
Which XML Sitemap module should you use?
There are two versions of the XML Sitemap module for Drupal 6. The 1.x version is, as of this writing, considered the stable release and should be used for production sites. However, if you have a site with more than about 2,000 nodes, you should probably consider using the 2.x version. From www.drupal.org: 'The 6.x-2.x branch is a complete refactoring with considerations for performance, scalability, and reliability. Once the 6.x-2.x branch is tested and upgradeable, the 6.x-1.x branch will no longer be supported'. What this means is that in the next few months (quite possibly by the time you're reading this) everyone should be using the 2.x version of this module. That's the beauty of open source software: there are always improvements coming that make your Drupal site better optimized for search engines.
Carry out the following steps to set up the XML Sitemap module:
Now that you have the XML sitemap module properly installed and configured, you can start defining the priority of the content on your site. By default, the priority is 0.5. However, there are times when you may want Google to visit some content more often, and other times when you may not want your content in the sitemap at all (such as comment forms or the Contact Us submission form).
Each node now has an XML sitemap section that looks like the following screenshot:
Before you turn on any included modules, consider what pieces of content on your site you want to show up in the search engines and only turn on the modules you need.
What is priority and how does it work?
Priority is an often-misunderstood part of a sitemap. It is only used to compare pages within your own site, so you cannot improve your ranking in the search engine results pages (SERPs) by increasing the priority of your pages. However, it does let the search engines know which pages of your site you feel are more important. They could use this information to choose between two different pages on your site when deciding which one to show to a search engine user.
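As an illustration of how priority works as a relative signal, the sketch below gives each page the 0.5 default unless you override it, leaves excluded pages out entirely, and lists the rest from most to least important. The sitemap_entries function and the paths are hypothetical. The module itself handles all of this through its settings pages.

```python
DEFAULT_PRIORITY = 0.5  # the default priority mentioned in the text


def sitemap_entries(paths, overrides=None, excluded=()):
    """Return (path, priority) pairs, most important first.

    overrides maps a path to a priority between 0.0 and 1.0;
    excluded lists paths to leave out of the sitemap entirely.
    """
    overrides = overrides or {}
    entries = []
    for path in paths:
        if path in excluded:
            continue
        priority = overrides.get(path, DEFAULT_PRIORITY)
        if not 0.0 <= priority <= 1.0:
            raise ValueError(
                f"priority for {path} must be between 0.0 and 1.0")
        entries.append((path, priority))
    # sorted() is stable, so pages with equal priority keep their order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


# Hypothetical site paths.
paths = ["/", "/products", "/about", "/blog/old-post", "/contact"]
entries = sitemap_entries(
    paths,
    overrides={"/": 1.0, "/products": 0.8, "/blog/old-post": 0.2},
    excluded={"/contact"},
)
for path, priority in entries:
    print(f"{priority:.1f}  {path}")
```

Notice that setting every page to 1.0 would leave all the entries tied, which tells the search engines nothing about which pages you consider more important than the others.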