Greg Benedict

Thoughts on the web and creativity.

Auto-Generated sitemap.xml File for Mephisto

Mephisto (I'm using 7.3 edge) has some nice built-in Atom feeds. However, I could not figure out how to get a full-site feed; everything was by section. Specifically, I wanted a feed that follows the sitemaps.org specification that Google, MSN, and Yahoo use to spider a site.

Kudos to Joseph Moore for doing 90% of the work. His original version used a Google version of the spec (0.84) and served everything at http://mydomain.com/sitemap/, but the spiders look for sitemap.xml unless told otherwise.
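For orientation, a sitemaps.org 0.9 file is a urlset of url entries, each with a loc and optional lastmod, changefreq, and priority elements. The sketch below builds one in plain Ruby, outside Mephisto and without Builder. The sitemap_xml and escape_xml helpers and the sample URL are made up for illustration; they are not part of Mephisto or of Joseph's script.

```ruby
require "time" # provides Time#xmlschema on older Rubies

# Hypothetical helper: escape the characters that must not appear raw in XML text.
def escape_xml(text)
  text.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
end

# Hypothetical helper: entries is an array of hashes with
# :loc (String), :lastmod (Time), :changefreq (String), :priority (String).
def sitemap_xml(entries)
  lines = ['<?xml version="1.0" encoding="UTF-8"?>',
           '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
  entries.each do |entry|
    lines << "  <url>"
    lines << "    <loc>#{escape_xml(entry[:loc])}</loc>"
    lines << "    <lastmod>#{entry[:lastmod].xmlschema}</lastmod>"
    lines << "    <changefreq>#{entry[:changefreq]}</changefreq>"
    lines << "    <priority>#{entry[:priority]}</priority>"
    lines << "  </url>"
  end
  lines << "</urlset>"
  lines.join("\n")
end

puts sitemap_xml([
  { :loc => "http://mydomain.com/", :lastmod => Time.utc(2007, 6, 30),
    :changefreq => "daily", :priority => "1.0" }
])
```

The rxml view produces the same shape of document, with Builder doing the escaping and tag nesting.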

Here are the changes I made to /app/views/sitemap/index.rxml

# see https://www.google.com/webmasters/tools/docs/en/protocol.html
# http://www.sitemaps.org/protocol.php
xml.instruct! :xml, :version => "1.0", :encoding => "UTF-8"
xml.urlset(:xmlns => "http://www.sitemaps.org/schemas/sitemap/0.9") do
  time_zone = TimeZone.new(@site.timezone.current_period.utc_offset)

  # Priority is a relative weighting from 0.0 to 1.0; the default is 0.5 if not specified.
  # give priority to the homepage (daily, 1.0)
  # give priority to the subdirectories (daily, 0.8)
  # else priority to the articles based on age
  #   <  1 week   = 0.9 and daily
  #   >= 1 week   = 0.8 and weekly
  #   >= 1 month  = 0.5 and monthly
  #   >= 6 months = 0.3 and yearly

  # give priority to the homepage (daily, 1.0)
  xml.url do
    xml.loc("http://#{request.host_with_port}#{request.relative_url_root}/")
    xml.lastmod(Date.today.strftime("%Y-%m-%dT%H:%M:%S#{time_zone.formatted_offset}"))
    xml.changefreq("daily")
    xml.priority("1.0")
  end

  @sections.each do |section|
    if section.name.downcase != "home" # exclude the home page!
      xml.url do
        xml.loc("http://#{request.host_with_port}#{request.relative_url_root}/#{Section.permalink_for(section.name.to_s)}")
        # fudge the date to be recent.
        xml.lastmod((Date.today - 1).strftime("%Y-%m-%dT%H:%M:%S#{time_zone.formatted_offset}"))
        xml.changefreq("daily")
        xml.priority("0.8")
      end
    end
  end

  @articles.each do |article|
    xml.url do
      xml.loc("http://#{request.host_with_port}#{request.relative_url_root}#{site.permalink_for(article)}")
      xml.lastmod(article.updated_at.strftime("%Y-%m-%dT%H:%M:%S#{time_zone.formatted_offset}"))

      # Subtracting two Dates already yields days, so no seconds-per-day math is needed.
      age = (Date.today - Date.parse(article.updated_at.to_s)).to_i
      if age >= 180
        xml.changefreq("yearly")
        xml.priority("0.3")
      elsif age >= 30
        xml.changefreq("monthly")
        xml.priority("0.5")
      elsif age >= 7
        xml.changefreq("weekly")
        xml.priority("0.8")
      else
        xml.changefreq("daily")
        xml.priority("0.9")
      end
    end
  end
end
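The age tiers in the view can be tried outside Rails. Here is a minimal sketch, assuming the same thresholds as the view (7, 30, and 180 days); the sitemap_hints and age_in_days helpers are hypothetical names for illustration, not something Mephisto provides.

```ruby
require "date"

# Hypothetical helper: map an article's age in days to the
# [changefreq, priority] pair used by the tiers in the view.
def sitemap_hints(age_in_days)
  if age_in_days >= 180
    ["yearly", "0.3"]
  elsif age_in_days >= 30
    ["monthly", "0.5"]
  elsif age_in_days >= 7
    ["weekly", "0.8"]
  else
    ["daily", "0.9"]
  end
end

# Hypothetical helper: age in whole days. Date - Date returns a
# Rational number of days, so to_i is all that is needed.
def age_in_days(updated_on, today = Date.today)
  (today - updated_on).to_i
end

today = Date.new(2007, 6, 30)
[Date.new(2007, 6, 29), Date.new(2007, 6, 20),
 Date.new(2007, 5, 1), Date.new(2006, 6, 30)].each do |updated_on|
  changefreq, priority = sitemap_hints(age_in_days(updated_on, today))
  puts "#{updated_on}: #{changefreq} #{priority}"
end
```

Pulling the tiers into a helper like this would also make them easy to adjust in one place if the view grows.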

Here are the changes I made to /app/controllers/sitemap_controller.rb to support the section lookup.

class SitemapController < ApplicationController
  layout nil
  session :off

  def index
    @sections = site.sections.find(:all)
    @articles = Article.find(:all, :conditions => "published_at is not null")
  end

end

I’ve also added a route to lib/mephisto/routing.rb so that http://mydomain.com/sitemap.xml is recognized.

def self.connect_with(map)

  # Allows access to the sitemap!
  map.connect 'sitemap', :controller => 'sitemap'
  map.connect 'sitemap.xml', :controller => 'sitemap'

When I get the chance I’ll make this a plugin. Very handy.

Update:

I forgot the xml.instruct! line that declares the encoding in the index.rxml file. I’ve added it above.

3 Responses

  1. Thanks a bunch for the improved script! I too was disappointed when I found that Joseph’s script didn’t include such features as the homepage, and section pages. Thanks to your script they’re all accounted for!

  2. I have been visiting this site a lot lately, so I thought it would be a good idea to show my appreciation with a comment.

    Thanks,
    Jim Mirkalami

    PS: I am a single dad. ;)

  3. Thank you for this. Have you made any progress on creating it as a plugin?
