Mephisto (using 7.3 edge) has some nice built-in Atom feeds. However, I could not figure out how to get a full-site feed; everything was by section. Specifically, I wanted a feed that followed the sitemaps.org specification that Google, MSN and Yahoo use to spider a site.
Kudos to Joseph Moore for doing 90% of the work. His original version used Google's version of the spec (0.84) and served everything at http://mydomain.com/sitemap/, but the spiders look for sitemap.xml unless told otherwise.
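For anyone who has not looked at the format, here is a minimal sketch in plain Ruby of what a one-entry sitemaps.org 0.9 document looks like. The `sitemap_for` helper is hypothetical (it is not part of Mephisto or of Joseph's code), it assumes the URL is already XML-escaped, and it exists only to show the shape of the output and the timestamp format used for lastmod.

```ruby
# Hypothetical helper for illustration only: builds a one-entry sitemap by hand.
# Assumes `loc` is already XML-escaped (the spec requires entity escaping).
def sitemap_for(loc, lastmod, changefreq, priority)
  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>#{loc}</loc>
        <lastmod>#{lastmod.strftime("%Y-%m-%dT%H:%M:%S%:z")}</lastmod>
        <changefreq>#{changefreq}</changefreq>
        <priority>#{format("%.1f", priority)}</priority>
      </url>
    </urlset>
  XML
end

puts sitemap_for("http://mydomain.com/", Time.utc(2007, 12, 1, 9, 30, 0), "daily", 1.0)
```

The `%:z` directive writes the UTC offset with a colon (for example `+00:00`), which is the form the W3C datetime format expects.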
Here are the changes I made to /app/views/sitemap/index.rxml:
# see https://www.google.com/webmasters/tools/docs/en/protocol.html
# http://www.sitemaps.org/protocol.php
xml.instruct! :xml, :version => "1.0", :encoding => "UTF-8"
xml.urlset(:xmlns => "http://www.sitemaps.org/schemas/sitemap/0.9") do
  time_zone = TimeZone.new(@site.timezone.current_period.utc_offset)
  # Priority is a relative weighting from 0.0 to 1.0; the default is 0.5 if not specified.
  # give priority to the homepage (daily, 1.0)
  # give priority to the subdirectories (daily, 0.8)
  # else priority to the articles based on age
  #   <  1 week   = 0.9 and daily
  #   >= 1 week   = 0.8 and weekly
  #   >= 1 month  = 0.5 and monthly
  #   >= 6 months = 0.3 and yearly
  # give priority to the homepage (daily, 1.0)
  xml.url do
    xml.loc("http://#{request.host_with_port}#{request.relative_url_root}/")
    xml.lastmod(Date.today.strftime("%Y-%m-%dT%H:%M:%S#{time_zone.formatted_offset}"))
    xml.changefreq("daily")
    xml.priority("1.0")
  end
  @sections.each do |section|
    if section.name.downcase != "home" # exclude the home page!
      xml.url do
        xml.loc("http://#{request.host_with_port}#{request.relative_url_root}/#{Section.permalink_for(section.name.to_s)}")
        # fudge the date to be recent.
        xml.lastmod((Date.today - 1).strftime("%Y-%m-%dT%H:%M:%S#{time_zone.formatted_offset}"))
        xml.changefreq("daily")
        xml.priority("0.8")
      end
    end
  end
  @articles.each do |article|
    xml.url do
      xml.loc("http://#{request.host_with_port}#{request.relative_url_root}#{site.permalink_for(article)}")
      xml.lastmod(article.updated_at.strftime("%Y-%m-%dT%H:%M:%S#{time_zone.formatted_offset}"))
      # age of the article in whole days (Date - Date yields a number of days)
      age = (Date.today - Date.parse(article.updated_at.to_s)).to_i
      if age >= 180
        xml.changefreq("yearly")
        xml.priority("0.3")
      elsif age >= 30
        xml.changefreq("monthly")
        xml.priority("0.5")
      elsif age >= 7
        xml.changefreq("weekly")
        xml.priority("0.8")
      else
        xml.changefreq("daily")
        xml.priority("0.9")
      end
    end
  end
end
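The age-based rules in the view are easy to check outside of Rails. Below is a rough sketch that pulls them into a plain Ruby method; the name `sitemap_hints` is my own invention for illustration, and the thresholds are simply copied from the view.

```ruby
require "date"

# Hypothetical helper: maps an article's last-updated date to the
# [changefreq, priority] pair used in the view above.
def sitemap_hints(updated_on, today = Date.today)
  age = (today - updated_on).to_i # age in whole days
  if age >= 180
    ["yearly", "0.3"]
  elsif age >= 30
    ["monthly", "0.5"]
  elsif age >= 7
    ["weekly", "0.8"]
  else
    ["daily", "0.9"]
  end
end

changefreq, priority = sitemap_hints(Date.today - 10)
puts "#{changefreq} #{priority}"
```

Passing `today` explicitly makes the boundaries (7, 30 and 180 days) simple to test without depending on the current date.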
Here are the changes I made to /app/controllers/sitemap_controller.rb to support the section lookup.
class SitemapController < ApplicationController
  layout nil
  session :off

  def index
    @sections = site.sections.find(:all)
    @articles = Article.find(:all, :conditions => "published_at is not null")
  end
end
I’ve also added an additional route to lib/mephisto/routing.rb to recognize http://mydomain.com/sitemap.xml.
def self.connect_with(map)
  # Allows access to the sitemap!
  map.connect 'sitemap', :controller => 'sitemap'
  map.connect 'sitemap.xml', :controller => 'sitemap'
  …
When I get the chance I’ll make this a plugin. Very handy.
Update:
I forgot the xml.instruct! line that declares the charset in the sitemap view (index.rxml). I’ve added it above.
Dusty / December 1, 2007
Thanks a bunch for the improved script! I too was disappointed when I found that Joseph’s script didn’t include such features as the homepage and section pages. Thanks to your script they’re all accounted for!
Jim Mirkalami / February 9, 2008
I have been visiting this site a lot lately, so I thought it would be a good idea to show my appreciation with a comment.
Thanks,
Jim Mirkalami
PS: I am a single dad.
Galen King / March 5, 2008
Thank you for this. Have you made any progress on creating it as a plugin?