Tag Archives: pligg

Where’s the xml sitemap?

Someone contacted me through the contact form to ask where the XML sitemap generated by the XML sitemap module for Pligg can be found.

If he had read my first post about this module, I think he would eventually have figured out where it is, but since that post was written a long time ago, let me answer the question here.

I am answering in a post rather than privately because others may run into the same problem, and I hate answering the same question over and over.

The module doesn't generate a single sitemap but a sitemap index (basically just a list of sitemaps in XML), and unless you're using the cache, the module generates it every time someone requests the sitemap's URL.

If you're not using friendly URLs for sitemaps, the URL of the sitemap index will be:
http://yourpliggsite.com/module.php?module=xml_sitemaps_show_sitemap
If you want to use friendly URLs for the sitemap, you will have to configure it as described here

The last time I checked (when I first created the module), ask.com could not be pinged unless your sitemap URL looked like a static URL and/or ended in .xml, which is why I built the module with this option. If you don't care about pinging ask.com, or if ask.com has changed its policy (can anyone check this?), then you don't need friendly URLs for sitemaps.

In the future I would appreciate it if such questions were asked in the comments instead of through private contact. I prefer the comments for questions about my posts or the code in them because that way others can benefit from the answers or contribute their own.

The contact form is for private matters such as consultancy requests, business proposals or anything else that doesn't fit into the comments.

Xml Sitemaps pligg module v0.9

This is a quick release, just like the previous one, and it fixes just one thing.

All previous versions had the problem that URLs were not URL-encoded, so URLs containing special characters, such as letters with accents or other diacritics, were invalid, and of course Google would show an error on those sitemaps.

Version 0.9 escapes those URLs, so those of you with such special characters in your URLs can finally enjoy this module.
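To show what this kind of escaping involves, here is a minimal Python sketch. It is not the module's PHP code: `sitemap_loc` is a hypothetical helper and the example URL is made up. It percent-encodes non-ASCII characters as UTF-8 and then escapes the XML special characters, so the result is safe inside a `<loc>` element.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape


def sitemap_loc(url: str) -> str:
    """Return a URL that is safe to place inside a sitemap <loc> element."""
    # Percent-encode everything outside ASCII (as UTF-8) while leaving the
    # URL's own delimiters and any existing %XX escapes untouched.
    encoded = quote(url, safe=":/?&=#%+@,;~")
    # Escape the XML special characters (&, <, >).
    return escape(encoded)


print(sitemap_loc("http://example.com/story.php?title=café&id=7"))
# http://example.com/story.php?title=caf%C3%A9&amp;id=7
```

Note the two separate steps: the accented letter becomes `%C3%A9`, and the `&` in the query string becomes `&amp;` because the URL ends up inside an XML document.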

It seems the module is getting closer to version 1.0. If you have a suggestion for a feature you would like in 1.0, or you have found another bug that needs fixing, don't hesitate to let me know about it.

download v 0.9 from the module's page

Xml Sitemaps pligg module v0.8

It seems the last version of the XML Sitemaps module for Pligg didn't really fix the date format problem in the generated sitemaps.

Back when this module was created, Google had less strict rules about the date format in the lastmod element. My module generated a date and time string in the format YYYY-mm-ddTHH:MM:SS, which was accepted then, but now such a string is only valid if it also contains the timezone offset in the format +/-HH:MM, or if it drops the time part altogether.

So here is another update to the module: lastmod now also contains the timezone offset, so Google considers the sitemap valid.
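To make the two valid formats concrete, here is a small Python sketch. It is only an illustration, not the module's PHP code; the function names and the sample timestamp are my own assumptions.

```python
from datetime import datetime, timedelta, timezone


def lastmod(ts: datetime) -> str:
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:MM:SS+HH:MM."""
    if ts.tzinfo is None:
        raise ValueError("lastmod needs a timezone-aware datetime")
    # For aware datetimes, isoformat() appends the UTC offset as +HH:MM or -HH:MM.
    return ts.replace(microsecond=0).isoformat()


def lastmod_date_only(ts: datetime) -> str:
    """The other valid option: drop the time part entirely."""
    return ts.date().isoformat()


# Hypothetical story timestamp in a UTC+2 timezone.
story_time = datetime(2009, 3, 14, 15, 9, 26, tzinfo=timezone(timedelta(hours=2)))
print(lastmod(story_time))            # 2009-03-14T15:09:26+02:00
print(lastmod_date_only(story_time))  # 2009-03-14
```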

Download here

xml sitemap for pligg v0.7

This is a quick fix for a bug introduced in a previous version when I tried to make the module's use of date() compatible with PHP 4.

This bug may have made your sitemap invalid because the lastmod date contained the timezone between the date and time.

This version also brings a new feature that could be useful for larger sites.

The cache

I noticed that on a site with over 20000 links it can take a long time to generate the sitemap index and sitemap files, and it puts significant load on the server if Google, Yahoo or Ask try to access the sitemap every few minutes or hours (depending on your site's posting/pinging frequency), so I thought it would be nice to have some kind of cache.

The module saves the generated sitemap index and sitemaps in Pligg's cache directory (which means the directory needs to be writable by the user running the web server), and if the cache has not expired yet, the module serves the sitemaps from the cache instead of regenerating them every time.

You can set the expiry time (TTL), and the module will regenerate a sitemap if TTL seconds have passed since it was last modified.

You don't need to set up a cron job to generate the sitemap files. A sitemap is only generated when someone or something (Google, Yahoo, Ask) tries to access it.

Another change related to the caching system is that the site will only ping services if the cache has expired, so make sure you set your cache's TTL accordingly.
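The caching idea described above can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not the module's PHP code: `serve_cached` and its parameters are hypothetical names, and `generate` stands in for whatever builds the XML from the database.

```python
import os
import time
from typing import Callable


def serve_cached(path: str, ttl: int, generate: Callable[[], str]) -> str:
    """Return the cached sitemap at `path`, rebuilding it if older than `ttl` seconds."""
    try:
        age = time.time() - os.path.getmtime(path)
        if age < ttl:
            with open(path, encoding="utf-8") as fh:
                return fh.read()  # cache hit: no database work at all
    except FileNotFoundError:
        pass  # no cache file yet, fall through and build one

    xml = generate()  # the expensive part; a ping would also belong here
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        fh.write(xml)
    os.replace(tmp, path)  # swap the new file in with a single rename
    return xml
```

Because the file's modification time doubles as the cache timestamp, nothing needs a cron job: the first request after the TTL has passed simply pays the cost of regenerating the file.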

Upgrade:

To upgrade to this version, download the archive and unzip it in your modules directory, then go to Pligg admin -> module management, disable and uninstall the module, and reinstall it so that you can see the new options ("Use Cache" and "Cache TTL").

Download from module's own page

xml sitemap for pligg update v0.6

Update: There is a new version of this module. Click here to get it.

Last year I released a Pligg module that creates XML sitemaps for better SEO. The module was tested and working on Pligg 0.9.7. Since then Pligg has been updated twice and is now at 0.9.9.

In Pligg 0.9.8 they removed the clear_cache function that my module's install code used to clear the configuration cache. Because of this it was impossible to install the module.

The quick fix was to edit xml_sitemaps_install.php and remove the line (#5) that called clear_cache(); after that you could install the module without any problems. This workaround was already known and discussed in the comments.

Someone in the Pligg community released a new version of this module that is basically just my module with that clear_cache function call removed and one other bug fixed, a bug that was also discussed in the comments and was easy to fix. They bumped the version number to 0.5, which in my opinion was not really necessary for such a small fix.

They also added a .htaccess file to the archive, again unnecessary since my initial post already had the details (the .htaccess code) on how to set it up, and the module also shows them in its configuration section.

One other thing I didn't like about their release was that the module could not be downloaded without registering on their forum.

So here I am releasing a new version (0.6) of this module with more bugs fixed, and from now on the module will have a page of its own so that it is easier to track new versions.

What changed:

  • use safe category names
  • generate links according to URLMethod configuration
  • modified use of date() function to work on php4

This module was tested on pligg 0.9.9.

Go to xml sitemap pligg module page for download

XML Sitemaps for Pligg

Update: There is a new version of this module. Click here to get it.

I created a module that generates XML Sitemaps for Pligg (the well-known open source CMS used for creating sites similar to digg.com).

The module dynamically generates a sitemap index and sitemaps containing all the stories in the database, so nothing is stored on disk and you don't have to set up a cron job to generate them.

The sitemaps are updated automatically when a new story is submitted. Because of the structure of the sitemap index and because it contains "lastmod" info, the search engines should only download the latest entries in the index so you shouldn't worry about the module putting too much load on your system.
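For readers who have not seen one, here is roughly what a sitemap index with "lastmod" entries looks like, built by a short Python sketch. This is an illustration only, not the module's PHP code: `sitemap_index` is a hypothetical helper, and the sitemap URLs and dates are made up.

```python
from xml.sax.saxutils import escape


def sitemap_index(entries: list[tuple[str, str]]) -> str:
    """Build a sitemap index from (sitemap_url, lastmod) pairs."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url, lastmod in entries:
        lines.append("  <sitemap>")
        lines.append(f"    <loc>{escape(url)}</loc>")
        # lastmod lets a crawler skip sitemaps that have not changed.
        lines.append(f"    <lastmod>{lastmod}</lastmod>")
        lines.append("  </sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines)


print(sitemap_index([
    ("http://yourdomain.com/sitemap-1.xml", "2009-03-01"),
    ("http://yourdomain.com/sitemap-2.xml", "2009-03-14"),
]))
```

Each `<sitemap>` entry points at one sitemap file holding a batch of stories, so a search engine that already has the older batches only needs to fetch the ones whose lastmod is newer.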

There is also a "ping" function that notifies Google, Yahoo and ask.com every time a new story is submitted so that they know they have to download the sitemap. The ping function required a little patching of the Pligg source code to add some hooks (only if you use 0.9.6; 0.9.7 should already have those hooks). Here is the diff file in case you use Pligg 0.9.6: pligg submit hooks diff

The module was only tested on Pligg 0.9.6; I haven't upgraded to 0.9.7 yet, so if you try it on 0.9.7, let me know how it works. Any feedback is appreciated.

Download:

You can download Xml_Sitemaps module from here: xml_sitemaps-0.1.tar.gz and in case you want to verify it here is the md5 sum and the sha256 sum
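If you have never verified a download before, the following Python sketch shows one way to compute the sums locally. `file_digest` is a hypothetical helper name of mine; the file name in the usage note is the archive mentioned above, and the value to compare against is whatever the published sum says.

```python
import hashlib


def file_digest(path: str, algorithm: str) -> str:
    """Return the hex digest of a file, reading it in chunks to save memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()
```

Calling `file_digest("xml_sitemaps-0.1.tar.gz", "sha256")` (or `"md5"`) gives a string you can compare with the published sum; if they differ, the download is corrupted or has been altered.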

The code is released under the same license as Pligg, so feel free to use, modify and share it.

Installation:

This is pretty straightforward: install it like any other Pligg module. Just put the .tar.gz file in the modules directory, unpack it, then go into the Pligg admin and activate it. If you use Pligg 0.9.6 and you want to be able to ping the search engines, don't forget to apply the submit hooks patch.

Configuration:

After installation you should be able to access the sitemap index like this: http://yourdomain.com/module.php?module=xml_sitemaps_show_sitemap

If you want the sitemap URL to look friendly (by the way, ask.com will only accept a friendly sitemap URL ending in .xml), go into Admin->Configuration->XmlSitemaps and enable "Sitemap Friendly URL". If you do that, you have to put the following lines in your .htaccess somewhere before the line "##### URL Method 2 ("Clean" URLs) Begin #####":

  RewriteRule ^sitemapindex\.xml$ module.php?module=xml_sitemaps_show_sitemap [L]
  RewriteRule ^sitemap-([a-zA-Z0-9]+)\.xml$ module.php?module=xml_sitemaps_show_sitemap&i=$1 [L]
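If rewrite rules look cryptic to you, this small Python sketch mimics what the two rules do: it maps a friendly path to the internal module URL. It is only a model for understanding the mapping (Apache does the real work), `rewrite` is a hypothetical function name, and the patterns escape the dot and anchor the end of the path.

```python
import re
from typing import Optional

# (pattern, target) pairs mirroring the two rewrite rules.
RULES = [
    (re.compile(r"^sitemapindex\.xml$"),
     "module.php?module=xml_sitemaps_show_sitemap"),
    (re.compile(r"^sitemap-([a-zA-Z0-9]+)\.xml$"),
     r"module.php?module=xml_sitemaps_show_sitemap&i=\1"),
]


def rewrite(path: str) -> Optional[str]:
    """Return the internal URL for a friendly sitemap path, or None if no rule matches."""
    for pattern, target in RULES:
        match = pattern.match(path)
        if match:
            # The first matching rule wins, like the [L] flag.
            return match.expand(target)
    return None


print(rewrite("sitemapindex.xml"))  # module.php?module=xml_sitemaps_show_sitemap
print(rewrite("sitemap-3.xml"))     # module.php?module=xml_sitemaps_show_sitemap&i=3
```

The captured part of the file name (here `3`) is passed on as the `i` parameter, which is how the module knows which sitemap from the index is being requested.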

Here is how the index looks on a site with sitemap friendly urls enabled: http://sapa.ro/sitemapindex.xml

There are other configuration options in there: you can set the maximum number of stories to put in a sitemap, and you can choose whether to ping any of the three supported search engines. You can also set your yahoo.com key in there if you want to ping Yahoo.

That's it! Happy Sitemapping! and as always ... let me know how it works in the comments.