XML Sitemaps for Pligg

Update: There is a new version of this module. Click here to get it.

I created a module that generates XML sitemaps for Pligg (the well-known open source CMS used for creating sites similar to digg.com).

The module generates a sitemap index and sitemaps covering all the stories in the database. Everything is generated dynamically, so nothing is stored on disk and you don't have to set up a cron job to generate it.

The sitemaps are updated automatically when a new story is submitted. Because the sitemap index lists each sitemap with its "lastmod" info, search engines should only need to download the entries that changed most recently, so you shouldn't have to worry about the module putting too much load on your system.
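To make the index structure concrete, here is a rough Python sketch of a sitemap index with one "lastmod" per sitemap. It is not the module's code (the module is a Pligg/PHP module whose source isn't shown here). The element names and namespace come from the sitemaps.org protocol, while the function name, base URL and dates are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(base_url, sitemaps):
    """Return a sitemap index as an XML string.

    `sitemaps` is a list of (identifier, lastmod) pairs, where lastmod is a
    W3C datetime string. Both the function and its arguments are hypothetical.
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for identifier, lastmod in sitemaps:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/sitemap-{identifier}.xml"
        # A crawler can compare lastmod with its previous visit and skip
        # sitemaps that have not changed since then.
        ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap_index(
        "http://yourdomain.com",
        [(0, "2010-03-01T09:00:00-04:00"), (1, "2010-03-18T12:36:50-04:00")],
    ))
```

With an index like this, only the last sitemap gets a new "lastmod" when a story is submitted, which is why a crawler has no reason to fetch the older ones again.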

There is also a "ping" function that notifies Google, Yahoo and Ask.com every time a new story is submitted, so they know they have to download the sitemap. The ping function requires a small patch to the Pligg source code to add some hooks (only if you use 0.9.6; 0.9.7 should already have those hooks). Here is the diff file in case you use Pligg 0.9.6: pligg submit hooks diff
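A ping of this kind is normally just an HTTP GET to a search engine URL with the sitemap address passed as a query parameter. The Python sketch below only builds such URLs. The endpoint addresses and the parameter name are placeholders, not the real addresses the module uses, and each engine's current documentation should be checked before relying on any ping service.

```python
from urllib.parse import urlencode

# Placeholder endpoints: these are NOT real search engine addresses.
PING_ENDPOINTS = {
    "engine_a": "http://engine-a.example/ping",
    "engine_b": "http://engine-b.example/ping",
}


def build_ping_url(endpoint, sitemap_url, param="sitemap"):
    """Return the ping URL with the sitemap address URL-encoded."""
    return f"{endpoint}?{urlencode({param: sitemap_url})}"


def ping_urls_for(sitemap_url, enabled):
    """Build one ping URL per enabled engine (hypothetical helper)."""
    return [build_ping_url(PING_ENDPOINTS[name], sitemap_url) for name in enabled]


if __name__ == "__main__":
    for url in ping_urls_for("http://yourdomain.com/sitemapindex.xml", ["engine_a"]):
        print(url)
```

The sitemap address has to be URL-encoded because it contains ":" and "/" characters of its own.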

The module was only tested on Pligg 0.9.6. I haven't upgraded to 0.9.7 yet, so if you try this on 0.9.7 let me know how it works. Any feedback is appreciated.

Download:

You can download the Xml_Sitemaps module from here: xml_sitemaps-0.1.tar.gz. In case you want to verify it, here are the md5 sum and the sha256 sum.
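If you don't have the md5sum or sha256sum tools at hand, a few lines of Python can do the same check. This is just a sketch: the helper names are mine, and the published sum in the usage example is a placeholder you would replace with the value from the links above.

```python
import hashlib


def file_digest(path, algorithm):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)  # e.g. "md5" or "sha256"
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, algorithm, published_hex):
    """Compare the file's digest with a published sum, ignoring case and spaces."""
    return file_digest(path, algorithm) == published_hex.strip().lower()


if __name__ == "__main__":
    # Replace the placeholder with the sha256 sum published next to the download.
    ok = matches_published("xml_sitemaps-0.1.tar.gz", "sha256", "PASTE_PUBLISHED_SUM_HERE")
    print("checksum OK" if ok else "checksum MISMATCH")
```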

The code is released under the same license as Pligg, so feel free to use it, modify it and share it.

Installation:

This is pretty straightforward. Install it like any other Pligg module: put the .tar.gz file in the modules directory, unpack it, then go into the Pligg admin and activate it. If you use Pligg 0.9.6 and you want to be able to ping the search engines, don't forget to apply the submit hooks patch.

Configuration:

After installation you should be able to access the sitemap index like this: http://yourdomain.com/module.php?module=xml_sitemaps_show_sitemap

If you want the sitemap to have a friendly URL (by the way, Ask.com will only accept a friendly sitemap URL ending in .xml), go into Admin->Configuration->XmlSitemaps and enable "Sitemap Friendly URL". If you do that, you have to put the following lines in your .htaccess, somewhere before the line "##### URL Method 2 ("Clean" URLs) Begin #####":

  RewriteRule ^sitemapindex\.xml module.php?module=xml_sitemaps_show_sitemap [L]
  RewriteRule ^sitemap-([a-zA-Z0-9]+)\.xml module.php?module=xml_sitemaps_show_sitemap&i=$1 [L]
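To see what these two rules do, here is a small Python simulation of the mapping. It is only an illustration of the regular expressions, not of how Apache processes .htaccess files. The patterns escape the dot before "xml" so that it matches only a literal "."; otherwise they mirror the rules. The function name is made up.

```python
import re

# (pattern, target) pairs mirroring the two RewriteRule lines.
# In .htaccess context the matched path has no leading slash.
RULES = [
    (re.compile(r"^sitemapindex\.xml"),
     "module.php?module=xml_sitemaps_show_sitemap"),
    (re.compile(r"^sitemap-([a-zA-Z0-9]+)\.xml"),
     r"module.php?module=xml_sitemaps_show_sitemap&i=\1"),
]


def rewrite(path):
    """Return the internal URL for a friendly sitemap path, or None if no rule matches."""
    for pattern, target in RULES:
        match = pattern.match(path)
        if match:
            # The first matching rule wins, like the [L] flag.
            return match.expand(target)
    return None


if __name__ == "__main__":
    for path in ("sitemapindex.xml", "sitemap-0.xml", "story.php"):
        print(path, "->", rewrite(path))
```

The captured group after "sitemap-" is what ends up in the "i" parameter, so sitemap-0.xml asks the module for sitemap number 0.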

Here is how the index looks on a site with sitemap friendly urls enabled: http://sapa.ro/sitemapindex.xml

There are other configuration options in there: you can set the maximum number of stories to put in a sitemap, and you can choose whether to ping each of the three supported search engines. You can also set your Yahoo key there if you want to ping Yahoo.
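The maximum number of stories per sitemap decides how many sitemaps the index lists. Here is a sketch of that arithmetic in Python, assuming the stories are simply split into consecutive pages. The function name and the offset/count layout are my illustration, not the module's actual query.

```python
def split_into_sitemaps(story_count, max_per_sitemap):
    """Return (sitemap_number, offset, count) for each sitemap needed.

    Assumes stories are paged in order, `max_per_sitemap` at a time.
    """
    if max_per_sitemap < 1:
        raise ValueError("max_per_sitemap must be at least 1")
    pages = []
    offset = 0
    number = 0
    while offset < story_count:
        count = min(max_per_sitemap, story_count - offset)
        pages.append((number, offset, count))
        offset += count
        number += 1
    return pages


if __name__ == "__main__":
    # 2500 stories with a limit of 1000 per sitemap need three sitemaps.
    for number, offset, count in split_into_sitemaps(2500, 1000):
        print(f"sitemap-{number}.xml: {count} stories starting at offset {offset}")
```

Only the last page grows when new stories come in, so the earlier sitemaps keep the same content.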

That's it! Happy sitemapping! And as always, let me know how it works in the comments.

98 thoughts on “XML Sitemaps for Pligg”

  1. for this to work you need to have certain rewrite rules set up. The examples for apache are in the module configuration. My guess is you didn't set them or there's something wrong with the way you set them.

  2. Make sure mod_rewrite works first, whether it is set up in .htaccess or in the virtual hosts configuration. If you managed to get pligg friendly urls working then this should work too.
    Look into the module's configuration; it explains clearly what you have to put into .htaccess to have a nice looking sitemap.
    The final sitemap is actually a sitemap index ( this is the one that you'll be sending to the SEs ) and it should be http://www.gevaldigg.com/sitemapindex.xml
    Good luck.

    1. I think the following lines need to be before the category management rules
      RewriteRule ^sitemapindex.xml module.php?module=xml_sitemaps_show_sitemap [L]
      RewriteRule ^sitemap-([a-zA-Z0-9]+).xml module.php?module=xml_sitemaps_show_sitemap&i=$1 [L]

  3. what should i submit to google webmasters:
    http://www.example.com/module.php?module=xml_sitemaps_show_sitemap
    http://www.example.com/sitemap-0.xml
    http://www.example.com/sitemap-pages0.xml
    http://www.example.com/sitemap-users0.xml
    http://www.example.com/sitemap-groups0.xml
    http://www.example.com/sitemap-main.xml
    or should i submit them all?
    I have tried all of them but they fail.
    But if i put them in robots.txt google says they are all valid sitemaps.
    what’s the difference?
    Thanks

      1. URL timeout: HTTP request timeout
        We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit.

        1. this is so obvious that I thought I shouldn’t even approve the comment but here it is anyway: Google can’t access your server, doesn’t have anything to do with the sitemap module

  4. i have used the latest version of the module and it’s what i describe above.
    then i changed the priority to 1.00 for all entries.
    but i still got that error message.
    i’m still confused about what’s wrong with that sitemap
    or must i change the lastmod?
    in the sitemap i see this lastmod:
    2010-03-18T12:36:50-04:00
    i compared it with an xml generator, which puts a Z at the end of lastmod:
    2010-03-18T12:36:50Z
    Do those two formats make any difference?

    1. You’re using the latest version you got from the pligg forum. I know this because I looked at your sitemaps index and it has items that are not generated by my latest version. The problem is that whoever released that version didn’t bother to test it.
      Just use the latest version from this page: http://patchlog.com/xmlsitemapspligg/
      and let me know if you still have the problem

  5. now i used the version you suggested but it is no better. there is still this error message
    Line 10
    Parsing error
    We were unable to read your Sitemap. It may contain an entry we are unable to recognize. Please validate your Sitemap before resubmitting.

    but if i save that sitemap as an xml file, as you can see at /sitemaperror.xml
    google says it’s a valid sitemap.
    i’m going crazy with this
    what’s your suggestion?
    thanks

    1. so you’re saying that if you tell google to use http://www.indeksberita.com/sitemapindex.xml or http://www.indeksberita.com/sitemap-0.xml it says it’s invalid but if you use /sitemaperror.xml it says it’s valid ?
      Or am I just missing the point and this is all about that priority number having too many decimals ? If that’s the case it’s relatively simple to modify it. I think I already showed someone else how to do it in a previous comment, maybe on http://patchlog.com/php/xml-sitemaps-pligg-module-v09/

  6. it’s not about priority.
    i mean sitemaperror.xml is just a copy and paste of the code of sitemap-0.xml and it works.
    in other words there is no difference between the two
    so what stops google from crawling sitemap-0.xml if it’s the same as sitemaperror.xml?
    is there any other factor that explains why google webmasters cannot validate it?
    as far as i know a parsing error is another word for a syntax error, isn’t it?

  7. we have recently upgraded pligg from 1.0.0 to 1.0.3 and since then xmlsitemap is not working. it is enabled in the admin and all the settings are just like before, but when we access the sitemapindex file it gives a blank page

    please help us

      1. Hi Mahi,

        we have solved the problem. we have disabled all our modules, then we cleared all the cache, then we enabled the site map module.

        Now it’s working absolutely fine

        Thank you very much for your quick answer

        gishore

  8. I have this error:
    You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘XmlSitemaps_Links_per_sitemap ) l’ at line 1 in
    pligg 9.9
