Update: There is a new version of this module. Click here to get it.
I created a module that generates XML Sitemaps for Pligg (the well-known open source CMS used for creating sites similar to digg.com).
The module generates a sitemap index and sitemaps with all the stories in the database dynamically, so nothing is stored on disk and you don't have to set up a cron job to generate them.
The sitemaps are updated automatically when a new story is submitted. Because of the structure of the sitemap index and because it contains "lastmod" info, the search engines should only download the latest entries in the index so you shouldn't worry about the module putting too much load on your system.
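To make the index structure concrete, here is a minimal Python sketch of a sitemap index whose entries carry "lastmod". It is not the module's code (the module is a Pligg module); the function name build_sitemap_index, the page size and the example.com URLs are assumptions made up for illustration.

```python
from datetime import datetime, timezone
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(base_url, story_times, per_sitemap=2):
    """Build a sitemap index; each child sitemap gets the newest
    modification time of the stories it would contain.

    story_times: timezone-aware datetimes, oldest first (hypothetical input).
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for start in range(0, len(story_times), per_sitemap):
        chunk = story_times[start:start + per_sitemap]
        entry = ET.SubElement(root, "sitemap")
        # Hypothetical friendly URL scheme: sitemap-0.xml, sitemap-1.xml, ...
        ET.SubElement(entry, "loc").text = f"{base_url}/sitemap-{start // per_sitemap}.xml"
        ET.SubElement(entry, "lastmod").text = max(chunk).isoformat()
    return ET.tostring(root, encoding="unicode")

stories = [
    datetime(2010, 3, 15, 9, 0, 0, tzinfo=timezone.utc),
    datetime(2010, 3, 16, 9, 0, 0, tzinfo=timezone.utc),
    datetime(2010, 3, 17, 14, 40, 4, tzinfo=timezone.utc),
]
index_xml = build_sitemap_index("http://www.example.com", stories)
print(index_xml)
```

With three stories and two per sitemap, only the second sitemap's "lastmod" changes when a new story arrives, which is why a crawler that honours "lastmod" has no reason to re-download the older sitemap.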
There is also a "ping" function that notifies Google, Yahoo and Ask.com every time a new story is submitted, so that they know they have to download the sitemap. The ping function required a small patch to the Pligg source code to add some hooks (only if you use 0.9.6; 0.9.7 should already have those hooks). Here is the diff file in case you use Pligg 0.9.6: pligg submit hooks diff
The module was only tested on Pligg 0.9.6. I haven't upgraded to 0.9.7 yet, so if you try it on 0.9.7 let me know how it works. Any feedback is appreciated.
Download:
You can download the Xml_Sitemaps module from here: xml_sitemaps-0.1.tar.gz. In case you want to verify it, here are the md5 sum and the sha256 sum.
The code is released under the same license as Pligg, so feel free to use, modify and share it.
Installation:
This is pretty straightforward: install it like any other Pligg module. Put the .tar.gz file in the modules directory, un-archive it, then go into the Pligg admin and activate it. If you use Pligg 0.9.6 and you want to be able to ping the search engines, don't forget to apply the submit hooks patch.
Configuration:
After installation you should be able to access the sitemap index at http://yourdomain.com/module.php?module=xml_sitemaps_show_sitemap. If you want the sitemap to have a friendly URL (Ask.com will only accept a friendly sitemap URL ending in .xml), go to Admin->Configuration->XmlSitemaps and enable "Sitemap Friendly URL". If you do that, you also have to put the following lines in your .htaccess, somewhere before the line "##### URL Method 2 ("Clean" URLs) Begin #####":
RewriteRule ^sitemapindex.xml module.php?module=xml_sitemaps_show_sitemap [L]
RewriteRule ^sitemap-([a-zA-Z0-9]+).xml module.php?module=xml_sitemaps_show_sitemap&i=$1 [L]
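If you are unsure what those two rules do, the following Python sketch approximates them. It is only an illustration: Python's re module is not Apache's regex engine, the rewrite helper is a made-up name, and it assumes the request path is matched without its leading slash, as is usual for rules placed in .htaccess.

```python
import re

# The two patterns from the rewrite rules, each paired with its target.
# "\1" plays the role of "$1": the first captured group.
RULES = [
    (r"^sitemapindex.xml", r"module.php?module=xml_sitemaps_show_sitemap"),
    (r"^sitemap-([a-zA-Z0-9]+).xml", r"module.php?module=xml_sitemaps_show_sitemap&i=\1"),
]

def rewrite(path):
    """Return the rewritten target for a request path, or None if no rule matches."""
    for pattern, target in RULES:
        match = re.search(pattern, path)
        if match:
            # The first matching rule wins, like the [L] flag.
            return match.expand(target)
    return None

print(rewrite("sitemapindex.xml"))
print(rewrite("sitemap-0.xml"))
print(rewrite("story.php"))
```

The index URL maps to the module without an "i" parameter, while sitemap-0.xml maps to the same module with i=0. If every sitemap URL ends up showing the index, the second rule is most likely not being applied.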
Here is how the index looks on a site with sitemap friendly urls enabled: http://sapa.ro/sitemapindex.xml
There are other configuration options in there: you can set the maximum number of stories to put in a sitemap, and you can choose whether to ping any of the three supported search engines. You can also set your yahoo.com key in there if you want to ping Yahoo.
That's it! Happy Sitemapping! and as always ... let me know how it works in the comments.
I get a 404 Not Found for my XML sitemap at http://www.webforinfo.com/sitemapindex.xml. I have followed all the instructions and even removed Pligg and installed it again.
The sitemap only shows here: http://www.webforinfo.com/module.php?module=xml…
I have even changed my site's URL method to value 2.
Google also accepts this sitemap URL.
I don't know whether it will harm my site's SEO or not.
For this to work you need to have certain rewrite rules set up. The examples for Apache are in the module configuration. My guess is you didn't set them or there's something wrong with the way you set them.
Hi Mihai, Thanks for what I expect will be a great module. Unfortunately, I'm getting the exact same results as adeel. You can see mine here http://www.gevaldigg.com/module.php?module=xml_… .
I'm not a programmer but I can usually understand how to make minor tweaks to my site like this. What is the final sitemap supposed to look like? domain.com/sitemap.xml ???
Thanks in advance for your help
Make sure mod_rewrite works first, whether in .htaccess or in the virtual host configuration. If you managed to get Pligg friendly URLs to work, then this should work too.
Look into the module's configuration; it explains clearly what you have to put into .htaccess to have a nice-looking sitemap.
The final sitemap is actually a sitemap index (this is the one that you'll be sending to the SEs) and it should be http://www.gevaldigg.com/sitemapindex.xml
Good luck.
Is my sitemap .xml working or not?
http://www.dayadd.net/sitemapindex.xml
thank…
It does not work.
The sitemap pages point to the same sitemap index instead of showing the links.
You probably messed up the .htaccess rules
please edit my .htaccess
thank.
I think the following lines need to be before the category management rules:
RewriteRule ^sitemapindex.xml module.php?module=xml_sitemaps_show_sitemap [L]
RewriteRule ^sitemap-([a-zA-Z0-9]+).xml module.php?module=xml_sitemaps_show_sitemap&i=$1 [L]
http://www.forex…..biz/sitemapindex.xml
How can I know whether I used the right method? Is it right?
There’s nothing wrong with it.
Last time I tried, it was working. It’s not such a big deal to try it 🙂
I tried this sitemap but Google doesn’t recognize the links.
For some reason your links don’t have the protocol part in them ( the http:// part at the beginning )
Also, I think you’re not using my latest version but some version modified by someone else. Get my latest version from here and you should be fine:
http://patchlog.com/xmlsitemapspligg/
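For anyone who wants to check their own sitemap for the missing http:// part mentioned above, here is a small Python sketch. The helper name is_absolute_http_url and the example.com URLs are made up for illustration; it only tests that a link has a protocol and a host, nothing more.

```python
from urllib.parse import urlparse

def is_absolute_http_url(loc):
    """True if loc has an http(s) scheme and a host, as sitemap links should."""
    parts = urlparse(loc)
    return parts.scheme in ("http", "https") and bool(parts.netloc)

# Hypothetical entries: the first one lacks the protocol part.
for loc in ["www.example.com/story/1", "http://www.example.com/story/1"]:
    print(loc, is_absolute_http_url(loc))
```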
Which of these should I submit to Google Webmasters:
http://www.example.com/module.php?module=xml_sitemaps_show_sitemap
http://www.example.com/sitemap-0.xml
http://www.example.com/sitemap-pages0.xml
http://www.example.com/sitemap-users0.xml
http://www.example.com/sitemap-groups0.xml
http://www.example.com/sitemap-main.xml
Or should I submit them all?
I have tried all of them but they all fail.
But if I put them in robots.txt, Google says they are all valid sitemaps.
What’s the difference?
Thanks
You should put only one url:
http://www.example.com/module.php?module=xml_sitemaps_show_sitemap
Or, if you have set up the correct rewrite rules in .htaccess and enabled friendly URLs in the module’s configuration, then you can use:
http://www.example.com/sitemapindex.xml
URL timeout: HTTP request timeout
We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit.
This is so obvious that I thought I shouldn’t even approve the comment, but here it is anyway: Google can’t access your server. It doesn’t have anything to do with the sitemap module.
They are accessing it but not indexing.
I still get this error message from Google Webmasters:
“Parsing error
We were unable to read your Sitemap. It may contain an entry we are unable to recognize. Please validate your Sitemap before resubmitting.”
And I see this example entry in the sitemap:
<url>
<loc>http://www.indeksberita.com/Misteri/fenomena-api-yang-tetap-menyala-di-dasar-laut/</loc>
<lastmod>2010-03-17T14:40:04-04:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.000922922091993</priority>
</url>
Is the priority the cause of the error?
The priority should be 0.0009 if you’re using my latest version of the module. http://patchlog.com/php/xml-sitemaps-pligg-module-v09/
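As a rough illustration of where 0.0009 comes from: the sitemap protocol only requires priority to be between 0.0 and 1.0, so a long value can be clamped and rounded before it is printed. This Python sketch is not the module's code; format_priority and the choice of four decimals are assumptions based on the 0.0009 value above.

```python
def format_priority(value, decimals=4):
    """Clamp a priority to the 0.0-1.0 range and print it with a fixed number of decimals."""
    clamped = min(max(value, 0.0), 1.0)
    return f"{clamped:.{decimals}f}"

print(format_priority(0.000922922091993))
print(format_priority(1.7))
```

The first call prints 0.0009 and the second prints 1.0000. Whether the long value alone triggers a parsing error is not established here.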
I have used the latest version of the module and I get what I described above.
Then I changed the priority to 1.00 for all entries,
but I still got that error message.
I am still confused about what’s wrong with that sitemap.
Or must I change the lastmod?
In the sitemap, lastmod looks like this:
2010-03-18T12:36:50-04:00
I compared it with an XML generator, which puts a Z at the end of lastmod instead:
2010-03-18T12:36:50Z
Does the difference between those two matter?
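On the two lastmod formats: the sitemap protocol uses W3C Datetime for lastmod, which allows either a numeric offset such as -04:00 or a Z suffix meaning UTC, so both forms are valid. The Python sketch below parses both to show they differ only in time zone. The helper name parse_lastmod is made up for illustration.

```python
from datetime import datetime

def parse_lastmod(value):
    """Parse a W3C Datetime string with either a numeric offset or a trailing Z."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)

with_offset = parse_lastmod("2010-03-18T12:36:50-04:00")
with_z = parse_lastmod("2010-03-18T12:36:50Z")
print(with_offset.utcoffset(), with_z.utcoffset())
```

Both values parse cleanly; the first is four hours behind UTC and the second is UTC.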
You’re using the latest version you got from the pligg forum. I know this because I looked at your sitemaps index and it has items that are not generated by my latest version. The problem is that whoever released that version didn’t bother to test it.
Just use the latest version from this page: http://patchlog.com/xmlsitemapspligg/
and let me know if you still have the problem
Now I am using the version you suggested but nothing seems better. I still get this error message:
Line 10
Parsing error
We were unable to read your Sitemap. It may contain an entry we are unable to recognize. Please validate your Sitemap before resubmitting.
But if I save that sitemap as a static XML file, as you can see at /sitemaperror.xml,
Google says it’s a valid sitemap.
I’m going crazy with this.
What’s your suggestion?
Thanks
So you’re saying that if you tell Google to use http://www.indeksberita.com/sitemapindex.xml or http://www.indeksberita.com/sitemap-0.xml it says it’s invalid, but if you use /sitemaperror.xml it says it’s valid?
Or am I just missing the point and this is all about that priority number having too many decimals? If that’s the case, it’s relatively simple to modify. I think I already showed someone else how to do it in a previous comment, maybe on http://patchlog.com/php/xml-sitemaps-pligg-module-v09/
It’s not about the priority.
I mean sitemaperror.xml is just a copy and paste of the code of sitemap-0.xml, and it works.
In other words, there is no difference between the two.
So what stops Google from crawling sitemap-0.xml if it’s the same as sitemaperror.xml?
Is there any other factor that could explain why Google Webmasters cannot validate it?
As far as I know, a parsing error is another word for a syntax error, isn’t it?
Thanks for your help.
I have already added your email to Webmaster Tools.
You can use it as you wish.
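On the question of whether a parsing error is a syntax error: broadly yes, it means the XML could not be read as well-formed XML. One way to check that locally is sketched below in Python. The helper name check_well_formed and the sample documents are made up for illustration, and the check covers well-formedness only, not everything Google validates.

```python
from xml.etree import ElementTree as ET

def check_well_formed(xml_text):
    """Return None if xml_text parses, else the (line, column) of the first error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return err.position
    return None

good = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/</loc></url>"
    "</urlset>"
)
# Same document with the closing </loc> tag removed, so the tags no longer nest.
bad = good.replace("</loc>", "")

print(check_well_formed(good))
print(check_well_formed(bad))
```

To compare the dynamic sitemap with a static copy, run the check on the exact text the server returns for each URL rather than on a copy pasted from the browser.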
We recently upgraded Pligg from 1.0.0 to 1.0.3 and since then xmlsitemap is not working. It is enabled in the admin and all the settings are just like before, but when we access the sitemapindex file it gives a blank page.
Please help us.
Enable error logging or error reporting and then come back with the errors 🙂
Hi Mihai,
We have solved the problem. We disabled all our modules, then cleared all the cache, then enabled the sitemap module.
Now it’s working absolutely fine.
Thank you very much for your quick answer
gishore
I have this error:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘XmlSitemaps_Links_per_sitemap ) l’ at line 1 in
Pligg 9.9
It seems this comes from an error in the module’s configuration.