Comment relish optimization
Comment Relish is a wordpress plugin that sends an email the first time someone comments on your blog. The plugin allows you to define the message in wp-admin and you can embed tags like author's name, email, website and others in the message.
This can be used to send a welcome message to the first time commentators and maybe invite them to subscribe to your rss feed.
John Chow, tried to use it on his blog but the plugin and his huge number of comments ( 43000 ? ) put the site down for about 2 hours. He said the plugin misbehaved and he would not recommend it if you have a lot of comments.
I did a little simulation on a wordpres blog with about 25000 comments and I can see how the plugin misbehaves . It took more then tho minutes to select the email addresses that should be emailed before I decided to stop it, because that's just unacceptable, but I think it would have taken a lot more.
I managed to optimize it easily by just modifying a field in the table used by this plugin. If you don't care how I did it and just want the optimized plugin skip to the end of the post. If not then read on...
How it works
The plugin uses a table ( cr_emailed ) to remember which addresses received an email, so that it does not send a message more then once. Basically the plugin just inserts a new record in this table every time someone new comments on the blog. It tries to see if someone is new by doing a LEFT JOIN select on two tables: comments and cr_emailed, and then filters the results that have cr_emailed.email = NULL. The problem with this is that the email column has no index so if you have 43000 comments the query will have a huge result set and mysql will have to process each result to find out if cr_emailed.email is null ( the email has not received a message yet ) .
This was really simple, just convert the email field to a varchar ( needed in order to be able to set a fixed length key on this field ) and then add an index on it.
After doing this the query that took more then two minutes, now takes a little less then 1 second.
Some one on John Chow's blog suggested to add an index to wp_comments.comment_author_email. I tried that but I noticed no improvements, and using explain on the sql query shows the index on comment_author_email is not used, so that's useless.
Looking more over the code I noticed another weird thing. The query executed to find the email addresses looks like this:
- SELECT c.*, p.*
- FROM wp_comments c
- INNER JOIN wp_posts p ON p.ID = c.comment_post_ID
- LEFT JOIN wp_cr_emailed e ON e.email = c.comment_author_email
- WHERE e.email IS NULL AND c.comment_approved = '1'
I wonder why the author used the posts table in this query? The results from the posts table are not used anywhere in the code. I really couldn't see a good reason for keeping the posts table in this query so I removed the posts table from this query, making it even faster.
Get the source code
Here is a diff file between my version and the original version and here is a my new version . If you have this plugin already installed you have to uninstall it and remove the wp_cr_emailed table and then copy the new version and activate it in wp-admin.
I'm using the new version on this blog and is working but I have few comments here. I have only tested the modifications on this blog and on a fictional blog with 25000 automatically generated comments so the usual disclaimer applies... you know all that "WARNING NO WARRANTIES" stuff...
If you have a blog with many comments and you want to try this, let me know how it works in a real environment.