Monthly Archives: July 2008

debian: building custom exim packages

This is a small howto that explains how to build custom exim4 packages on debian.

It was tested with both exim 4.63 (on Debian etch) and exim 4.69 (on Debian testing/lenny).

I needed a custom exim email server built with domainkeys and/or dkim support for signing outgoing messages.

So here are the 13 steps I took to get this done:

  1. Create a directory named exim where all activity will take place.
  2. Make sure you have the 'source' URIs in your sources.list file.
    If you don't have them, add them and then run apt-get update.
  3. Install packages required for creating a custom package and building it:
    1. apt-get install dpatch fakeroot devscripts \
    2. grep-dctrl debhelper gcc libc6-dev libssl-dev pbuilder
  4. Install exim4 source package:
    1. cd exim
    2. apt-get source exim4
  5. Unpack the standard configuration files:
    1. cd exim4-4.63
    2. fakeroot debian/rules unpack-configs
  6. Define the new package name. In this step we just put the new package name in a variable and export it in the environment to make the next steps easier. You can use anything for the package name (actually it's just a package name suffix), but I recommend using 'custom', for one main reason: dependencies. Packages that depend on exim4-daemon-light or exim4-daemon-heavy (like sa-exim, mailx and maybe others) already accept exim4-daemon-custom as a replacement, so with this custom package you're not breaking any dependencies.
    Ex:
    1. export my_pkg_name=custom
  7. Edit configuration files. There should be 3 EDITME configuration files for exim and one for eximon, one for each package that will be built. Copy one of the exim EDITME files to EDITME.exim4-$my_pkg_name, then edit the new file to set the options you want.
    Ex:
    1. cp EDITME.exim4-heavy EDITME.exim4-$my_pkg_name
  8. Pack the configuration files so your new configuration is saved and used at build time:
    1. fakeroot debian/rules pack-configs
  9. Create the custom package. This is required only if you use a package name other than 'custom':
    1. sh debian/create-custom-package $my_pkg_name
  10. Activate the new package in debian/rules. Edit debian/rules, look for the line where the extradaemonpackages variable is defined, and add your package name (exim4-daemon-$my_pkg_name) to the list of packages defined there.
  11. Install build dependencies. You can skip this step if this is not the first time you're building this package.
    1. /usr/lib/pbuilder/pbuilder-satisfydepends
  12. Build the packages:
    1. debuild -us -uc
  13. Install the new package. If you already had some version of the exim4-daemon package installed you will have to remove it first, and then you can install the custom package. The new package will be in the base directory created at step 1.
    Ex. (for amd64 etch, exim 4.63-17):
    1. cd ..
    2. dpkg -i exim4-daemon-${my_pkg_name}_4.63-17_amd64.deb
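
The naming in this last step follows debian's standard package_version_arch pattern, so you can derive the final .deb filename from the suffix chosen at step 6. A minimal sketch (the version and architecture are just the examples from this post):

```shell
# Derive the final package filename from the suffix chosen in step 6.
my_pkg_name=custom
version=4.63-17   # exim4 source version on etch (example)
arch=amd64        # build architecture (example)
deb="exim4-daemon-${my_pkg_name}_${version}_${arch}.deb"
echo "$deb"       # exim4-daemon-custom_4.63-17_amd64.deb
```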

This process went pretty well for both exim 4.63 on etch and 4.69 on lenny. Exim 4.63 only had experimental support for domainkeys (not dkim), and exim 4.69 on lenny had support for both, but I was only able to build the latter after applying a small patch to exim to make it work with the latest version of libdkim (1.0.19).

This post was intended to be a general howto about building a custom exim package. I will write more details about actually building exim with domainkeys and/or dkim in a future post.

Scour: The social search engine

I just started using Scour, a search engine that lets you vote and comment on the results.

Scour queries the 3 major search engines, Google, Yahoo and Live, to provide results, so it's like using your preferred search engine with a social twist. You can vote each result up or down and comment on it, and Scour then uses this data (votes, comments) to provide better relevancy.

The problem is that when people search they want results quickly, and once they get them they just leave. So, to encourage users to contribute, Scour rewards them with points that can be converted into money via Visa gift cards.

The idea is that since the major search engines are making billions from search, the user should get something (more than just search results) out of it too.

Once you've signed up to Scour you can start using it for your daily searches just like you did with Google, Yahoo or MSN. They even have a search bar plugin for Internet Explorer and Firefox, and in the FAQs you can find instructions on how to make Firefox use it as the default search engine instead of Google. There is also a toolbar, but apparently it's only for Internet Explorer, or only for Windows (.exe).

As you keep searching, voting and commenting you accumulate points. For each search you get 1 point, for each vote 2 points and for each comment 3 points (up to a maximum of 4 points per search), and once you reach 6500 points you get a $25 Visa gift card.
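
A bit of arithmetic on those numbers (a sketch; it assumes you hit the 4-point cap on every single search):

```shell
# Minimum number of searches for one $25 gift card,
# assuming the 4-points-per-search cap is reached every time.
card_points=6500
max_points_per_search=4
min_searches=$(( card_points / max_points_per_search ))
echo "$min_searches"   # 1625
```

So even in the best case you'd need at least 1625 searches, with votes and comments on each, for a single card.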

I like Scour both for the ideas of higher relevancy through votes and comments and also for rewarding users.

Scour is still in its infancy and there are some small problems with it (try searching for 'var/log', for example, or the fact that it only displays 3 pages of results), but I'm sure they will be fixed and the search engine will improve over time.

Of course, the whole idea of better relevancy will only work if more users sign up, use it regularly and contribute.

Where’s the xml sitemap?

Someone contacted me through the contact form to ask where the xml sitemap generated by the xml sitemap module for pligg is.

If he had read my first post about this module I think he would have eventually figured out where it is, but since that first post was written a long time ago, let me answer the question in this post.

I will do this in a post instead of answering privately because others might run into the same problem, and I hate answering the same question over and over.

The module doesn't generate a sitemap but a sitemap index (basically just a list of sitemaps in xml), and unless you're using the cache, the module will generate it every time someone requests the sitemap's URL.

If you're not using friendly urls for sitemaps, the url of the sitemap will be:

http://yourpliggsite.com/module.php?module=xml_sitemaps_show_sitemap

If you want to use friendly urls for the sitemap you will have to configure it as described here.

Last time I checked (when I first created the module), ask.com could not be pinged unless your sitemap url looked like a static url and/or ended in .xml, and this is why I created the module with this choice in mind. If you don't care about pinging ask.com, or if ask.com has changed its policy (can anyone check this?), then you don't need friendly urls for sitemaps.

For the future, I would appreciate it if such questions were asked in the comments instead of through private contact. I prefer the comments for answering questions about my posts or the code in them, because that way others can benefit from my answers or contribute their own.

The contact form is for private matters like consultancy requests, business proposals or other things that don't fit into the comments.

MySQL: counting results

You have a query and you want to display the results on a web page, but because there are so many results you want to paginate the data, so the user gets links like "prev page, page 1, page 2, next page, last page" that you see on a lot of sites these days. This is a common problem web developers face; it's not hard to solve, but it is often not solved in the best way.

The pagination concept is based on the fact that you can retrieve just part of the results using a LIMIT clause in the query and display them on a page. This usually makes the query faster and allows the user to navigate easily without crashing their browser or having to scroll long pages.

If you want to show the user the total number of results or you want to allow them to skip right to the last page then you need to count the total number of results that the query would return without the LIMIT clause.
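
Once you have that total, the number of pages and the LIMIT offset for any given page follow directly. A quick sketch (the 95 rows and 10 results per page are made-up numbers):

```shell
# Page count and LIMIT offset from a total row count.
total_rows=95   # hypothetical total returned by the count
per_page=10
page=3          # pages are numbered from 1
pages=$(( (total_rows + per_page - 1) / per_page ))   # round up
offset=$(( (page - 1) * per_page ))
echo "$pages $offset"   # 10 20 -> page 3 is "LIMIT 20,10"
```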

How do some people do it?

I have seen some badly designed software that simply removes the LIMIT from the query, runs it, and then calls mysql_num_rows() to count the rows. That may be OK if your table has just a few rows and the query returns quickly, but if your table grows to a few thousand rows, or if your query joins several big tables, you're going to get into trouble.

So how can this be done better?

There is no single way that is best in every case, but here is what you can do:

  1. if your query is simple enough not to use a "group by" or "having" clause, you can simply replace all the fields in your query with "count(*)"; this will be really fast, especially if you have the right indexes on the table(s) in the query
  2. if your query does use "group by" then modify the query to use SQL_CALC_FOUND_ROWS.

Here is an example of the second option, which may be more general as it works with any query; I think it's preferable even if it may be slower than count(*).

We have this query:

  1. SELECT age,count(*) FROM users WHERE age>18 GROUP BY age LIMIT 0,10

You would use that to display a list of ages and how many users have each age; you want the list to have 10 results per page, and your table is really big, so it's very likely you will have more than one page to display.

As you can see, this query already has a "count" and a "group by" in it, so you can't use count(*) to get the total number of results.

If we modify this query like this:

  1. SELECT SQL_CALC_FOUND_ROWS age,count(*) FROM users WHERE age>18 GROUP BY age LIMIT 0,10

the query will return the exact same results as the previous one, but now if we do this:

  1. SELECT FOUND_ROWS()

we will get the total number of rows that the last query would have returned without the LIMIT clause.

This is a lot faster than running the query without the LIMIT and counting the results with mysql_num_rows(), because MySQL will do the counting internally and will not have to return the whole result set to the client. Just remember that FOUND_ROWS() has to be called on the same connection, right after the query that used SQL_CALC_FOUND_ROWS.

Other ideas to improve performance

Fetch details for a record in separate queries. Let's say you have a query that joins several tables and you want to display details from all those tables in a single row in your list. The joins make your query slow because it has to examine a lot of rows when doing the count. Try to remove as many of those joins as you can, do the count, and then for each row in your list run separate queries to get the other details. This way you will examine just a few rows from the other tables, because you'll do the extra queries only for the results you are currently showing on a page.

Enable the MySQL slow query log, then watch it to see how long your queries take and how many rows they examine.

Use EXPLAIN to see if your query is using the right indexes, and create indexes where you think they will improve performance. If EXPLAIN shows that the query will use a temporary table, make sure the temporary table can hold all its data in memory, if you have enough
(check the tmp_table_size and max_heap_table_size variables).

Enable the query cache so the server will just serve results from the cache instead of doing all the work over and over for data that hasn't changed.

There are a few other techniques on the official MySQL documentation site, but the ones presented here have helped me a lot when working with lists and counting results.

If you have other tips I'll be happy to see them in the comments.