Saturday, September 15, 2007

Secondary MXs (and Exim specifically)

For some time, I've been considering the need to run an off-site secondary MX for the University. From a disaster planning perspective, delivering mail off-site when the proverbial hits the fan is a good idea.

Unfortunately, the problem with secondary MXs on the modern Internet is spam. Unless you can get a complete list of local parts to the secondary MX, spammers are going to use it as a back door to inject spam into your mail systems. In our case, generating a complete list of local parts is very complicated, so we have never set up a proper secondary. Instead we've multi-homed our primary MX on three different networks. This solves network outages, but still leaves other problems.

When I set up my virtual machine last year, my VSP had an interesting solution to this problem. They offered secondary MXs to their customers, but they worked in a somewhat unusual way. The basic idea is that when a client connects to the secondary MX, the secondary checks whether the best preference MX is available. If it is, the secondary temporarily defers the mail with a message that says "please use the primary MX".
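To make the idea concrete, here's a rough Python sketch of the decision the secondary makes for each recipient. It is not the VSP's code. The function name `rcpt_response`, the `primary_is_up` callback, and the exact reply codes (451 for the temporary deferral, 250 for acceptance) are assumptions made for illustration.

```python
from typing import Callable, Tuple


def rcpt_response(domain: str,
                  primary_is_up: Callable[[str], bool]) -> Tuple[int, str]:
    """Decide how a secondary MX answers RCPT TO for a domain it backs up."""
    if primary_is_up(domain):
        # A 4xx reply is a temporary failure: the sender keeps the message
        # and retries, ideally against the better preference MX.
        return 451, f"A better preference MX for {domain} is available, please use it."
    # Primary unreachable: accept the mail and hold it for later delivery.
    return 250, "Accepted"


# Hypothetical usage with a stubbed availability check.
print(rcpt_response("example.org", lambda d: True)[0])   # 451
print(rcpt_response("example.org", lambda d: False)[0])  # 250
```

The important property is that the refusal is temporary, so no mail is lost if the availability check gets it wrong.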

This works quite well. So far as I can tell it's in line with the RFC, which says "… it MUST sort the MX records to determine candidates for delivery. … If records do remain, they SHOULD be tried, best preference first, …". So if a better preference MX exists, a client should prefer to use it.
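As an illustration of the quoted rule, here's a minimal Python sketch of how a host might pick its delivery candidates: sort by preference, and if the host finds itself in the list, discard records at its own preference level and worse. The function name `candidate_mxs` and the hostnames are hypothetical, and `my_names` is assumed to hold lowercase names.

```python
from typing import List, Set, Tuple

MX = Tuple[int, str]  # (preference, hostname); a lower preference is better


def candidate_mxs(records: List[MX], my_names: Set[str]) -> List[str]:
    """Return the hosts to try, best preference first.

    If this host appears in the list, records at its own preference
    level and worse are discarded.
    """
    ordered = sorted(records)  # tuples sort by preference first
    mine = [pref for pref, host in ordered if host.lower() in my_names]
    if mine:
        cutoff = min(mine)
        ordered = [(pref, host) for pref, host in ordered if pref < cutoff]
    return [host for _, host in ordered]


records = [(20, "mx2.example.net"), (10, "mx1.example.org")]
# An ordinary sending client is not in the list, so it tries mx1 first.
print(candidate_mxs(records, set()))
# The secondary itself only has the primary left to relay to.
print(candidate_mxs(records, {"mx2.example.net"}))
```

A well-behaved client therefore only reaches the secondary after the primary has failed for it, which is why deferring at the secondary costs legitimate senders very little.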

I decided that this would solve my secondary MX problem. So I needed to figure out how to implement it in Exim. This turned out to be reasonably simple, but there are some catches.

The problem is how to determine whether the best preference MX is available. I initially tried using Exim's readsocket expansion to connect to port 25. This turned out to be quite expensive (a TCP connect for each RCPT TO:), so I then tried using the run expansion to call ping(1). This was slow when the mail server was down: we had to wait for ping to time out before accepting mail, and we had to do this on every RCPT TO:.
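For comparison, here's what a port 25 probe with a bounded wait might look like in Python. This is only a sketch of the idea, not what Exim does internally. The function name `smtp_port_open` and the 2 second default timeout are assumptions.

```python
import socket


def smtp_port_open(host: str, port: int = 25, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and connects, giving up
        # after `timeout` seconds instead of the OS default.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Even with a short timeout, running a probe like this for every RCPT TO: still costs one connection (or one full timeout) per recipient, which is the expense described above.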

My eventual solution was to use Exim's embedded Perl functionality to call a subroutine that checked whether or not the MX was available, and cache the result for a pre-determined time. This meant that I only did the test for the first RCPT TO:, and used the cached results for the remainder.
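The caching idea, sketched in Python rather than the Perl I actually used: wrap the expensive check so its result is reused until it is older than some time-to-live. The class name `CachedCheck`, the 60 second default, and the injectable `clock` are illustrative assumptions, not details of my script.

```python
import time
from typing import Callable, Dict, Tuple


class CachedCheck:
    """Cache the boolean result of an expensive check, per host, for ttl seconds."""

    def __init__(self, check: Callable[[str], bool], ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._check = check
        self._ttl = ttl
        self._clock = clock
        self._cache: Dict[str, Tuple[float, bool]] = {}

    def __call__(self, host: str) -> bool:
        now = self._clock()
        hit = self._cache.get(host)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]           # still fresh: reuse the cached answer
        result = self._check(host)  # missing or stale: run the real test
        self._cache[host] = (now, result)
        return result


# Hypothetical usage: count how often the underlying check really runs.
calls = []
cached = CachedCheck(lambda host: calls.append(host) or True, ttl=60.0)
cached("mx1.example.org")
cached("mx1.example.org")
print(len(calls))  # 1
```

A cache like this lives in one process only, so on its own it helps within a single SMTP connection and no further.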

There turned out to be a catch to this too. Exim spawns a new process for each incoming connection, so to share the cache between processes, I needed to use shared memory. Fortunately this is fairly simple in Perl.
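Here's a rough Python analogue of sharing the cached result between processes, using `multiprocessing.shared_memory` (Python 3.8+) in place of the Perl module I used. The names `open_segment`, `store`, and `load`, and the one-record layout, are assumptions for illustration. There is deliberately no locking, so two processes may race to create the segment or occasionally both run the check.

```python
import struct
from multiprocessing import shared_memory
from typing import Optional

# One record: wall-clock timestamp (double) followed by an "alive" flag.
RECORD = struct.Struct("<d?")


def open_segment(name: str) -> shared_memory.SharedMemory:
    """Attach to the named segment, creating and initialising it if absent."""
    try:
        return shared_memory.SharedMemory(name=name)
    except FileNotFoundError:
        shm = shared_memory.SharedMemory(name=name, create=True, size=RECORD.size)
        RECORD.pack_into(shm.buf, 0, 0.0, False)  # timestamp 0 means "never checked"
        return shm


def store(shm: shared_memory.SharedMemory, alive: bool, now: float) -> None:
    """Record the latest check result together with the time it was made."""
    RECORD.pack_into(shm.buf, 0, now, alive)


def load(shm: shared_memory.SharedMemory, ttl: float, now: float) -> Optional[bool]:
    """Return the cached flag, or None if the entry is older than ttl."""
    stamp, alive = RECORD.unpack_from(shm.buf, 0)
    if now - stamp >= ttl:
        return None
    return alive
```

Each process would call `open_segment` with the same name, try `load` with `time.time()` as `now`, and only run the real availability test (followed by `store`) when `load` returns `None`.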

The result was an Exim config that looked something like this:

SECONDARYMX = /usr/local/etc/exim/list.secondarymx
acl_smtp_rcpt = acl_check_rcpt
perl_startup = do '/usr/ru/bin/exim_test_mx.pl'

begin acl

acl_check_rcpt:
  accept  hosts         = +relay_from_hosts
  # primary is reachable: defer and point the sender at it
  defer   message       = A better preference MX for $domain is available, please use it.
          domains       = lsearch;SECONDARYMX
          set acl_m_mx  = ${lookup {$domain} lsearch {SECONDARYMX}}
          set acl_m_alv = ${perl {test_mx} {$acl_m_mx}}
          condition     = $acl_m_alv
          log_message   = ping to $acl_m_mx returns $acl_m_alv
  # primary is down: accept and queue mail for the domains we back up
  accept  domains       = lsearch;SECONDARYMX
          verify        = recipient
          control       = queue_only
  # don't run an open relay
  deny    message       = Relay not permitted.

begin routers

# the router name here is illustrative
secondary_mx:
  domains = @mx_secondary
  driver = manualroute
  transport = smtp
  route_data = ${lookup {$domain} lsearch {SECONDARYMX}}

(Of course this isn't a complete config; it's only the relevant bits.) The exim_test_mx.pl script is available too. It uses Net::Ping and IPC::Shareable, both available from CPAN. YMMV and all that.

posted by guy at 13:49 SAST

© 2002-2005, webmaster@mombe.org