Migrating Qmail: smtproutes not working?

While migrating a Qmail installation to a new server, I ran into a peculiar problem. In order to avoid spreading out mail over two different servers while the DNS-change is propagating to all the dns-servers, you can use the smtproutes file to instruct Qmail to deliver mail to a different server. Basically you tell it to accept mail for the domain you are moving, but instantly deliver it remotely to the ip-address or domain of the new server.

The procedure is actually quite simple. Mirror all the accounts on the new server, so that all the mail gets accepted there. Then, on the old server, remove the domain from the /var/qmail/control/virtualdomains file (and locals file if it contains the domain as well). Finally, if it doesn't already exists, create /var/qmail/control/smtproutes and add in there. is the ip-address of the new mail server (you could also enter the domain name there), and is the domain you are migrating.

By using this procedure, you ensure that no new mail gets delivered on the old server. Mail servers that haven't seen the updated DNS entry yet will deliver to the old Qmail server, which will forward the message immediately to the new one.

This is where I ran into a problem. Apparently when I was testing out this configuration, the mail wouldn't end up on the new server. I would receive a bounce, containing the "554 too many hops, this message is looping" error. I could see in the logs that indeed, the message was looping. It was tried around twenty times in fast succession, never leaving the old mail server. I looked with tcpdump on the old and new server, to see which one was doing something wrong, but the new server didn't even get contacted at all.

After researching a bit several people encountered the same 'too many hops' issue in combination with smtproutes. The offered solutions were always the same:

  1. Make sure the domain is still in /var/qmail/control/rcpthosts, otherwise Qmail won't accept the mail at all (it won't relay for unknown domains)
  2. Remove the domain from /var/qmail/control/virtualdomains and locals
  3. Add the domain to the /var/qmail/control/smtproutes file in the format domain:newmailserver

Everything on that list was as it was supposed to be, and yet mail wasn't being forwarded to the new server. The old server didn't even open a connection to the new one. Even worse, all incoming mail was being bounced.

It took me a while, but I finally understood what the problem was. On the old server, I have a /29 block of ip-addresses available to put servers on, and all of them were assigned to the old server. The new mailserver, on a physically different machine, was configured for one of those ip-addresses! It didn't really matter that the old server still had that ip-address active as one of its own, as the router knew about the change. The mail server however thought that the ip-address I put in smtproutes was actually on the same old machine, because the network interface for that ip-address was still up.

As soon as I ifconfig down'ed the network interface for the obsolete ip-address on the old server, the mail forwarding worked just fine. So even if the three points you need to check above when smtproutes isn't working don't do the trick, make sure you're not forwarding to an ip that's assigned to a network interface on the old machine!