High availability and load balancing
The Halon MTA is agnostic about the strategy and infrastructure used for high availability and load balancing, but it supports a number of techniques and standards that make both easier to implement.
Inbound load balancing
Inbound load balancing can be achieved in a few different ways. The most common is DNS round robin: give each MTA its own MX record and add them all to the domain in question. This approach relies on the sending party respecting the specified priorities and connecting in that order. Typically this works very well.
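To illustrate what this looks like from the sending side, the following minimal sketch orders a set of MX records the way a sender typically would: lowest preference value first, with records of equal preference tried in random order. The function name order_mx_targets and the hostnames are hypothetical and not part of Halon.

```python
import random
from itertools import groupby


def order_mx_targets(records, rng=None):
    """Return MX hostnames in the order a sender would typically try them.

    records is a list of (preference, hostname) tuples.
    """
    rng = rng or random.Random()
    ordered = []
    by_preference = sorted(records, key=lambda record: record[0])
    # Lower preference values are tried first.
    for _, group in groupby(by_preference, key=lambda record: record[0]):
        hosts = [host for _, host in group]
        # Equal preference: a random order spreads connections across the MTAs.
        rng.shuffle(hosts)
        ordered.extend(hosts)
    return ordered


# Hypothetical records: two MTAs sharing the load, plus a lower-priority backup.
records = [
    (20, "backup.example.com"),
    (10, "mx1.example.com"),
    (10, "mx2.example.com"),
]
print(order_mx_targets(records))
```

Giving the MTAs the same preference value is what spreads the load; different values produce a failover order instead.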
For more control, a load balancer can be placed in front of the MTAs to distribute connections. In that case, take extra care to ensure that the MTAs see the correct source IP (that of the connecting client, not the load balancer). This is especially important if the MTAs perform filtering that relies on the connecting IP. Halon supports the XCLIENT command as well as the PROXY protocol for inbound connections to achieve this.
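As a rough illustration of how the original client address travels past a load balancer, the sketch below parses a version 1 (text) PROXY protocol header, which the load balancer sends as the first line of the connection. This is a generic reading of the public protocol format, not Halon's implementation; the function name parse_proxy_v1 and the addresses are assumptions made for the example.

```python
import ipaddress


def parse_proxy_v1(line: bytes):
    """Parse a PROXY protocol v1 header line.

    Returns (src_ip, src_port, dst_ip, dst_port), or None when the
    load balancer reports the connection as UNKNOWN.
    """
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    parts = line[:-2].decode("ascii").split(" ")
    if len(parts) < 2 or parts[0] != "PROXY":
        raise ValueError("not a PROXY v1 header")
    if parts[1] == "UNKNOWN":
        return None
    if parts[1] not in ("TCP4", "TCP6") or len(parts) != 6:
        raise ValueError("malformed PROXY v1 header")

    src = ipaddress.ip_address(parts[2])
    dst = ipaddress.ip_address(parts[3])
    expected_version = 4 if parts[1] == "TCP4" else 6
    if src.version != expected_version or dst.version != expected_version:
        raise ValueError("address family does not match protocol field")

    src_port, dst_port = int(parts[4]), int(parts[5])
    if not (0 <= src_port <= 65535 and 0 <= dst_port <= 65535):
        raise ValueError("port out of range")
    return str(src), src_port, str(dst), dst_port


# The first field pair is the real client, which IP-based filtering should use.
header = b"PROXY TCP4 192.0.2.10 198.51.100.5 56324 25\r\n"
print(parse_proxy_v1(header))
```

Such a header should only be accepted from trusted load balancers, since anything that can send it can claim an arbitrary source address.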
Outbound shared IP
For several MTAs to share the same IPs for sending, some additional infrastructure needs to be put in place. This can be achieved either with a load balancer that supports the PROXY protocol (HAProxy, for example) or with a network topology that allows the MTAs to send from an IP address not assigned to them (BINDANY). Both techniques can be configured from the Pre-delivery script.
Queue recovery
The Halon MTA stores all messages currently in the queue on disk in a spool folder. Each message is stored in a separate file that contains the message itself along with other information such as metadata. These files are enough to restore the queue after a failure. At startup the spool folder is read and all messages are loaded; the halonctl command can also be used to load messages into a running MTA.
Take extra care to ensure that the storage used by the MTA is reliable, so that the files can still be accessed after a failure. Also make sure that the fsync setting is configured to suit the type of traffic passing through the system. fsync can also be configured on a per-message basis.
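To make the durability trade-off behind fsync concrete, here is a minimal sketch of writing a file to a spool directory with an optional fsync per message. It is a generic POSIX-style illustration, not Halon's spool format or code; the function spool_write, its fsync parameter and the file names are hypothetical.

```python
import os
import tempfile


def spool_write(spool_dir, name, data, fsync=True):
    """Write data to spool_dir/name, optionally forcing it to stable storage."""
    final_path = os.path.join(spool_dir, name)
    # Write to a temporary file first so a crash never leaves a partial message.
    fd, tmp_path = tempfile.mkstemp(dir=spool_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            if fsync:
                # Block until the file contents have reached the disk.
                os.fsync(f.fileno())
        # Atomically move the finished file into place.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    if fsync:
        # Also sync the directory so the new name survives a crash (POSIX only).
        dir_fd = os.open(spool_dir, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
    return final_path


spool = tempfile.mkdtemp()
# Durable write: slower, but the message survives a power loss once this returns.
spool_write(spool, "important.eml", b"Subject: invoice\r\n\r\n...\r\n", fsync=True)
# Faster write for traffic where losing a message on a crash is acceptable.
spool_write(spool, "bulk.eml", b"Subject: newsletter\r\n\r\n...\r\n", fsync=False)
```

Skipping fsync trades durability for throughput, which is why the right setting depends on the kind of traffic being handled.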
Queue-less operations
It is possible to run with a configuration that performs in-line deliveries instead of queueing, which makes the MTA stateless. With this configuration, messages are not accepted until they have been successfully delivered to the destination, so the MTA never holds any data in the queue. This is useful for inbound mail flows where the backend servers are located close to the MTAs. It is not recommended for outbound flows.
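The control flow of in-line delivery can be sketched conceptually as follows: the reply to the sending client is decided only after the delivery attempt to the backend has finished. This is not Halon script or Halon's API; the function handle_end_of_data, the exception classes and the deliver callable are all hypothetical names used for illustration.

```python
from typing import Callable, Tuple


class TemporaryDeliveryError(Exception):
    """The backend could not take the message right now."""


class PermanentDeliveryError(Exception):
    """The backend rejected the message for good."""


def handle_end_of_data(
    message: bytes, deliver: Callable[[bytes], None]
) -> Tuple[int, str]:
    """Return the SMTP reply for a message, delivering it in-line first."""
    try:
        # Nothing is queued: the client waits while the backend is contacted.
        deliver(message)
    except TemporaryDeliveryError as exc:
        # The sender keeps the message and retries later.
        return 451, f"Temporary failure, try again later: {exc}"
    except PermanentDeliveryError as exc:
        return 554, f"Delivery failed: {exc}"
    # Only now is responsibility for the message taken over from the sender.
    return 250, "Delivered"


def backend_down(message: bytes) -> None:
    raise TemporaryDeliveryError("backend unreachable")


print(handle_end_of_data(b"Subject: hi\r\n\r\nhello\r\n", lambda m: None))
print(handle_end_of_data(b"Subject: hi\r\n\r\nhello\r\n", backend_down))
```

Because the client waits for the backend during the transaction, latency to the destination adds directly to every inbound session, which is why this fits backends located close to the MTAs.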