4.4. Cluster-aware policy
4.4.1. Example architecture
Generated messages are network load-balanced across all Halon instances, which are configured identically.
The delivery (localip) addresses are located on the proxies; any Halon instance can choose any localip.
The HAProxy layer can use keepalived or similar to provide resilience in the event of a proxy host outage.
A basic approach is to statically divide the overall concurrency and rates across the Halon instances. This works for mailbox providers that allow more concurrent connections than you have instances, but some providers expect lower concurrency: a provider that accepts only two concurrent connections, for example, cannot be split statically across ten instances. Equitable sharing of connections therefore requires a different approach to traffic shaping.
4.4.2. Delivery Orchestrator: clusterd
clusterd is a service that allocates concurrency, connectinterval and rates to instances. It supports TLS authentication/verification of both the server and client, over TCP.
See Clustering directives for the required smtpd startup configuration. See also the Delivery Orchestrator service.
clusterd registers connected Halon instances, counts them and announces the total number back to each instance.
It allocates outbound connections to instances in the cluster, ensuring fair rotation of concurrent connections for each (localip + grouping) combination, even for low concurrency destinations.
It ensures each instance with messages to deliver will receive a turn.
It works in dynamic scaling situations with Kubernetes, as well as static scaling with virtual machines.
If a Halon instance loses its connection to clusterd, it will revert to a static division of slots (minimum 1 connection per node) based on the last known number of hosts.
It will automatically try to reconnect and re-establish dynamic sharing.
Halon smtpd uses the settings in Clustering directives to communicate with clusterd.
4.4.3. Writing policy for clusters
A system with clusterd active has policies clustered by default.
Concurrent connections are granted to instances in the cluster.
The overall cluster follows the concurrency properties for (localip, grouping) combinations in the smtpd-policy.yaml policies.fields.
Because connections are opened, closed and pooled per specific (localip, grouping) combination, these fields must be used when writing cluster-aware concurrency policy.
smtpd-app.yaml
groupings:
  - id: orange
    remotemx:
      - "smtp-in.orange.fr"
smtpd-policy.yaml
rate:
  algorithm: tokenbucket
policies:
  - fields:
      - jobid
  - fields:
      - localip
  - fields:
      - tenantid
  - fields:
      - tenantid
      - grouping
  - fields: # cluster-wide policies here
      - localip
      - grouping
    default:
      concurrency: 20
    conditions:
      - if:
          grouping: &orange
        then:
          concurrency: 2 # Cluster-wide concurrency for all localip, see https://postmaster.orange.fr/
Rates for the cluster can be set on any field condition, because rates are always divisible. How they are divided depends on your chosen rate.algorithm setting.
Algorithm tokenbucket (default) gives smooth traffic flow over the interval by scaling the denominator, i.e. tokens / (interval * number_of_hosts).
Algorithm fixedwindow gives traffic bursts at the start of each interval, by allocating the tokens between hosts, with each getting at least 1, i.e. max(1, tokens / number_of_hosts) / interval.
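As a worked example using the formulas above, a rate of 60/60 (60 messages per 60-second interval) shared by 3 hosts becomes 60 / (60 * 3) under tokenbucket, i.e. roughly one message every 3 seconds per host, spread over the interval. Under fixedwindow each host instead receives max(1, 60 / 3) = 20 messages per 60-second window, granted at the start of the window.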
For fair rotation of connections between your instances, set delivery.pooling.timeout to a small value such as 5 to 10 seconds, and delivery.pooling.transactions to 100 or less.
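As an illustration only, such a configuration might look like the sketch below; the file placement and nesting are assumptions inferred from the dotted directive names and may differ in your setup.
delivery:
  pooling:
    timeout: 10        # seconds; a small value (5-10) hands pooled connections back quickly
    transactions: 100  # recycle a pooled connection after at most 100 transactions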
If you wish to disable rotation of connections on an entire smtpd instance, set cluster.policy.sharedconcurrency to false. This will give basic “divide by n” concurrency, with at least 1 connection per instance.
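A minimal sketch of that opt-out, again with the nesting inferred from the dotted setting name rather than taken from a verified configuration:
cluster:
  policy:
    sharedconcurrency: false  # disable connection rotation; static “divide by n”, min 1 per instance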
4.4.3.1. Advanced usage: selective non-clustered properties
Rules can selectively disengage clustering, for example to give more throughput to time-critical message streams.
smtpd-policy.yaml
conditions:
  - if:
      localip: 1.2.3.4
      grouping: &yahoo
    then:
      concurrency: 5
      cluster: false # each instance will get 5 connections
default:
  rate: 5/1 # default rate applies cluster-wide
The cluster: false property applies to all properties in the then scope. To create a policy that has some properties clustered and others non-clustered, create two if clauses:
conditions:
  - if:
      localip: 1.2.3.4
      grouping: &yahoo
    then:
      concurrency: 5
      cluster: false # affects properties in “then” scope
  - if:
      localip: 1.2.3.4
      grouping: &yahoo
    then:
      rate: 10/1 # put (clustered) rate property in a separate if clause
default:
  rate: 5/1 # default rate applies cluster-wide
4.4.4. Observing clustered policy in operation
The web user interface Delivery Insights view shows cluster behavior. For each policy setting, it shows:
the cluster’s concurrency and rate
the effective concurrency and rate in use on this instance.
The halontop command has a panel showing the cluster status and number of hosts found.
4.4.5. Moving messages between instances
Messages can be moved from the current MTA to another using the hqfmove command-line utility.
Before moving messages, either:
shut down the smtpd process, or
unload specific messages using halonctl queue unload and move them from the spool.path folder to a separate folder, so that newly arriving messages are not mixed with the messages to transfer.
Run the command on the source MTA, specifying the folder of messages you wish to move with the --directory argument. This will move all the messages in the specified folder and its sub-folders.
One or more destination MTAs can be specified with --server arguments. Each specifies the hostname/IP address and either the destination's environment.controlsocket.port or its environment.controlsocket.path.
If multiple --server destination hosts are given, messages will be pushed to the destinations in round-robin fashion. If a hostname's A record resolves to multiple hosts, messages will likewise be pushed to those destinations in round-robin fashion.
All command-line arguments:
--server address:port/unix-socket [--server address:port/unix-socket]
--directory /path [--non-interactive] [--verbose] [--progress] [--rate x/y]
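For illustration, a hypothetical invocation is shown below; the hostnames, port and directory are placeholders (the port would be each destination's environment.controlsocket.port), and only flags from the synopsis above are used.
hqfmove --directory /var/spool/halon/transfer \
        --server 10.0.0.11:4000 \
        --server 10.0.0.12:4000 \
        --progress --rate 100/1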