Net::SMTP Deficiencies and Suggested Workarounds

Transitory Connectivity Problems | 4xx Temporary Failures | Server is Down

Net::SMTP provides Simple Mail Transfer Protocol (SMTP) connectivity to a single named server. If a service needs only unreliable e-mail delivery, and cannot rely on one of the several excellent modules that take advantage of the many benefits of a local Mail Transfer Agent (MTA), then Net::SMTP will suffice. Otherwise, if more reliable e-mail delivery is required, code must be written around Net::SMTP that begins to implement MTA features, and the server address may need to be made more highly available.

Transitory Connectivity Problems

Connectivity problems, even if rare, are inevitable. If possible, establish a Service Level Agreement (SLA) or, if that is not possible, a Service Level Guideline (SLG) that dictates the availability requirements of the e-mail service. Can send failures be discarded? If not, how long are they to be retried, or can they be discarded and replaced with an updated message after some time? These requirements will better inform how paranoid the implementation must be regarding delivery failures, retry attempts, and when to enter an alarm state.

If better transmission reliability is required, loop until the delivery succeeds, or a loop limit is reached. If possible, use the Mail Exchange (MX) records for the target domain:

#!/usr/bin/perl -w
#
# Example script to look up the MX records for a domain.

use strict;
use Net::DNS qw(mx);

my $domain = shift || die "Usage: $0 domain\n";

# Inside this loop, attempt send with Net::SMTP. 'last' out of the loop
# if able to send the e-mail, otherwise either loop again, fail, or
# cache the message for later retry.
for my $target ( get_mx($domain) ) {
    print "target: $target\n";
}

# Given a domain (hostname), returns a list of server names or IP
# addresses for the MX, or failing that, A records for that domain.
# Throws an error if the DNS lookup encounters a problem.
#
# A better script might cache the DNS lookups, or rely on a local
# caching nameservice to provide this support.
sub get_mx {
    my ($domain) = @_;

    my $res = Net::DNS::Resolver->new();
    my @mx  = mx( $res, $domain );

    if (@mx) {
        @mx = map $_->exchange, @mx;
    } else {
        # Fallback to A if no MX found
        my $query = $res->search($domain);
        if ( !$query ) {
            die "$domain A record lookup failed: ", $res->errorstring, "\n";
        }
        @mx = map $_->type eq 'A' ? $_->address : (), $query->answer;
    }

    die "no MX or A found for $domain\n" if !@mx;

    return @mx;
}
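The comment inside the loop above leaves the actual send attempt open. One way it might be filled in is sketched below: each target is tried in turn, and the first one that accepts the message wins. The names send_via_targets and $fake_send are hypothetical, as are the example.org hostnames; in real use the code reference would wrap a Net::SMTP send that dies on failure.

```perl
use strict;
use warnings;

# Try each target in order and return the first target that accepts the
# message. Dies with the collected errors if every target fails. $send is
# a code reference (hypothetical) that dies on failure.
sub send_via_targets {
    my ( $targets, $send ) = @_;

    my @errors;
    for my $target (@$targets) {
        if ( eval { $send->($target); 1 } ) {
            return $target;
        }
        my $err = $@ || 'unknown error';
        chomp $err;
        push @errors, "$target: $err";
    }
    die "all targets failed: " . join( '; ', @errors ) . "\n";
}

# Stand-in sender for demonstration only: the first host is "down".
my $fake_send = sub {
    my ($host) = @_;
    die "connection refused\n" if $host eq 'mx1.example.org';
    return 1;
};

my $used =
  send_via_targets( [ 'mx1.example.org', 'mx2.example.org' ], $fake_send );
print "sent via $used\n";
```

A fuller version would probably stop at the first permanent (5xx) rejection rather than offering the same message to every remaining target.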

If sending to a single host instead of using MX records, make the send attempt multiple times, until either the message is sent, or a loop limit is reached. At the loop limit, a decision will need to be made as to whether the message should be discarded, an error issued, or the message queued somehow for later delivery. Introduce a delay between the send attempts. This delay, multiplied by the loop limit, roughly determines the amount of time spent attempting to send the message. If possible, make this time span larger than the expected outage windows for scheduled downtime. Normal outages must not cause needless false alarms.

#!/usr/bin/perl -w
# Example multiple send attempts code
use strict;

my $MAX_ATTEMPTS = 3;

my $is_sent = 0;
for my $try ( 1 .. $MAX_ATTEMPTS ) {

    eval { send_email("TODO_ADD_ARGS"); };

    if ($@) {
        warn "info: send attempt $try failed: $@";

        # Randomized, growing delay; no point sleeping after the last try
        sleep int( $try + rand($try) ) if $try < $MAX_ATTEMPTS;

    } else {
        warn "info: send attempt $try pass\n";
        $is_sent = 1;
        last;
    }
}
if ( !$is_sent ) {
    die "error: unable to send in $MAX_ATTEMPTS attempts\n";
}

sub send_email {
    # TODO flesh out & die on (proper) failures
    die "uh oh\n" if rand() > 0.5;
}
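Whether a retry schedule actually outlasts an expected outage can be checked with a little arithmetic before deployment. The sketch below assumes a schedule that sleeps int($n + rand($n)) seconds after failed attempt $n, with no sleep after the final attempt, and separately a fixed delay between attempts. The function names backoff_window and attempts_for_window are hypothetical.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Smallest and largest total wait, in whole seconds, when failed attempt
# $n (for $n below $attempts) is followed by sleep int($n + rand($n)).
sub backoff_window {
    my ($attempts) = @_;
    my ( $min, $max ) = ( 0, 0 );
    for my $n ( 1 .. $attempts - 1 ) {
        $min += $n;            # rand($n) can return 0
        $max += 2 * $n - 1;    # int() of a value just under 2 * $n
    }
    return ( $min, $max );
}

# Attempts needed so that fixed delays between attempts span $window
# seconds (both arguments in seconds).
sub attempts_for_window {
    my ( $window, $delay ) = @_;
    return ceil( $window / $delay ) + 1;
}

my ( $min, $max ) = backoff_window(3);
print "3 attempts wait between $min and $max seconds in total\n";
printf "a 30 minute window at a 60 second delay needs %d attempts\n",
  attempts_for_window( 30 * 60, 60 );
```

Three attempts with delays of a few seconds cover only a few seconds of outage, so a service that must ride out scheduled downtime needs either many more attempts, much longer delays, or a queue for later delivery.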

SMTP vs. Load Balancers

A load balancer will likely not suit a legitimate MTA, as an MTA will back off from a Virtual IP Address (VIP) should any VIP member be down. This increases the odds that the entire VIP will be marked as unavailable, should any one VIP member indicate a problem. However, with a custom implementation, a VIP could be tried multiple times until a functioning VIP member is reached. If possible, layer 7 health probes should be used so that bad VIP members can be removed automatically. However, note that detecting whether e-mail is flowing properly through a particular system is difficult, as a server may accept the message but promptly discard it, or some subsequent system may reject the message, with the bounce wandering off to a forgotten dead.letter file. Most test suites therefore require that test e-mail be delivered to some mailbox, so that the suite can then verify e-mail transmission by checking the contents of that mailbox via the Post Office Protocol (POP) or the Internet Message Access Protocol (IMAP).
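The mailbox check described above might look something like the following sketch, which searches a POP mailbox for a test message carrying a unique token in its Subject header. The function names find_probe and make_token are hypothetical. The $pop argument is assumed to behave like a logged-in Net::POP3 object, whose list method returns a hash reference of message numbers and whose top method returns an array reference of lines.

```perl
use strict;
use warnings;

# Returns the message number of the first message whose Subject header
# contains $token, or undef if no message does. $pop is expected to
# provide list() and top() as Net::POP3 does.
sub find_probe {
    my ( $pop, $token ) = @_;

    my $sizes = $pop->list or return;
    for my $msgnum ( sort { $a <=> $b } keys %$sizes ) {
        my $lines = $pop->top( $msgnum, 0 ) or next;
        for my $line (@$lines) {
            last if $line =~ /^\r?\n?$/;    # blank line ends the headers
            return $msgnum if $line =~ /^Subject:.*\Q$token\E/i;
        }
    }
    return;
}

# A reasonably unique token to place in the Subject of the test message.
sub make_token {
    return join '-', 'probe', time(), $$, int( rand(1_000_000) );
}
```

A test suite would send a message with make_token() in the Subject, wait for a period set by the SLA or SLG, then connect with Net::POP3, log in, and call find_probe. A missing probe after that period indicates the message was dropped somewhere along the path.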

4xx Temporary Failures

SMTP temporary failures indicate that the connection is working and the remote server is responding, but that for some reason (disk full, configuration problem, greylisting) the server is, for the moment, unable to process the message. These may occur at almost any point during the SMTP dialog.

Error handling must distinguish between temporary failures (4xx responses), recoverable permanent failures (server unreachable, but may be reachable in the future), and unrecoverable permanent failures (an address is invalid, or a domain does not exist). Unrecoverable permanent failures should be sidelined for investigation, or automatically removed, while recoverable failures should be retried at some later date. If the message is destined for multiple recipients, specific recipients may fail, depending on the options passed to the recipient method. Net::SMTP does not throw errors; its methods instead return false on failure, so no simple eval { … }; if ($@) { … } can be used to catch and handle problems. Assuming delivery to a single recipient, message sending code with error handling may run something like the following.

#!/usr/bin/perl -w
use strict;

use Net::SMTP;

# TODO define or populate these
my ( $mx, $mail_from, $rcpt_to, $msg );

# TODO loop over MX, or make multiple send attempts
eval { send_email( $mx, $mail_from, $rcpt_to, $msg ) };
if ($@) {
    if ( $@ =~ m/^5/ ) {
        die "permanent failure: $@";
    }

    # 4xx response or connection problem: retry or queue the message
    warn "temporary failure: $@";
}

sub send_email {
    my ( $host, $mail_from, $rcpt_to, $msg ) = @_;

    # DBG add Debug => 1 to see SMTP protocol
    # new() returns undef if the connection cannot be established
    my $smtp = Net::SMTP->new($host)
      || die "connect:unable to connect to $host\n";

    $smtp->mail($mail_from)
      || handle_failure( $smtp, 'mail' );
    $smtp->to($rcpt_to)
      || handle_failure( $smtp, 'to' );

    $smtp->data() || handle_failure( $smtp, 'data' );
    $smtp->datasend("To: $rcpt_to\n")
      || handle_failure( $smtp, 'data_send' );
    $smtp->datasend($msg)
      || handle_failure( $smtp, 'data_send' );

    $smtp->dataend() || handle_failure( $smtp, 'data_end' );
    $smtp->quit()    || handle_failure( $smtp, 'quit' );
}

# Throws an error of the form "code:call:message" built from the last
# response the server sent.
sub handle_failure {
    my $smtp = shift;
    my $call = shift;

    my $smtp_msg = ( $smtp->message )[-1];
    chomp $smtp_msg;

    die join( ':', $smtp->code, $call, $smtp_msg );
}
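The three failure classes named earlier can be separated with a small helper that inspects the error string. The sketch below assumes errors of the form "code:call:message", as thrown by handle_failure above, and treats any error without a leading SMTP code as a connection problem. The name classify_failure and the sample error strings are hypothetical, and treating every 5xx response as unrecoverable is a simplification.

```perl
use strict;
use warnings;

# Map an error string of the form "code:call:message" (or a message with
# no SMTP code, such as a connection failure) to a failure class.
sub classify_failure {
    my ($error) = @_;
    return 'temporary'     if $error =~ /^4\d\d\b/;    # 4xx: retry soon
    return 'unrecoverable' if $error =~ /^5\d\d\b/;    # 5xx: sideline
    return 'recoverable';    # no SMTP code: server unreachable, retry later
}

for my $error ( '451:to:greylisted', '550:to:no such user',
    'connect:timeout' )
{
    printf "%-22s => %s\n", $error, classify_failure($error);
}
```

The calling code can then dispatch on the class: retry temporary failures within the current send loop, queue recoverable ones for a later pass, and log unrecoverable ones for investigation.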

If possible, test against a real MTA, with special addresses configured to accept, temporarily deny, or permanently deny messages. With Sendmail, this could be done using virtusertable entries, or with a milter such as MIMEDefang, which allows Perl code to return any status at any stage of message delivery.
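For the MIMEDefang approach, a filter_recipient routine in the filter file could map special test addresses to particular responses. The following is a sketch only: the test.example.org addresses are hypothetical, recipient checks must be enabled in the MIMEDefang configuration for the routine to be called, and the exact argument list should be confirmed against the mimedefang-filter documentation for the installed version.

```perl
use strict;
use warnings;

# Fragment for a MIMEDefang filter file. Returns a (status, message) list
# for each RCPT TO address; the test addresses are hypothetical.
sub filter_recipient {
    my ( $recipient, $sender, $ip, $hostname, $first, $helo ) = @_;

    # Always answer with a temporary (4xx) failure
    return ( 'TEMPFAIL', 'Test address: try again later' )
      if $recipient =~ /^<?tempfail\@test\.example\.org>?$/i;

    # Always answer with a permanent (5xx) failure
    return ( 'REJECT', 'Test address: no such user' )
      if $recipient =~ /^<?reject\@test\.example\.org>?$/i;

    # Everything else proceeds normally
    return ( 'CONTINUE', 'ok' );
}
```

Sending to each test address then exercises the temporary and permanent failure paths of the sending code without waiting for a real outage.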

Server is Down

Complete connectivity failure, beyond the set number of attempts over a defined period of time, must also be handled. This may require caching the e-mail, either in memory or to more permanent storage, or discarding the send attempt, depending on the reliability needs of the service and the nature of the e-mail. The SLA or SLG should dictate how this condition is to be handled. Monitoring should raise alarms, so that the cause of the failure can be determined.
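Caching to permanent storage can be as simple as a spool directory that a later retry pass reads back. The sketch below is one minimal way to do this, not a full queue: the names spool_message and spooled_messages and the file naming scheme are hypothetical, and locking, expiry, and name collisions between concurrent writers are left unhandled.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Spec;

# Write $msg to a new file in $spool_dir and return the path. The file is
# written under a temporary name and then renamed, so a reader never sees
# a partially written message.
sub spool_message {
    my ( $spool_dir, $msg ) = @_;

    my ( $fh, $tmp ) = tempfile( 'tmp.XXXXXXXX', DIR => $spool_dir );
    print {$fh} $msg or die "write $tmp failed: $!\n";
    close $fh or die "close $tmp failed: $!\n";

    my $name = join '.', 'msg', time(), $$, int( rand(1_000_000) );
    my $final = File::Spec->catfile( $spool_dir, $name );
    rename $tmp, $final or die "rename $tmp failed: $!\n";
    return $final;
}

# List spooled message files, sorted by name, for a later retry pass.
sub spooled_messages {
    my ($spool_dir) = @_;

    opendir my $dh, $spool_dir or die "opendir $spool_dir failed: $!\n";
    my @names = sort grep { /^msg\./ } readdir $dh;
    closedir $dh;
    return map { File::Spec->catfile( $spool_dir, $_ ) } @names;
}
```

A retry job run from cron could walk spooled_messages, attempt each send, and unlink the file on success. Once this much MTA behavior has been reimplemented, handing the message to a local MTA may be the simpler option.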