The mail world - Mail servers, spam control and virus scanners

Introduction to SMTP

SMTP stands for Simple Mail Transfer Protocol. Its current specification is RFC 5321, but the protocol was first standardized long ago, in 1982 (RFC 821).

That was a time when the Internet was in its infancy and email was still considered a hobby project.

It is a text-based protocol, a la HTTP, and a basic session needs only five or so commands.

HELO/EHLO
MAIL FROM
RCPT TO
DATA
QUIT
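To make the command list concrete, here is a minimal Python sketch of the lines a client would send for one message. The function name `smtp_client_lines`, the host name and the addresses are made up for illustration, and nothing is sent over the network.

```python
def smtp_client_lines(client_host, sender, recipients, message_lines):
    """Return the lines a client sends, in order, for a single message."""
    lines = [f"EHLO {client_host}", f"MAIL FROM:<{sender}>"]
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    lines.append("DATA")
    for line in message_lines:
        # Dot-stuffing: a message line starting with "." gets an extra "."
        # so it cannot be mistaken for the end-of-data marker.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")   # a lone dot ends the DATA section
    lines.append("QUIT")
    return lines


session = smtp_client_lines(
    "client.example.org",
    "alice@example.org",
    ["bob@example.com"],
    ["Subject: hello", "", "Hi Bob,", ".signature follows"],
)
print("\r\n".join(session))
```

A real client also has to read the server's reply after each command before sending the next one; the sketch leaves that out to keep the command order visible.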

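The reply codes are three-digit numbers grouped by their first digit, much like HTTP status codes: 2xx means success, 3xx means the server is waiting for more input, 4xx is a temporary failure (try again later) and 5xx is a permanent failure. Note that a 4xx reply in SMTP invites a retry, which is not what 4xx means in HTTP.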
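The first digit of a reply is what a sending server acts on. A small sketch of that decision, with the hypothetical helper name `reply_class` and labels of my own choosing:

```python
def reply_class(code):
    """Map a three-digit SMTP reply code to its class by first digit."""
    classes = {
        2: "success",
        3: "intermediate",        # e.g. 354 after DATA: go ahead, send the message
        4: "transient failure",   # keep the mail queued and retry later
        5: "permanent failure",   # give up and bounce the mail
    }
    return classes.get(code // 100, "unknown")


for code in (250, 354, 451, 550):
    print(code, reply_class(code))
```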

The email world is still full of text, though attachments of all media types are well supported through the MIME standard. Although MIME started off as an email thing, its media types are now used by browsers and elsewhere to decide which program can handle a given kind of file.

Even though the end user sees a mail as a document with or without attachments, beneath the surface, on the wire, SMTP is still all text and the whole MIME message is received as one block of text.

To quote Wikipedia on the Quoted-Printable encoding used by MIME:

Lines of Quoted-Printable encoded data must not be longer than 76 characters. To satisfy this requirement without altering the encoded text, soft line breaks may be added as desired. A soft line break consists of an = at the end of an encoded line, and does not appear as a line break in the decoded text. These soft line breaks also allow encoding text without line breaks (or containing very long lines) for an environment where line size is limited, such as the 1000 characters per line limit of some SMTP software, as allowed by RFC 2821.
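The soft line breaks are easy to see with Python's standard `quopri` module. This is only a sketch; the sample sentence is made up.

```python
import quopri

# One long line with a few non-ASCII characters and no line breaks at all.
text = " ".join(["A very long line with characters such as é and ü in it."] * 4)
raw = text.encode("utf-8")

encoded = quopri.encodestring(raw)
for line in encoded.splitlines():
    # Every encoded line stays within 76 characters; a trailing "="
    # marks a soft line break that vanishes again on decoding.
    print(len(line), line.decode("ascii"))

decoded = quopri.decodestring(encoded)
print(decoded == raw)
```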

MIME decoding lets us undo this encapsulation and take a peek inside.

Every spam filter must be able to look inside a MIME message in order to do its job.
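Here is roughly what "taking a peek inside" looks like with Python's standard `email` package. It is a sketch with made-up addresses and a made-up attachment, not how any particular spam filter does it: a message with a binary attachment is turned into the text that travels on the wire, then parsed back into its parts.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

attachment = b"\x00\x01\x02binary payload"

msg = EmailMessage()
msg["From"] = "alice@example.org"
msg["To"] = "bob@example.com"
msg["Subject"] = "Report attached"
msg.set_content("Please find the report attached.")
msg.add_attachment(attachment, maintype="application",
                   subtype="octet-stream", filename="report.bin")

# What goes into the DATA section: pure ASCII text, attachment included.
wire = msg.as_bytes(policy=policy.SMTP)

# What a filter does on the receiving side: parse and walk the parts.
parsed = message_from_bytes(wire, policy=policy.default)
parts = []
for part in parsed.walk():
    if part.is_multipart():
        continue  # skip the container, keep only the leaf parts
    payload = part.get_payload(decode=True)  # undoes Base64 / Quoted-Printable
    parts.append((part.get_content_type(), part.get_filename(), payload))
    print(part.get_content_type(), part.get_filename(), len(payload))
```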

SMTP powers all Internet e-mail as we know it. In today’s world it is hard to find a person without an e-mail address or without some exposure to the idea of e-mail.

Everyone and his brother-in-law can send and receive mail, with any kind of media file as an attachment.

But underneath, things are not so straightforward.

Even today, in 2021, e-mail does not carry raw binary on the wire by default: attachments are converted to text first. It still relies on decades-old end markers (a line containing a single dot ends the message data) and on a variety of MIME encodings and formats.

The mail servers covered below are Postfix, Exim, OpenSMTPD, Microsoft Exchange, Qmail and Haraka.

Postfix

Postfix is developed by Wietse Venema, a Dutch researcher. It is quite popular and came as a breath of fresh air when the world was getting bored and annoyed with the arcane old sendmail. It is widely used on the Internet and is packaged by practically every Linux distribution, Debian included.

Postfix is one of the best SMTP implementations: it is well tested, and the main.cf and master.cf config files are easy to edit, so getting going is quick.

Postfix came across as the most mature and usable MTA before the advent of OpenSMTPD.

Its usefulness is further enhanced by wide support on mailing lists and by its excellent documentation.

Exim

Exim is easy to set up. It supports a number of anti-spam plugins, including some commercial ones.

It is from the UK and not as widely used as Postfix.

OpenSMTPD

OpenSMTPD is from the OpenBSD project. It is developed mainly by Gilles Chehade and Eric Faurot, but many others from the OpenBSD project pitch in and send patches.

Its filtering and config syntax is very reminiscent of the clean OpenBSD world. Thanks to its power and elegance it is by far the best SMTP implementation, but it needs a lot more field experience before all the bugs are ironed out and it can be considered mature.

The documentation of OpenSMTPD is excellent, and whenever commercial support is available, development happens at a very rapid pace.

The widespread adoption is obvious when you google for a configuration problem.

Microsoft Exchange

Microsoft Exchange is a popular mail server from Microsoft. Its Active Directory integration and MAPI, the protocol spoken by Microsoft Outlook, made it popular in the old days. Today it is mostly used as part of the Office 365 suite.

It is supposed to be stable, and it comes with the usual caveats of any Microsoft product.

Qmail

Qmail is written by Daniel Julius Bernstein, a professor. It is somewhat odd in that the source code does not follow standard C style and is a bit awkward in the way it implements several things.

It is also popular, though the community and the code are both arcane. We are not sure how many sites run qmail.

Haraka

Haraka is a transactional SMTP server written in Node.js. It is useful for various things as it has a very powerful plugin system. It is the new kid on the block among MTAs.

If I were to choose a mail server, I would stay away from Haraka.

Spam control and virus scanning

The two tools covered here are the rspamd spam filter and the ClamAV anti-virus scanner.

rspamd is the C-based spam control software that SpamCheetah uses extensively, and it scales remarkably well because it is written in C. It also includes several SpamAssassin rules and is able to talk to several other backends.

ClamAV is the ultimate malware/virus scanner. It is widely deployed and used everywhere, not just for scanning the MIME content of email. ClamAV is also useful for detecting malware in file systems and is very widely supported on POSIX-compliant systems.

rspamd is a world of its own

rspamd is a fantastic piece of software written by Vsevolod Stakhov.

Thanks to its extensive support for plugins and subsystems, rspamd supports a wide variety of algorithms and rules. In addition to its Lua plugin architecture, for which plenty of Lua modules are built in, it also has connectivity to several commercial APIs, including the powerful VirusTotal API for malware testing.

rspamd is not so widely used at the moment, as the software is relatively new and not widely known, but the right people know it.

By the way, rspamd is also able to talk to the ClamAV anti-virus scanner and can do DKIM signing and verification and the like. You can of course do content scanning and training using either fuzzy hashes or machine learning.

The default configuration of rspamd on most UNIX systems is usually sufficient for most purposes, but some more tweaking will go a long way towards creating great spam control.

rspamd also supports greylisting in a big way, but SpamCheetah does not use it.
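For readers who have not met greylisting: the first delivery attempt from an unknown (client IP, sender, recipient) triplet is turned away with a temporary error, and a retry after some delay is accepted, on the theory that real mail servers retry and many spam senders do not. The sketch below shows only the idea. The class name, the 300 second delay and the in-memory dictionary are assumptions for illustration, and this is not how rspamd implements it.

```python
class Greylist:
    """Toy greylist keyed on (client IP, sender, recipient)."""

    def __init__(self, delay_seconds=300):
        self.delay = delay_seconds
        self.first_seen = {}   # triplet -> time of the first attempt

    def check(self, client_ip, sender, recipient, now):
        triplet = (client_ip, sender.lower(), recipient.lower())
        first = self.first_seen.setdefault(triplet, now)
        if now - first < self.delay:
            # 4xx: temporary failure, a well-behaved server will retry
            return 451, "greylisted, please try again later"
        return 250, "OK"


greylist = Greylist(delay_seconds=300)
args = ("192.0.2.10", "alice@example.org", "bob@example.com")
print(greylist.check(*args, now=0))     # first attempt
print(greylist.check(*args, now=60))    # retried too early
print(greylist.check(*args, now=301))   # retried after the delay
```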

It is best to rely on rspamd's own rules rather than the SpamAssassin rules, since the latter slow things down quite a bit. rspamd can, however, load SpamAssassin rulesets when you need them.

It does pretty amazing RBL and header checks in addition to contemporary spam fighting techniques.
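An RBL (DNS blocklist) check is just a DNS lookup: the octets of the connecting IPv4 address are reversed and prepended to the blocklist's zone, and an answer means the address is listed. The sketch below shows the idea only. The zone `dnsbl.example.net` is a placeholder, the function names are made up, and the lookup function is defined but not called here because it needs network access.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a blocklist zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the blocklist answers for this address (needs network)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False   # no such name: the address is not listed


print(dnsbl_query_name("192.0.2.99", "dnsbl.example.net"))
```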

Can you run a mail server without spam control?

This is a rather complicated question. You can indeed run OpenSMTPD without spam control, as there are some built-in filters. The same goes for Postfix.

But in most cases, 9 out of 10, you definitely need a spam control product, whether open source and free of charge or commercial.

And even with spam control, virus scanning and protection against BEC (business email compromise) and spear phishing, attacks still occur.

The world of ClamAV

ClamAV is the biggest brand in open source virus scanning. Its signatures are kept up to date by a companion program, freshclam.

freshclam should be invoked daily from a cron job to make sure we get the latest malware signatures.

There is also another tool called rkhunter, which hunts for rootkits. But we depend almost entirely on ClamAV.

ClamAV needs a generous, modern amount of RAM to do its job, and it is very processor intensive as well.

The clamdscan executable is the one to use for quick scanning. It hands the files to clamd, which runs as a daemon process in typical UNIX style with the signatures already loaded, so scans are very quick.
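If you call clamdscan from your own scripts, the exit code tells you the verdict: 0 means nothing was found, 1 means a virus was found and 2 means an error occurred. A minimal sketch follows. The helper names are my own, and `scan_file` assumes clamdscan is installed and clamd is running, so it is defined but not called here.

```python
import subprocess

EXIT_MEANING = {0: "clean", 1: "infected", 2: "error"}


def scan_verdict(exit_code):
    """Translate a clamdscan exit code into a short verdict."""
    return EXIT_MEANING.get(exit_code, "unknown")


def scan_file(path):
    """Ask the running clamd daemon to scan one file via clamdscan."""
    result = subprocess.run(
        ["clamdscan", "--no-summary", path],
        capture_output=True,
        text=True,
    )
    return scan_verdict(result.returncode), result.stdout.strip()


for code in (0, 1, 2):
    print(code, scan_verdict(code))
```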

The time tested ClamAV suite has been invaluable for combating malware and viruses for decades now.

How do these things fit together?

The e-mail universe is quite diverse and interesting as you can see from above. But gaining familiarity and real life experience is key. Just theory from this blog or Google search won’t do.

You must go out there and get your feet wet, run your own mail server and taste the fun and misery of doing things on your own. You must learn to muck around with DNS MX records, learn to worry about IP reputation, learn how not to set up an open relay, and so on and so forth.
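As a taste of what MX records mean in practice: each record carries a preference number, and a sending server tries the host with the lowest number first. Here is a sketch of that ordering. The records are made up, and a real lookup needs a DNS library or a tool like dig, which is not shown.

```python
def delivery_order(mx_records):
    """Order MX hosts by preference, lowest number first.

    mx_records is a list of (preference, hostname) tuples. Ties are
    broken alphabetically here to keep the output predictable; real
    servers are expected to pick among equal preferences at random.
    """
    return [host for _, host in sorted(mx_records)]


records = [
    (20, "backup.example.org"),
    (10, "mail.example.org"),
    (30, "lastresort.example.org"),
]
print(delivery_order(records))
```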

After a lot of trial and even more errors you shall get the hang of email and even learn how to send a mail using netcat or mutt.

Mutt is a very powerful e-mail client that hardly anybody knows outside the techie/geek community.

[Image: Mutt 25 years celebration]

But we guys use it a lot for our testing and daily mail traffic.

The popular ones need no mention: the browser-based clients, Microsoft Outlook, Novell GroupWise (dead?) and numerous others.

But the way things are going, e-mail readers and composers are just simple IMAP clients, and most of the work happens at the mail server level anyway.

The way you access your inbox, using a mail access protocol like IMAP or POP3, is very different from the SMTP protocol used for sending, queuing and delivering mail.

Here is a sample mutt display on my laptop.

[Image: my mutt screen]

IMAP is useful for applications like email archival or even some spam training using the Dovecot Sieve subsystem.

The importance of the mutt email client cannot be overstated, even alongside very powerful UNIX tools like nmh, nail and so on.

We have been using mutt for over 15 years now and it seems to do its job really really well.

Here is my muttrc for you to copy from:

set imap_user = 
set imap_pass = 
set editor=vim
set fast_reply=yes
set include=yes
set sort_aux = last-date-received
set forward_quote=yes

set smtp_url = "smtp://user@spamcheetah.com:587/"
set smtp_pass = 
set from = user@spamcheetah.com

set reverse_name=yes
set realname = "Your name"

set folder = "imaps://spamcheetah.com:993"
set spoolfile="+INBOX"
set copy=yes
set text_flowed = yes
set record ="=INBOX/Sent"
set postponed ="=INBOX/Drafts"
set move=yes
set delete=yes
set trash="=INBOX/Trash"
set mbox='=INBOX/Archive'
bind editor <space> noop
bind attach <return> view-mailcap
macro attach B "<pipe-entry>cat > /tmp/mutt.html; /usr/bin/google-chrome /tmp/mutt.html<enter>"
set index_format="%Z %c %D %?X?%X 📎& ?  %L %s "

set date_format="%I:%M %p %d-%b-%y"

set header_cache=~/.mutt/cache/headers
set message_cachedir=~/.mutt/cache/bodies
set certificate_file=~/.mutt/certificates

set timeout=15
auto_view text/html
auto_view text/calendar application/ics

set query_command="goobook query %s"
macro index s  "<change-folder>=INBOX/Sent<enter>"  "go to Sent Items"
macro index l  "<change-folder>=INBOX/Archive<enter>"  "go to Archive"
macro index t  "<change-folder>=INBOX/Trash<enter>"  "go to Trash"
macro index i  "<change-folder>=INBOX<enter>"  "go to Inbox"

macro index,pager a "<pipe-message>goobook add<return>" "add sender to google contacts"
bind editor <Tab> complete-query
color normal        brightgreen    default         
color error         red             default         
color tilde         black           default         
color message       cyan            default         
color markers       red             white           
color attachment    white           default         
color search        brightwhite   default         
color status        brightred    black           
color indicator     brightblack     white          
color tree          yellow          default          
color index         red             black ~p

Even with encrypted mail and SSL-enabled mail traffic, SMTP and text-based wire protocols dominate the traffic.

As long as email exists, SMTP will. And so will this ecosystem of tools, protocols and filters.
