Network techniques for E-mail spam mitigation

Some definitions before we begin

E-mail is delivered by a protocol, and a protocol essentially means a client talking to a server and vice versa. It involves handshakes, optionally sessions, cryptography in the form of SSL/TLS, some secrets, counters and so on.

There is no email outside of the context of the Internet.

E-mail and the Internet are inseparable, so the natural way to evaluate the behavior of email systems is from the networking side, that is, layer 3 of the TCP/IP stack.

To figure out how email stacks up against other forms of abuse on the Internet, all you have to do is look at the statistics.

Many of the problems with widespread impact on law enforcement or governments, such as fraudulent money transfers, threats and blackmail, happen over e-mail.

And e-mail can only be delivered and carried over networks.

So we naturally have to resort to understanding key networking concepts in order to demystify the world of email and consequently email spam.

Now let us get to some definitions for the purpose of this blog:

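  • BOGON - a bogus network used to send spam: an IP address range that is reserved or not yet allocated (by IANA or a regional registry) and so should never appear as a source on the public Internet
  • Botnet - a network of compromised machines made up of a command and control center and worker nodes
  • CIDR netblock - a contiguous range of IP addresses written as a prefix and a length, for example 192.0.2.0/24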
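
To make the last two definitions concrete, here is a minimal sketch using Python's standard ipaddress module. The helper names and the short list of reserved ranges are assumptions chosen for illustration; a real bogon list is longer and changes over time.

```python
import ipaddress

# Illustrative subset of reserved IPv4 ranges (an assumption for this sketch,
# not a complete or current bogon list).
RESERVED_BLOCKS = tuple(
    ipaddress.ip_network(cidr)
    for cidr in ("0.0.0.0/8", "10.0.0.0/8", "127.0.0.0/8",
                 "172.16.0.0/12", "192.168.0.0/16", "240.0.0.0/4")
)

def in_block(ip: str, cidr: str) -> bool:
    """Return True if the address falls inside the given CIDR netblock."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)

def looks_like_bogon(ip: str) -> bool:
    """Return True if the address sits in one of the reserved ranges above."""
    addr = ipaddress.ip_address(ip)
    return any(addr in block for block in RESERVED_BLOCKS)
```

A /24 such as 192.0.2.0/24 covers 256 addresses, so a single netblock entry in a filter stands in for every host in that range.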

What is spam?

Flooding the Internet with many copies of the same email message is known as spamming. Spam is otherwise known as UBE (unsolicited bulk email) or UCE (unsolicited commercial email).

While spammers can send thousands or even millions of spam emails at negligible cost, the recipient pays a considerable price for getting this unwanted mail.

These are the effects of spam:

  • Reduction in worker productivity
  • Waste of available bandwidth
  • Waste of data storage on mail server disks
  • Reduced mail server efficiency

These come in addition to the effects we saw in other companion blogs.

Spam is unsolicited commercial email sent in bulk; it is considered an intrusive transmission.

These bulk messages often advertise commercial products or sex organ enlargement ideas, but sometimes they contain fraudulent offers and incentives.

Due to the nature of Internet mail, spammers can flood the Internet with millions of unwanted messages at negligible cost to themselves; the actual cost is distributed among the maintainers and users of the network.

Their methods are devious and illegal.

Crackers are hired specifically to design systems that transmit a very large number of emails at the least possible cost to the spammer. Unfortunately, these emails impose a significant burden upon recipients.

Issues caused by spam

In large companies, a considerable portion of the time of each worker is spent reviewing and deleting the spam itself, leading to a decrease in productivity. The increased network traffic has a deleterious effect on network performance, in general, and on the organization’s mail server(s), in particular. Also, data storage space is consumed by the need to store the large volume of mail.

Spam can be advertisements for low mortgage rates or sales on the latest electronic devices. It can be offensive, like advertisements for drugs or pornographic websites. It can also be hostile and contain viruses, Trojan horses, or other malware. Offensive spam affects different people in different ways. Some may ignore it, while others may be deeply offended by it.

Employers can be held liable when an employee sues based on a hostile work environment, if the company was aware of the issue and has not acted on it.

Since spam originates from outside your company, it is treated like a vendor or client harassing one of your employees. If you are aware of it, then it is your responsibility to take steps to remove it.

Employers face serious penalties if they don’t fix the working environment.

People who have been subjected to harmful work settings can sue for compensatory and punitive damages, provided the company has more than a few hundred employees.

If an employee leaves because of an environment judged hostile, they can ask for reinstatement, back pay and back benefits.

These messages may contain viruses, Trojan Horses, worms, and web bugs among other forms of malware.

The senders may try to fool recipients into believing that the email is safe and is from a trusted source by using the names from the address book. Without proper precautions in place (virus and spam protection), this malware can spread like wildfire in an enterprise environment and bring messaging and network infrastructure to its knees.

We discussed the proliferation of malware in other blogs here.

For instance, consider a company with 5000 employees. Introduce one worm on one of the workstations and it begins sending itself to all 5000 employees in the global address list. A few more worms get installed on other workstations and start replicating in the same manner. In a very short time the messaging load can clog messaging queues and network segments, leading to slow network response or a DDoS (Distributed Denial of Service).

The issue of backscatter is also significant: since most bounce addresses are fake anyway, bounce messages end up with innocent third parties whose addresses were forged. Spammers usually do a great job of covering their tracks while bulk mailing to satisfy their criminal clientele.

Network behavior of bogons

A single mail server sending out spam is not very troublesome, no matter what.

On its own it cannot sustain a spam operation. It can certainly send 50,000 e-mails a day or more, but one IP address sending that much unwanted mail is easy to spot and block.

So spam is not sent from one single IP address.

It is a volume business, and it cannot be that easy, can it?

What is botnet spew?

A botnet spew is nothing but mass mailings from automated machines, or bots, that masquerade as humans. Since SMTP is supposed to be operated by machines anyway, there is no obvious way of making out what is happening.

But certain safeguards exist. Just as we have CAPTCHA to identify humans in website forms, we can distinguish a human sender from a machine sender using some tricks too.

The idea of botnet spew is actually that of several recipients bombarded with mail from unknown sources and with the sole purpose of abuse.

The way to combat this attack is to employ appropriate firewall rules, scale down TCP windows and so on.
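
Firewall syntax differs from product to product, so the sketch below shows the underlying idea in Python instead: cap the number of new connections each source IP may open within a time window. The class name, the limit of 10 connections and the 60 second window are all assumptions for illustration, and a real deployment would normally enforce this in the firewall or the MTA itself.

```python
from collections import defaultdict, deque

class ConnectionLimiter:
    """Sliding-window limit on new connections per source IP."""

    def __init__(self, max_conn: int = 10, window_s: float = 60.0):
        self.max_conn = max_conn
        self.window_s = window_s
        # Maps each source IP to the timestamps of its accepted connections.
        self._seen = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        """Return True if this IP may open another connection at time `now`."""
        stamps = self._seen[ip]
        # Forget connections that have aged out of the window.
        while stamps and now - stamps[0] >= self.window_s:
            stamps.popleft()
        if len(stamps) >= self.max_conn:
            return False
        stamps.append(now)
        return True
```

Each bot in a botnet gets its own budget, so the limiter slows down individual noisy sources without blocking well-behaved senders that connect only occasionally.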

Usually the attackers run out of resources and go away since they don’t have infinite resources at their disposal.

But the attack does cause trouble for at least a day or two.

How can IP address mean anything?

The fact that spam is sent from certain IP addresses or CIDR ranges means that the most effective spam control technique is IP address filtering.

This is the idea behind an RBL (real-time blackhole list) or SPF, in which a network block is used to tell a spammer from a good sender, or in other words, an automaton spewing unwanted mail from a human typing real email.
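
As a rough sketch of how an RBL lookup works over DNS: the octets of the sender's IPv4 address are reversed, the list's zone is appended, and the resulting name is looked up. An answer in 127.0.0.0/8 conventionally means the address is listed. The zone dnsbl.example.org and the function names are placeholders, not a real service or API.

```python
import ipaddress
import socket

def dnsbl_query_name(ip: str, zone: str = "dnsbl.example.org") -> str:
    """Build the DNS name to look up for an IPv4 address in a blocklist zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def is_listed(ip: str, zone: str = "dnsbl.example.org",
              resolve=socket.gethostbyname) -> bool:
    """Return True if the zone answers with a 127.x.x.x address for this IP."""
    try:
        answer = resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such name: the address is not on the list.
        return False
    return answer.startswith("127.")
```

The resolve parameter exists only so the lookup can be swapped out in tests; by default the function uses the system resolver.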

Can you figure out the mail content from sending MTA?

The mail servers employed by spammers are usually not standards compliant and from the behavior of the mail sending MTA, we can glean some information that can be vital in arresting spam.

If the mail sender is known to have a low sender score or reputation or if it is a known open relay or found in an RBL, we can outright reject it without even looking into the mail content.

We don’t even allow the SMTP session to finish, so the mail content never comes into question.
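
A minimal sketch of that connection-time decision might look like this. The 0 to 100 score scale, the threshold of 50, the hostname and the reply texts are assumptions; the only protocol fact relied on is that an SMTP server may answer a new connection with a 554 reply instead of the usual 220 greeting.

```python
def connect_time_reply(in_rbl: bool, is_open_relay: bool, sender_score: int,
                       min_score: int = 50) -> str:
    """Pick the first SMTP reply line from reputation data alone."""
    if in_rbl:
        return "554 5.7.1 Rejected: sending IP is on a blocklist"
    if is_open_relay:
        return "554 5.7.1 Rejected: sending host is a known open relay"
    if sender_score < min_score:
        return "554 5.7.1 Rejected: poor sender reputation"
    # Nothing against this sender so far: greet it and let the dialog begin.
    return "220 mx.example.org ESMTP ready"
```

Because the verdict is reached before any message data is transferred, the server spends no bandwidth, storage or content scanning time on the rejected mail.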

What happens when the SMTP dialog progresses?

SMTP is also a network protocol, and if the handshake happens slowly we know something is wrong.

A normal SMTP session is very fast and finishes within seconds or, at most, minutes.

It is by nature a queueing protocol and text based, very simple and effective at its job.

At each hop through a mail server queue or relay, a message that cannot be passed on right away is retried later, without any message or bounce going back to the sender. The recipient gets the mail once the queues in the upstream mail servers clear, and the mail is finally delivered to the inbox once the user’s mail server receives it.
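
The queue-and-retry behavior can be sketched as a schedule of retry times. The 30 minute first retry and the give-up time of about 5 days follow the guidance in the SMTP specification; the doubling and the 8 hour cap are assumptions made for this sketch, since real MTAs each use their own schedule.

```python
from datetime import timedelta

def retry_schedule(first=timedelta(minutes=30),
                   cap=timedelta(hours=8),
                   give_up=timedelta(days=5)):
    """Yield the elapsed time of each retry until the give-up time is reached."""
    elapsed = timedelta(0)
    delay = first
    while elapsed + delay <= give_up:
        elapsed += delay
        yield elapsed
        # Back off: wait twice as long next time, up to the cap.
        delay = min(delay * 2, cap)
```

Only after the last retry fails does a standards-compliant sender give up and report the failure back to the originator.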

But it is not so simple.

The user has an inbox that gets accessed by a mail client, webmail or something else that speaks IMAP to a server such as Dovecot.

The network semantics of this local delivery and access step are not so important here, since that traffic is not crossing the Internet unless the access is over a VPN.

SMTP, in contrast, is network centric, because the network is a dynamic entity and IP addresses come and go. Mail servers usually run 24/7, but any one sender's session with your mail server may last only a few minutes.

This blog addresses the network-specific aspects of the protocol, without delving into the MIME envelope or other factors like content scanning and Bayesian filtering.

How is an MTA expected to behave?

An email sender is expected to be standards compliant, play by the rules and be a well-behaved citizen of the Internet, even though the Internet is not owned by anybody and has no rules of its own.

Rules exist only for companies or organizations with accountability.

Spam control does take into account the overall behavior of an IP address, judged by the MTA it runs and how that MTA behaves over time.

There are plenty of publicly available databases where this history is kept, and they can be queried when receiving mail from the MTA in question.

What are good and bad actors?

A good mail sender is supposed to play well with the standards laid out in the SMTP RFCs.

A bad actor is usually one that does not retry mail or that sends large volumes of mail.

No legitimate mail sender will pump out mail traffic in sudden floods, so that is one telltale sign for telling good from bad.

Then there are other signs, like playing well with DNS-published policies such as SPF, DKIM and DMARC.
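
To show how an SPF record ties back to network blocks, here is a deliberately simplified sketch that checks a sender's IP against only the ip4: and ip6: mechanisms of a record. It ignores include:, a, mx, redirect and the non-default qualifiers, so it is not a full SPF evaluation. The function name and the record used below are made up for illustration.

```python
import ipaddress

def spf_ip_match(record: str, sender_ip: str) -> bool:
    """Return True if sender_ip falls in an ip4:/ip6: mechanism of the record."""
    ip = ipaddress.ip_address(sender_ip)
    for term in record.split()[1:]:  # skip the leading "v=spf1"
        term = term.lstrip("+")      # "+" is the default (pass) qualifier
        if term.startswith(("ip4:", "ip6:")):
            net = ipaddress.ip_network(term[4:], strict=False)
            if ip.version == net.version and ip in net:
                return True
    return False

# Hypothetical record as a domain might publish it in a DNS TXT entry.
EXAMPLE_RECORD = "v=spf1 ip4:192.0.2.0/24 ip6:2001:db8::/32 -all"
```

With this record, mail claiming to come from the domain but arriving from an address outside 192.0.2.0/24 and 2001:db8::/32 would fall through to the final -all and fail the check.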

What is the future?

The network characteristics of spam delivery are very well understood today, and despite this, anyone with a Gmail account knows well that spam is still not a solved problem.

Why?

This is only one of the weapons in our arsenal of spam fighting. There is still a long list of items to check before we can categorize a message as spam or as ham.

The same networks and TCP/IP routers that carry your VoIP traffic, WhatsApp messages, useful web traffic and YouTube videos also carry spam.

The raw metal has no clue what the payload means.

And it is programs, programs everywhere. So a mailbot is hard to tell apart from a human whose mail is carried by the mail server that human is connected to.

The only way to play fair is to abide by standards and not break any expectation. For instance if you send out mail, you must have a way to receive mail.

But sending bounces back is no longer acceptable today, since spammers usually use zombies (compromised machines) and fake sender addresses. So today nobody normally sends bounces in response to spam.