
Fighting Spam Out in the Open with Open Source Software

Law firms face two major challenges in dealing with spam:  keeping it from disrupting workflow, and making sure individual users’ legitimate messages are reliably delivered rather than filtered out.  Open source software (OSS) is a key component in the ongoing war against spam.

Methods of Controlling Spam
The most basic way to block spam is to reject all e-mail unless the sender is explicitly allowed.  Ideally, though, a spam filter would simply deny connections from known spam sources.  This is usually attempted with blacklists, reverse-name lookups and newer technologies that leverage DNS.  Another way to control spam is to make it “expensive” for the sender, either by charging for sent e-mail or by making it difficult to send large volumes of e-mail.  These and other options are available as onsite products or as services purchased from outsourced providers.

Open Source Software
OSS projects are at the leading edge of technologies used to stop spam, and many of the newest techniques for filtering spam originate in open source projects.  TMDA (Tagged Message Delivery Agent), SPF (Sender Policy Framework), Hashcash (Sender Pay), Bayesian and Markovian filtering, and Razor (a distributed, collaborative spam-filtering network) are all examples of different OSS methods to block spam.  Many commercial spam filter products are also based on open source projects.  For example, the commercial McAfee SpamKiller product is based on the OSS SpamAssassin project, as are many other software- and hardware-based appliances.

OSS e-mail gateways comprise about two-thirds of all publicly accessible e-mail servers on the Internet.  For this reason, when selecting a spam filter, it’s important to choose a platform that can take advantage of standards-based filtering features existing in OSS software.  Some methods of blocking spam require both sender and recipient to be compatible, such as with Hashcash.  If your spam filter is a proprietary product, make sure that it’s capable of taking advantage of these new OSS technologies and features.

Tagged Message Delivery Agent
TMDA is an example of spam filtering at its most basic level.  TMDA denies all e-mail until it is explicitly allowed.  TMDA implements a “challenge / response” approach to deliver e-mail to local Linux users.  Although it is one of the best spam prevention methods available, this approach isn’t very applicable to a large law firm; it would, however, be an excellent spam control solution for smaller firms whose employees are POP3 users.  TMDA intercepts the delivery process after the MTA (message transfer agent) receives an e-mail and before it’s delivered to the user mailbox.  The TMDA agent immediately intervenes and autoresponds with a configurable challenge, requesting a response to validate that the e-mail originated from the sender.  If the sender responds to the TMDA request, the e-mail address is added to a list of approved senders, the original e-mail is delivered to the recipient and future messages from the address will be accepted.  If the e-mail originated from an automated spambot, there will not be a reply to the TMDA request; the message will be deleted after a configurable quarantine period without the intended recipient’s knowledge.
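
The challenge/response flow described above can be sketched in a few lines of Python.  This is only a toy model under stated assumptions, not TMDA's actual code or configuration: the class name, method names and return values are all hypothetical, and a real agent would also send the challenge message and persist its lists to disk.

```python
from dataclasses import dataclass, field


@dataclass
class ChallengeResponseFilter:
    """Toy challenge/response filter (hypothetical; not TMDA's real API)."""
    approved: set = field(default_factory=set)    # senders who answered a challenge
    pending: dict = field(default_factory=dict)   # sender -> [(arrival_time, message)]

    def receive(self, sender: str, message: str, now: float) -> str:
        """Return 'deliver', 'challenge' (first contact) or 'hold'."""
        sender = sender.lower()
        if sender in self.approved:
            return "deliver"
        first_contact = sender not in self.pending
        self.pending.setdefault(sender, []).append((now, message))
        return "challenge" if first_contact else "hold"

    def confirm(self, sender: str) -> list:
        """Sender replied to the challenge: approve and release held mail."""
        sender = sender.lower()
        self.approved.add(sender)
        return [msg for _, msg in self.pending.pop(sender, [])]

    def purge(self, now: float, max_age: float) -> int:
        """Silently drop held messages older than the quarantine period."""
        dropped = 0
        for sender in list(self.pending):
            kept = [(t, m) for t, m in self.pending[sender] if now - t < max_age]
            dropped += len(self.pending[sender]) - len(kept)
            if kept:
                self.pending[sender] = kept
            else:
                del self.pending[sender]
        return dropped
```

A human sender who answers the challenge moves from pending to approved through confirm(); a spambot never answers, so its held messages simply age out in purge().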

Sender Policy Framework
All spam filter products have the option of using blacklists and other methods to try to drop connection attempts from known spam sources, conserving bandwidth and MTA CPU.  In addition to the commonly used DNSBLs (DNS-based blacklists) such as SpamHaus, SpamCop and Spews, a new standard is emerging for verifying senders.  SPF (Sender Policy Framework) is a way to reject mail from e-mail servers that are not authorized to send for the domain in the sender address (strictly, the envelope sender given in the SMTP MAIL FROM command rather than the From: header field).  This prevents spammers from “spoofing” or forging the source domain in the message.  SPF requires an additional “TXT” record to be added to the firm DNS zone file (e.g., mail.domain.com  IN TXT "v=spf1 a -all").  Before an SPF-enabled e-mail server accepts a message, it will do a DNS lookup and check for an SPF record to ensure the sending server is permitted to send e-mail for the domain.  If an SPF record exists and the sending server’s IP address is not listed in it as an allowed e-mail server for the domain, the connection is dropped.  AOL has adopted SPF for its mail system, and other major ISPs and e-mail providers are poised to adopt SPF as the standard matures.
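
To make the SPF check more concrete, here is a minimal Python sketch of how a receiving server might evaluate a record such as "v=spf1 a -all".  It is an illustration under assumptions, not a full SPF implementation: it handles only the a, ip4: and all mechanisms, ignores everything else, and takes the domain's A-record addresses as a parameter (a_records, a hypothetical name) instead of performing real DNS lookups.

```python
import ipaddress


def check_spf(record: str, sender_ip: str, a_records=()) -> str:
    """Evaluate a tiny SPF subset: 'a', 'ip4:' and 'all' mechanisms only.

    a_records stands in for the DNS lookup of the domain's A records
    (an assumption of this sketch; no network access is performed).
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"                      # no usable SPF record
    ip = ipaddress.ip_address(sender_ip)
    results = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    for term in terms[1:]:
        qualifier = "+"                    # default qualifier is "pass"
        if term[0] in results:
            qualifier, term = term[0], term[1:]
        mech = term.lower()
        if mech == "all":
            matched = True
        elif mech == "a":
            matched = any(ip == ipaddress.ip_address(a) for a in a_records)
        elif mech.startswith("ip4:"):
            matched = ip in ipaddress.ip_network(mech[4:], strict=False)
        else:
            continue                       # unsupported mechanism: skipped here
        if matched:
            return results[qualifier]      # first matching mechanism wins
    return "neutral"                       # nothing matched
```

With the example record from the text, a message arriving from one of the domain's own A-record addresses evaluates to "pass", while any other address falls through to -all and evaluates to "fail".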

Microsoft also embraced and attempted to extend SPF with its own version called “Sender-ID.”  Sender-ID gets its name from the conceptual similarity with Caller-ID in the voice world.  Sender-ID failed to win approval from the recently disbanded IETF MARID (MTA Authorization Records in DNS) committee.  The reason for the failure is that Microsoft had applied for patents on some of the technology and added licensing requirements to Sender-ID that were not compatible with OSS licensing and standards.  There were also issues with the “SenderID” trademark, which is held by another company that has a similar product.

Hashcash
Another evolving OSS method for increasing reliability of sent e-mail is Hashcash, which attempts to make it prohibitively “expensive” for a spammer to send out thousands or millions of messages from a server.  Hashcash requires the sending e-mail server to perform a lengthy computation based on the destination address in the message and to attach the resulting “hash” or “digital stamp” to the message header.

When the message is sent to a system that is Hashcash-aware, the system calculates and verifies the accompanying hash, matches the recipient addresses and, on a match, the spam filter lowers the spam “score” of the message to ensure or increase the likelihood of passage.  The underlying concept to Hashcash is that a server used for the purposes of spamming will not have enough CPU time to make the computations for millions of messages, and the number of spam messages will dramatically decrease. 
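
The mint-and-verify cycle can be sketched in Python as follows.  This is a simplified illustration modeled on the Hashcash version 1 stamp layout (SHA-1 with a required number of leading zero bits); the function names, the fixed placeholder date and the default of 20 bits are assumptions of the sketch, and a real verifier would also check the stamp's date and reject stamps it has already seen.

```python
import hashlib
import secrets
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the start of a hash digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mint(resource: str, bits: int = 20, date: str = "040101") -> str:
    """Sender side: search for a counter that gives the stamp's SHA-1
    hash at least `bits` leading zero bits (the expensive step)."""
    rand = secrets.token_hex(8)                      # random salt
    prefix = f"1:{bits}:{date}:{resource}::{rand}:"
    for counter in count():
        stamp = f"{prefix}{counter:x}"
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp


def verify(stamp: str, recipient: str, min_bits: int = 20) -> bool:
    """Recipient side: one hash checks the work and the recipient address."""
    fields = stamp.split(":")
    if len(fields) != 7 or fields[0] != "1":
        return False
    try:
        claimed = int(fields[1])
    except ValueError:
        return False
    if fields[3] != recipient or claimed < min_bits:
        return False
    digest = hashlib.sha1(stamp.encode()).digest()
    return leading_zero_bits(digest) >= claimed
```

The asymmetry is the point: minting a 20-bit stamp takes on the order of a million hash attempts on average, while verifying it takes a single hash, so the cost falls almost entirely on the sender.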

Hashcash requires more processing power for an e-mail server because of the additional computations that are required for each outgoing e-mail.  The delays are minimal for a typical e-mail server (e.g., a 3GHz mail gateway server delays each e-mail for about a half second while the hash is calculated). 

Camram is an example of a spam-filtering system built around the Hashcash concept, but Hashcash can also be “plugged in” to other e-mail systems like SpamAssassin and TMDA.  Hashcash is a relatively new technology, but it could be implemented in a law firm to reduce the likelihood of outgoing e-mail being flagged as spam at sites that have already deployed Hashcash.

Products Worth Checking Out

SpamBayes.  This is an excellent example of a successful OSS Bayesian-based filter.  SpamBayes is available as an Outlook clientside plugin that can “learn” the spam preferences of an individual user.  Each incoming e-mail is evaluated as spam or not spam, based on previous user actions.  The unique thing about SpamBayes is that e-mail is never blocked or deleted.  It will only redirect e-mail to folders that the user has created to keep spam out of the main inbox.  The SpamBayes Outlook plugin installation requires the user to save spam in a separate folder and provides a wizard that will “train” on the sample spam set.

SpamAssassin.  This is the premier open source spam-filtering project, widely used by ISPs.  It’s embedded into many commercial products and is incorporated in many different “systems” like MailScanner and Amavisd.

SpamAssassin does it all.  This is a rule-based spam scanner that incorporates a Bayesian database for learning spam preferences and supports SPF and Hashcash.  SpamAssassin is generally used as a Linux-based gateway e-mail scanner, although a commercial version (McAfee SpamKiller) is available for personal use at the desktop. SpamAssassin evaluates each message and compares words and phrases with its internal ruleset, and it applies a weighted score to each element it finds that is typical for spam.  The spam score is cumulative and is compared against configurable thresholds for different levels of spam.  Low scores can be optionally flagged with identifiers in the subject line for clientside rule-based routing, and higher scored spam can be deleted if desired.
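
The cumulative scoring idea can be illustrated with a short Python sketch.  It is not SpamAssassin's code or rule syntax: the rule names, patterns, weights and the two thresholds below are all made up for illustration, and the real ruleset is far larger and also examines headers, Bayesian results and network tests.

```python
import re

# Hypothetical rules: (name, pattern, weight). Not SpamAssassin's real ruleset.
RULES = [
    ("FREE_MONEY", re.compile(r"free money", re.I), 2.5),
    ("CLICK_HERE", re.compile(r"click here", re.I), 1.5),
    ("ALL_CAPS_SUBJECT", re.compile(r"^Subject: [^a-z]+$", re.M), 1.0),
]


def score_message(text: str):
    """Return (total score, names of the rules that matched)."""
    hits = [(name, weight) for name, pattern, weight in RULES
            if pattern.search(text)]
    return sum(weight for _, weight in hits), [name for name, _ in hits]


def classify(score: float, tag_at: float = 5.0, delete_at: float = 10.0) -> str:
    """Compare the cumulative score against two illustrative thresholds."""
    if score >= delete_at:
        return "delete"
    if score >= tag_at:
        return "tag"      # e.g., flag the subject line for clientside rules
    return "deliver"
```

For example, a message with the subject line "FREE MONEY NOW" whose body says "Click here to claim your free money." matches all three sample rules, scores 5.0 and is tagged, while an ordinary message matches nothing and is delivered untouched.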

ILTA’s Linux Peer Group created the “Vegan” white paper, which documents the process required to create a SpamAssassin-based spam-filtering system.  Vegan employs the MailScanner package as a process controller that moves the e-mail through the filtering process, first allowing a virus scan to occur and then the spam-scanning process.  Vegan is excellent for a law firm that wants to keep full control of its e-mail with maximum flexibility and cost control.

With so many good open source options from which to choose, you’re well-armed to fight the war on spam.

Credits and Sources
http://tmda.net/
http://spf.pobox.com/
http://en.wikipedia.org/wiki/Sender_Policy_Framework
http://hashcash.org
http://camram.org/
http://spambayes.sourceforge.net/windows.html
http://spamassassin.org
http://peertopeer.org/sig/vegan.html

About our author . . .

David Nevala is the Information Systems Administrator at Lukins & Annis, P.S. and is the current ILTA Linux Peer Group VP.  He is the leader of the Vegan spam and virus filter project and an active participant in the open-source Acrophobia PDF printer project.  David can be contacted at 509.623.2017 or dnevala@lukins.com.
