| SpamAssassin | |
|---|---|
| Developed by | Apache Software Foundation [1] |
| Latest release | 3.2.4 / January 8, 2008 |
| Written in | Perl |
| OS | Cross-platform |
| Genre | Email spam filter |
| License | Apache License 2.0 |
| Website | http://spamassassin.apache.org |
SpamAssassin is a computer program, released under the Apache License 2.0, used for e-mail spam filtering. It is based on content-matching rules and also supports DNS-based, checksum-based and statistical filtering, backed by external programs and online databases.
SpamAssassin is regarded by some as one of the most effective spam filters, especially when used in combination with spam databases. While simple text matching alone may be enough to classify the majority of incoming mail correctly for most users, combining word and symbol comparisons with checks on the sources of spam can be more complex than the average user can manage. For instance, graphic-only spam messages have no text to compare, so checking the sender's originating mail server and any included links against databases of known e-mail abusers helps keep unwanted mail from reaching the end user.
SpamAssassin was created by Justin Mason, who had maintained a number of patches against an earlier program named filter.plx by Mark Jeftovic, which in turn was begun in August 1997. Mason rewrote all of Jeftovic's code from scratch and uploaded the resulting codebase to SourceForge.net on April 20, 2001.
SpamAssassin is a Perl-based application (Mail::SpamAssassin in CPAN) which is usually used to filter all incoming mail for one or several users. It can be run as a standalone application or as a client (spamc) that communicates with a daemon (spamd). The latter mode of operation has performance benefits, but under certain circumstances may introduce additional security risks.
Typically either variant of the application is set up in a generic mail filter program, or it is called directly from a mail user agent that supports this, whenever new mail arrives. Mail filter programs such as procmail can be made to pipe all incoming mail through SpamAssassin with an adjustment to the user's .procmailrc file.
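The piping arrangement described above can be sketched in Python. This is only an illustration, assuming that `spamc` is installed, that a `spamd` daemon is reachable, and that `spamc` reads one message on standard input and writes the processed message to standard output. The function name `filter_message` and its `command` parameter are hypothetical and exist only for this example.

```python
import subprocess
import sys


def filter_message(raw_message: bytes, command=("spamc",)) -> bytes:
    """Pipe one raw message through a filter command and return its output.

    `command` defaults to the spamc client; it is a parameter so that a
    different filter (or a stand-in for testing) can be substituted.
    """
    result = subprocess.run(
        list(command),
        input=raw_message,       # message goes to the filter's stdin
        stdout=subprocess.PIPE,  # processed message comes back on stdout
        check=True,              # raise if the filter exits with an error
    )
    return result.stdout


# Example use from a delivery script (requires spamc and a running spamd):
#   processed = filter_message(sys.stdin.buffer.read())
#   sys.stdout.buffer.write(processed)
```

A mail filter such as procmail does the same thing natively; the sketch only makes the data flow explicit.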
SpamAssassin comes with a large set of rules which are applied to determine whether an email is spam or not. To decide, specific fields within the email header and the email body are typically searched for certain regular expressions, and if these expressions match, the email is assigned a certain score, depending on the test, and several (customizable) headers are added to the mail. The total score resulting from all tests or other criteria can then be used by the end user or by the ISP to set the conditions under which email is moved to a separate spam folder, deleted, flagged etc.
Each test has a label and a description. The label is usually an all upper case identifier separated with underscores, such as "LIMITED_TIME_ONLY", with the description for that label being "Offers a limited time offer". A mail that fails that test (in this case, contains certain variants of the "limited time only" phrase) might be assigned a score of +0.3. With a spam threshold of 5 (default as of SpamAssassin version 2.55), several other tests would usually have to fail for the mail to be classified as spam. On the other hand, some tests, such as those for invalid message IDs or years, result in a very high score being assigned, where even a single test can almost put a mail "over the edge".
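The scoring scheme can be illustrated with a small Python sketch. It is not SpamAssassin's rule engine: only the `LIMITED_TIME_ONLY` label, its description, the +0.3 score and the threshold of 5 come from the text above. The regular expressions and the two `EXAMPLE_*` rules are invented for illustration.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    label: str
    description: str
    pattern: re.Pattern
    score: float


# Only the first label, description and score come from the article;
# the patterns and the EXAMPLE_* rules are hypothetical.
RULES = [
    Rule("LIMITED_TIME_ONLY", "Offers a limited time offer",
         re.compile(r"limited[\s-]+time[\s-]+only", re.IGNORECASE), 0.3),
    Rule("EXAMPLE_FREE_MONEY", "Hypothetical: mentions free money",
         re.compile(r"free\s+money", re.IGNORECASE), 2.5),
    Rule("EXAMPLE_ACT_NOW", "Hypothetical: urges immediate action",
         re.compile(r"act\s+now", re.IGNORECASE), 2.4),
]

REQUIRED_SCORE = 5.0  # spam threshold mentioned in the text


def score_message(text, rules=RULES):
    """Return (total score, labels of the rules that matched)."""
    hits = [rule for rule in rules if rule.pattern.search(text)]
    return sum(rule.score for rule in hits), [rule.label for rule in hits]


def is_spam(text, required_score=REQUIRED_SCORE):
    """Treat a message as spam when its total exceeds the threshold."""
    total, _ = score_message(text)
    return total > required_score
```

With these made-up rules, a message that only says "limited time only" scores 0.3 and stays well under the threshold, while one that trips all three rules crosses it.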
When a mail's total score is higher than the "required_score" setting in SpamAssassin's configuration, the mail is treated as spam and rewritten according to several options. In the default configuration, the content of the mail is appended as a MIME attachment, with a brief excerpt in the message body, and a description of the tests which resulted in the mail being classified as spam. If the score is lower than the defined settings, by default the information about the passed tests and total score is still added to the email headers and can be used in post-processing for less severe actions, such as tagging the mail as suspicious.
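Post-processing on the added headers might look like the following sketch. It assumes a status header of the form `X-Spam-Status: No, score=3.4 required=5.0 tests=RULE_A,RULE_B autolearn=no version=3.2.4`. The exact header layout depends on the SpamAssassin version and configuration, so treat the pattern as an assumption to check against real mail. The `suspicious_from` cut-off and the function names are hypothetical.

```python
import re
from email import message_from_string

# Assumed layout of the status header; adjust to what your installation emits.
STATUS_RE = re.compile(
    r"(?P<flag>Yes|No),\s*score=(?P<score>-?[\d.]+)\s+"
    r"required=(?P<required>-?[\d.]+)\s+tests=(?P<tests>.*?)(?:\s+\w+=.*)?$"
)


def parse_spam_status(raw_message):
    """Extract verdict, score, threshold and test labels, or None."""
    header = message_from_string(raw_message).get("X-Spam-Status")
    if header is None:
        return None
    header = " ".join(header.split())  # unfold a header folded over lines
    match = STATUS_RE.match(header)
    if match is None:
        return None
    tests = [t.strip() for t in match["tests"].split(",") if t.strip()]
    return {
        "is_spam": match["flag"] == "Yes",
        "score": float(match["score"]),
        "required": float(match["required"]),
        "tests": tests,
    }


def classify(status, suspicious_from=3.0):
    """Map a parsed status to 'spam', 'suspicious' or 'ham'.

    `suspicious_from` is a made-up cut-off for the milder action.
    """
    if status["is_spam"]:
        return "spam"
    return "suspicious" if status["score"] >= suspicious_from else "ham"
```

A mail client or delivery script could use the middle category to tag a message rather than move it to the spam folder.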
The user can customize these filters using a file "user_prefs" in their home directory or a database. Within this file, they can specify individuals whose emails are never considered spam, or change the scores for certain rules. The user can also define a list of languages which they want to receive mail in, and SpamAssassin then assigns a higher score to all mails that appear to be written in another language. This can be very useful to users receiving a lot of foreign spam but never actually corresponding with people in that language.
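As a rough illustration of what such a preferences file contains, the Python helper below renders a few settings as text. The directive names `required_score`, `whitelist_from`, `score` and `ok_languages` are given as they are commonly documented for SpamAssassin 3.x. Their availability can depend on version and enabled plugins, so verify them against your installation's configuration documentation. The helper itself (`render_user_prefs`) is hypothetical.

```python
def render_user_prefs(whitelist=(), scores=None, languages=(),
                      required_score=None) -> str:
    """Render a minimal user_prefs-style text from a few settings."""
    lines = []
    if required_score is not None:
        # Overall spam threshold
        lines.append(f"required_score {required_score}")
    for address in whitelist:
        # Senders whose mail should never be treated as spam
        lines.append(f"whitelist_from {address}")
    for label, value in (scores or {}).items():
        # Per-rule score overrides
        lines.append(f"score {label} {value}")
    if languages:
        # Languages the user expects to receive mail in
        lines.append("ok_languages " + " ".join(languages))
    return "\n".join(lines) + "\n"
```

For example, `render_user_prefs(whitelist=["friend@example.org"], scores={"LIMITED_TIME_ONLY": 0.1}, languages=["en", "de"])` yields three lines, one per setting.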
SpamAssassin also supports, among others, Distributed Checksum Clearinghouse (DCC), Vipul's Razor, Hashcash and Sender Policy Framework as a means to tell 'ham' from 'spam'.
More methods can be added reasonably easily by writing a Perl plug-in for SpamAssassin.
SpamAssassin by default tries to reinforce its own rules through Bayesian filtering, but Bayesian learning is most effective with actual user input. Typically, the user is expected to "feed" example spam mails and example "ham" (useful) mails to the filter, which can then learn the difference between the two. For this purpose, SpamAssassin provides the command-line tool sa-learn, which can be instructed to learn a single mail or an entire mailbox as either ham or spam.
Typically, the user will move unrecognized spam to a separate folder for a while, and then run sa-learn on the folder of non-spam and on the folder of spam separately. Alternatively, if the mail user agent supports it, sa-learn can be called for individual emails. Regardless of the method used to perform the learning, SpamAssassin's Bayesian test will subsequently assign a higher score to e-mails that are similar to previously received spam (or, more precisely, to those emails that are different from non-spam in ways similar to previously received spam e-mails).
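The idea of learning from examples can be shown with a toy classifier. This is a plain naive Bayes sketch with add-one smoothing, not the combining method SpamAssassin actually uses, and the class and method names are invented for the example.

```python
import math
import re
from collections import Counter


class ToyBayes:
    """Toy naive Bayes filter trained on example spam and ham texts."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Each distinct word counts once per message
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def learn(self, text, label):
        """Record one example; label is 'spam' or 'ham'."""
        self.counts[label].update(self.tokens(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | tokens) from the learned examples."""
        n_spam, n_ham = self.messages["spam"], self.messages["ham"]
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for token in self.tokens(text):
            # Smoothed share of spam and ham messages containing the token
            p_spam = (self.counts["spam"][token] + 1) / (n_spam + 2)
            p_ham = (self.counts["ham"][token] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))
```

After a few calls to `learn` with spam and ham examples, words seen mostly in spam push the probability up and words seen mostly in ham push it down, which mirrors the "different from non-spam in ways similar to spam" behaviour described above.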
SpamAssassin is free/open source software, licensed under the Apache License 2.0. Versions prior to 3.0 are dual-licensed under the Artistic License and the GNU General Public License.
sa-compile is a utility distributed with SpamAssassin as of version 3.2.0. It compiles a SpamAssassin ruleset into a deterministic finite automaton that allows SpamAssassin to use processor power more efficiently.
Most implementations of SpamAssassin will trigger on the GTUBE (Generic Test for Unsolicited Bulk Email), a 68 byte string not unlike the antivirus EICAR test file. If this string is inserted in an RFC 2822 formatted message and passed through the SpamAssassin engine, SpamAssassin will trigger with a weight of 1000.
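A test message of this kind can be built with Python's standard `email` package, as in the sketch below. The GTUBE string is the published 68 byte test string. The addresses, subject and function name are placeholders chosen for the example.

```python
from email.message import EmailMessage

# The GTUBE test string (68 bytes); it must appear unbroken in the body.
GTUBE = "XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X"


def build_gtube_message(sender="sender@example.org",
                        recipient="recipient@example.org"):
    """Build a minimal message whose body is the GTUBE string."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "GTUBE test"
    message.set_content(GTUBE + "\n")
    return message
```

Writing `build_gtube_message().as_string()` to a file and feeding it to the filter is a harmless way to check that spam handling is wired up end to end.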
The following free/open source applications have support for SpamAssassin:
SpamAssassin has also been used in many commercial products including: