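Like most Linux applications, SpamAssassin is configured by editing a configuration file. The main one is /etc/mail/spamassassin/local.cf.
SpamAssassin looks for configuration files in several locations; see the SpamAssassin manual for details. The easiest file to work with is /etc/mail/spamassassin/local.cf, which you can edit to configure SpamAssassin globally. Users can override these global options and add their own in the ~/.spamassassin/user_prefs file.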
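For example, you can configure SpamAssassin to rewrite the Subject line of mail it rates as spam. The rewrite_header keyword in the configuration file controls this behavior. The word Subject after the keyword tells SpamAssassin to rewrite the Subject line. Remove the # from the following line to enable this behavior: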
# rewrite_header Subject *****SPAM*****
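The required_score keyword specifies the minimum score a message must receive before SpamAssassin considers it spam. The default is 5.00. Setting this keyword to a higher value causes SpamAssassin to mark fewer messages as spam.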
required_score 5.00
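Sometimes mail from a particular address is marked as spam when it is not, or mail from an address should never be marked as spam. Use the whitelist_from keyword to specify addresses that should not be treated as spam, and blacklist_from to specify addresses that should always be marked as spam: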
whitelist_from sams@example.com
blacklist_from *@spammer.net
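You can specify multiple addresses, separated by spaces, on a whitelist_from or blacklist_from line. Each address can contain wildcards. For example, whitelist_from *@example.com whitelists everyone who sends mail from the example.com domain. You can use multiple whitelist_from and blacklist_from lines.
The following is a sample of this configuration file: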
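The wildcard matching described above can be illustrated with a minimal Python sketch. This is not SpamAssassin's own code: the helper names glob_to_regex and matches_any are hypothetical, only the * and ? wildcards are handled, and case-insensitive matching is an assumption made for the illustration.

```python
import re

def glob_to_regex(pattern):
    """Translate a simple address glob ('*' and '?' only) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")      # any run of characters
        elif ch == "?":
            parts.append(".")       # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matches_any(address, patterns):
    """Return True if the whole address matches at least one glob pattern."""
    return any(glob_to_regex(p).fullmatch(address) for p in patterns)

whitelist = ["sams@example.com", "*@example.com"]
blacklist = ["*@spammer.net"]

print(matches_any("alice@example.com", whitelist))         # matched by *@example.com
print(matches_any("bob@spammer.net", blacklist))           # matched by *@spammer.net
print(matches_any("bob@spammer.net.evil.org", blacklist))  # not matched: the domain differs
```

Because the pattern must match the whole address, *@spammer.net does not catch a look-alike domain that merely starts with spammer.net.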
# How many hits before a message is considered spam.
required_score 7.5
# Change the subject of suspected spam
rewrite_header subject [SPAM]
# Encapsulate spam in an attachment (0=no, 1=yes, 2=safe)
report_safe 1
# Enable the Bayes system
use_bayes 1
# Enable Bayes auto-learning
bayes_auto_learn 1
# Enable or disable network checks
skip_rbl_checks 0
use_razor2 1
use_dcc 1
use_pyzor 1
# Mail using languages used in these country codes will not be marked
# as being possibly spam in a foreign language.
ok_languages all
# Mail using locales used in these country codes will not be marked
# as being possibly spam in a foreign language.
ok_locales all
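The key settings in this file are:
required_score (score threshold): Choosing this threshold usually relies on the administrator's long-term experience. The lower the threshold, the less mail gets through, and the more likely legitimate mail is to be misreported as spam; the higher the threshold, the more spam may slip through as legitimate mail. The default is 5.
rewrite_header Subject (rewrite the message subject): With this option you can have SpamAssassin tag the Subject line of suspected spam with any text you choose. The sample above uses [SPAM]; if the option is not set, SpamAssassin leaves the Subject line unchanged.
bayes_auto_learn (use auto-learning): SpamAssassin can train its Bayes database automatically by analyzing messages whose scores strongly indicate that they are either spam or non-spam.
Enable or disable network checks: These settings choose whether to use services that compare a checksum of the message against known spam. The services include Vipul's Razor 2.x, DCC, and Pyzor, controlled by use_razor2, use_dcc, and use_pyzor respectively; each works only if its client software is installed. This group also covers RBL checks, that is, whether SpamAssassin should query RBLs (DNS blacklists); setting skip_rbl_checks to 0 leaves them enabled. RBLs help catch spam that is otherwise hard to detect, but they cost time and network bandwidth and require a working DNS server.
Languages: The last two settings concern language. The first, ok_languages, lists the languages in which mail is considered acceptable; the second, ok_locales, does the same for locales. Both are set to all in the sample, so no message is penalized for its language, and it is best to leave them unchanged.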
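To make the "keyword value" layout of local.cf concrete, here is a minimal Python sketch that reads a few lines in that style into a dictionary. It is an illustration only, not SpamAssassin's parser: the function name parse_local_cf is hypothetical, and it assumes that # starts a comment and that the first whitespace separates keyword from value.

```python
SAMPLE = """\
# How many hits before a message is considered spam.
required_score 7.5
rewrite_header subject [SPAM]
use_bayes 1
whitelist_from sams@example.com
whitelist_from *@example.com
"""

def parse_local_cf(text):
    """Collect settings; a keyword that appears several times accumulates values."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)           # keyword, then the rest of the line
        keyword = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        settings.setdefault(keyword, []).append(value)
    return settings

settings = parse_local_cf(SAMPLE)
print(float(settings["required_score"][0]))
print(settings["whitelist_from"])
```

Storing every value in a list keeps repeated lines such as whitelist_from together instead of letting the last one overwrite the others.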
Once SpamAssassin is configured, you need to start it. To do so, run the following command as root:
# /etc/rc.d/init.d/spamassassin start
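Configuring SpamAssassin to work with Sendmail
Now that SpamAssassin is up and running, it needs to be set up to work with a mail transfer agent (MTA). This section covers the setup for Sendmail, because Sendmail is the most widely used MTA in Linux environments.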
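Edit the /etc/mail/spamassassin/spamc.cf file and add the following lines. They form a procmail recipe that filters each message through spamc, so the action line must begin with a pipe character:
:0fw
| /usr/bin/spamc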
Sendmail is now set up to use SpamAssassin to score and filter incoming spam.
Running SpamAssassin
With spamd running, you can send a string to spamc to see how it works:
$ echo "hi there" | spamc
X-Spam-Checker-Version: SpamAssassin 3.3.2-r929478 (2010-03-31) on sobell.com
X-Spam-Flag: YES
X-Spam-Level: ******
X-Spam-Status: Yes, score=6.9 required=5.0 tests=EMPTY_MESSAGE,MISSING_DATE,
MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,
NO_RELAYS autolearn=no version=3.3.2-r929478
X-Spam-Report:
* -0.0 NO_RELAYS Informational: message was not relayed via SMTP
* 1.2 MISSING_HEADERS Missing To: header
* 0.1 MISSING_MID Missing Message-Id: header
* 1.8 MISSING_SUBJECT Missing Subject: header
* 2.3 EMPTY_MESSAGE Message appears to have no textual parts and no
* Subject: text
* -0.0 NO_RECEIVED Informational: message has no Received headers
* 1.4 MISSING_DATE Missing Date: header
* 0.0 NO_HEADERS_MESSAGE Message appears to be missing most RFC-822
* headers
hi there
Subject: [SPAM]
X-Spam-Prev-Subject: (nonexistent)
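The X-Spam-Status line begins with Yes, meaning the message was judged to be spam. SpamAssassin uses a scoring system that assigns each message a number of hits. If the message's hits reach the required number (5.0 by default), SpamAssassin marks it as spam. Here the string scored 6.9; it failed for several reasons, all of which are listed on the status line.
The following listing shows a real piece of spam processed by SpamAssassin. It received 24.5 hits, so it is almost certainly spam.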
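As an illustration of how the status line can be read programmatically, the following Python sketch pulls the verdict, score, threshold, and test names out of an X-Spam-Status value. The function name parse_spam_status is hypothetical, and the regular expressions are assumptions based only on the two header layouts shown in this section (score= in newer versions, hits= in older ones).

```python
import re

# The header value from the spamc output above, with its folded lines.
STATUS = (
    "Yes, score=6.9 required=5.0 tests=EMPTY_MESSAGE,MISSING_DATE,\n"
    "\tMISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,\n"
    "\tNO_RELAYS autolearn=no version=3.3.2-r929478"
)

NUMBER = r"(-?\d+(?:\.\d+)?)"

def parse_spam_status(value):
    """Extract verdict, score, threshold and test names from the header value."""
    flag = value.split(",", 1)[0].strip()
    score = re.search(r"(?:score|hits)=" + NUMBER, value)
    required = re.search(r"required=" + NUMBER, value)
    tests = re.search(r"tests=(.*?)\s+(?:autolearn|version)=", value, re.S)
    names = [t for t in re.split(r"[,\s]+", tests.group(1)) if t] if tests else []
    return {
        "is_spam": flag.lower() == "yes",
        "score": float(score.group(1)) if score else None,
        "required": float(required.group(1)) if required else None,
        "tests": names,
    }

result = parse_spam_status(STATUS)
print(result["is_spam"], result["score"], result["required"])
print(result["tests"])
```

A mail filter or log script could use the parsed score to sort messages by how far they exceed the threshold rather than relying on the Yes/No verdict alone.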
X-Spam-Status: Yes, hits=24.5 required=5.0
tests=DATE_IN_FUTURE_06_12,INVALID_DATE_TZ_ABSURD,
MSGID_OE_SPAM_4ZERO,MSGID_OUTLOOK_TIME,
MSGID_SPAMSIGN_ZEROES,RCVD_IN_DSBL,RCVD_IN_NJABL,
RCVD_IN_UNCONFIRMED_DSBL,REMOVE_PAGE,VACATION_SCAM,
X_NJABL_OPEN_PROXY
version=2.55
X-Spam-Level: ************************
X-Spam-Checker-Version: SpamAssassin 2.55 (1.174.2.19-2003-05-19-exp)
X-Spam-Report: This mail is probably spam. The original message has been attached
along with this report, so you can recognize or block similar unwanted
mail in future. See http://spamassassin.org/tag/ for more details.
Content preview: Paradise SEX Island Awaits! Tropical 1 week vacations
where anything goes! We have lots of WOMEN, SEX, ALCOHOL, ETC! Every
man's dream awaits on this island of pleasure. [...]
Content analysis details: (24.50 points, 5 required)
MSGID_SPAMSIGN_ZEROES (4.3 points) Message-Id generated by spam tool (zeroes variant)
INVALID_DATE_TZ_ABSURD (4.3 points) Invalid Date: header (timezone does not exist)
MSGID_OE_SPAM_4ZERO (3.5 points) Message-Id generated by spam tool (4-zeroes variant)
VACATION_SCAM (1.9 points) BODY: Vacation Offers
REMOVE_PAGE (0.3 points) URI: URL of page called "remove"
MSGID_OUTLOOK_TIME (4.4 points) Message-Id is fake (in Outlook Express format)
DATE_IN_FUTURE_06_12 (1.3 points) Date: is 6 to 12 hours after Received: date
RCVD_IN_NJABL (0.9 points) RBL: Received via a relay in dnsbl.njabl.org
[RBL check: found 94.99.190.200.dnsbl.njabl.org.]
RCVD_IN_UNCONFIRMED_DSBL (0.5 points) RBL: Received via a relay in unconfirmed.dsbl.org
[RBL check: found 94.99.190.200.unconfirmed.dsbl.org.]
X_NJABL_OPEN_PROXY (0.5 points) RBL: NJABL: sender is proxy/relay/formmail/spam-source
RCVD_IN_DSBL (2.6 points) RBL: Received via a relay in list.dsbl.org
[RBL check: found 211.157.63.200.list.dsbl.org.]
X-Spam-Flag: YES
Subject: [SPAM] re: statement
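The total in the report above is simply the sum of the points of the individual rules. The following Python sketch checks that arithmetic; the rule names and points are copied from the report, while the REPORT string layout and the regular expression are assumptions made for the illustration.

```python
import re

# Rule names and points copied from the "Content analysis details" section above.
REPORT = """\
MSGID_SPAMSIGN_ZEROES (4.3 points)
INVALID_DATE_TZ_ABSURD (4.3 points)
MSGID_OE_SPAM_4ZERO (3.5 points)
VACATION_SCAM (1.9 points)
REMOVE_PAGE (0.3 points)
MSGID_OUTLOOK_TIME (4.4 points)
DATE_IN_FUTURE_06_12 (1.3 points)
RCVD_IN_NJABL (0.9 points)
RCVD_IN_UNCONFIRMED_DSBL (0.5 points)
X_NJABL_OPEN_PROXY (0.5 points)
RCVD_IN_DSBL (2.6 points)
"""

POINTS = re.compile(r"^\s*([A-Z0-9_]+)\s+\((-?\d+(?:\.\d+)?) points\)", re.M)

rules = {name: float(points) for name, points in POINTS.findall(REPORT)}
total = round(sum(rules.values()), 1)   # round away floating-point noise
required = 5.0

print(len(rules), "rules, total", total)
print("spam" if total >= required else "not spam")
```

The eleven rules add up to the 24.5 points stated in the report, well above the required 5.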
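Spam blacklists
Spammers typically send their messages from certain domains and user accounts. Fortunately, SpamAssassin has a way to deal with known spammers, and setting up a blacklist is simple: add blacklist entries to the configuration file /etc/mail/spamassassin/local.cf. Entries are written as follows: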
blacklist_from sample_email@sampledomain.com
blacklist_from *@sampledomain.com
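As these lines show, you can blacklist either a specific e-mail address (such as sample_email@sampledomain.com) or an entire domain (such as *@sampledomain.com). In addition, to take advantage of spam-filtering information shared on the network, you can download a current blacklist from http://www.sa-blacklist.stearns.org/sa-blacklist/sa-blacklist.current. That list is very large, however, and may not suit your needs well, so filter and screen it carefully before adding it to your own blacklist.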
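One way to screen a downloaded list before merging it is sketched below in Python. It assumes the downloaded file uses the same blacklist_from syntax shown above, which the source does not confirm; the function name filter_blacklist, the sample lines, and the set of trusted domains are all hypothetical.

```python
def filter_blacklist(lines, trusted_domains):
    """Keep blacklist_from entries whose domain is not in trusted_domains."""
    kept = []
    for raw in lines:
        fields = raw.split()
        if not fields or fields[0] != "blacklist_from":
            continue                      # skip comments, blanks, other keywords
        for pattern in fields[1:]:        # a line may carry several addresses
            domain = pattern.rsplit("@", 1)[-1].lower()
            if domain in trusted_domains:
                continue                  # never blacklist a domain we rely on
            kept.append("blacklist_from " + pattern)
    return kept

downloaded = [
    "# hypothetical sample of a downloaded list",
    "blacklist_from *@spammer.net",
    "blacklist_from *@example.com sample_email@sampledomain.com",
    "",
]

for entry in filter_blacklist(downloaded, {"example.com"}):
    print(entry)
```

The surviving entries can then be reviewed by hand and appended to local.cf.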