![Page 1: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/1.jpg)
Still Curious about Anti-Spam Testing?
Here’s a Second Opinion
David Koconis, Ph.D.
Senior Technical Advisor, ICSA Labs
01 October 2010
Copyright 2009 Cybertrust. All Rights Reserved.
![Page 2: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/2.jpg)
Outline
Introduction
Anti-spam testing synopsis
Components of meaningful testing
Anti-Spam Testing Methodology
– Legitimate email corpus
– Store-and-forward versus live testing
– Observations
Comparison to VBSpam
Conclusion & Future
![Page 3: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/3.jpg)
Introduction
ICSA Labs and me
Enterprise anti-spam products
What was the original diagnosis?
– Comparative
– Unbiased
– Real email in real-time
– Statistically relevant (i.e., large corpus)
– Explain what was done
![Page 4: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/4.jpg)
Definitions
Effectiveness
– Percent of all spam messages identified as such and not delivered
False Positive
– Legitimate email misclassified as spam and not promptly delivered
False Positive Rate
– Percent of all legitimate messages not promptly delivered
Corpus (Corpora)
– Collection of email messages typically having some property in common
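The two percentages defined above can be written out as a short calculation. The following is a minimal sketch, not the ICSA Labs tooling; the `Counts` container and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical tallies for one product over one test run."""
    spam_total: int          # spam messages sent to the product
    spam_blocked: int        # spam identified as such and not delivered
    ham_total: int           # legitimate messages sent to the product
    ham_not_delivered: int   # legitimate messages not promptly delivered


def effectiveness(c: Counts) -> float:
    """Percent of all spam messages identified as such and not delivered."""
    if c.spam_total == 0:
        raise ValueError("test set contains no spam")
    return 100.0 * c.spam_blocked / c.spam_total


def false_positive_rate(c: Counts) -> float:
    """Percent of all legitimate messages not promptly delivered."""
    if c.ham_total == 0:
        raise ValueError("test set contains no legitimate email")
    return 100.0 * c.ham_not_delivered / c.ham_total
```

Note that the two rates use different denominators: effectiveness is measured against the spam corpus only, and the false positive rate against the legitimate corpus only.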
![Page 5: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/5.jpg)
Anti-spam Testing Synopsis
Number of spam messages on the Internet far exceeds number of legitimate messages
Want solution that
– blocks every spam message (100% effective)
– promptly delivers every legitimate email (0 false positives)
But Nobody’s perfect
Legitimate email does get blocked/delayed
– End users get mad, Support cost, Missed opportunity
Spam gets delivered
– Storage and time wasted, possible malicious content
Which solution works best?
How can solutions be improved?
![Page 6: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/6.jpg)
What is needed for meaningful anti-spam testing?
Lots of appropriate spam
– Continually updated corpus
– Representative of what is seen on the Internet
Lots of legitimate email
– Personal and subscription lists or newsletters
– If possible, not proprietary
Test methodology that mirrors deployment
– Products under test able to query Internet resources
»Protection updates
»DNS, RBL, SPF, etc.
Detailed logging and dispute resolution
![Page 7: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/7.jpg)
Lots of Spam - ICSA Labs Corpus
Spam Collector
– Internet connected gateway MTA honeypot
– Pointed to by multiple valid MX records
– Accepts SMTP connection and generates unique identifier
– Adds “Received:” header
– Stores message headers, data and envelope
Messages arrive continually
– Triggers syslog message and DB insert
»Arrival time, Filename, Classification
Directory rolled at midnight
– Rsync’ed to analysis server
– Analyze entire corpus
![Page 8: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/8.jpg)
Daily Message Volume at ICSA Labs Spam Trap
[Chart: number of messages per day (y-axis, 0 to 1,200,000) from 10/01/08 through 09/30/10 (x-axis)]
![Page 9: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/9.jpg)
Daily Volume vs. events and predictions
ISP take downs
– November 2008 (McColo)
»Media reports spam volume decreased 35-80%
– June 2009 (3FN.net)
»Media reports smaller, if any decrease (spammers learned lesson)
Volume predictions for 2010
– Peaked in mid 2009 and then returned to 2008 levels
»McAfee threat report for Q1 2010
– 30~40% increase in spam from 2009-2010
»Cisco 2009 annual report
![Page 10: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/10.jpg)
Daily Message Volume at ICSA Labs Spam Trap
[Chart: number of messages per day (y-axis, 0 to 1,200,000) from 10/01/08 through 09/30/10 (x-axis), annotated with the McColo takedown, the 3FN.net takedown, and “VB paper due”]
![Page 11: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/11.jpg)
Message Analysis
Extract & save interesting message properties
– Sender, recipient(s), size, subject, source, body digest
MIME type headers
– has attachment? What type?
Classification
– Most are spam
– Special accounts for Newsletter subscriptions & Project Honeypot feed
Decide if suitable for use in test set
– RFC compliant addresses
– Not duplicate message
– Not relay attempt
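The property extraction and duplicate check described above might look roughly like the following sketch, which uses only the Python standard library. It is not the ICSA Labs analysis code: the function names, the returned dictionary keys, the choice of SHA-256 for the body digest, and the rule that an identical body digest means a duplicate are all assumptions. The source IP is passed in separately because it comes from the SMTP connection, not from the message itself.

```python
import hashlib
from email import message_from_bytes, policy
from email.utils import getaddresses, parseaddr


def extract_properties(raw: bytes, source_ip: str) -> dict:
    """Pull the properties listed on the slide out of one stored message."""
    msg = message_from_bytes(raw, policy=policy.default)

    # Sender and recipient addresses as they appear in the headers.
    sender = parseaddr(str(msg.get("From", "")))[1]
    rcpt_headers = [str(v) for v in msg.get_all("To", []) + msg.get_all("Cc", [])]
    recipients = [addr for _, addr in getaddresses(rcpt_headers)]

    # Digest of the main text body (assumption: SHA-256 over UTF-8 text).
    body = msg.get_body(preferencelist=("plain", "html"))
    body_text = body.get_content() if body is not None else ""
    body_digest = hashlib.sha256(body_text.encode("utf-8")).hexdigest()

    # MIME types of any attachments ("has attachment? What type?").
    attachment_types = [part.get_content_type() for part in msg.iter_attachments()]

    return {
        "sender": sender,
        "recipients": recipients,
        "size": len(raw),
        "subject": str(msg.get("Subject", "")),
        "source": source_ip,
        "body_digest": body_digest,
        "attachment_types": attachment_types,
    }


def is_new(props: dict, seen_digests: set) -> bool:
    """Return True the first time a body digest is seen, False afterwards."""
    if props["body_digest"] in seen_digests:
        return False
    seen_digests.add(props["body_digest"])
    return True
```

A real suitability check would also need the envelope data (for RFC compliance of addresses and for detecting relay attempts), which this sketch leaves out.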
![Page 12: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/12.jpg)
“… compiled from the SBL database using the number of currently listed SBL records for each network (ISP/NSP) sorted by country.”
Data from Spamhaus 31-May-2010, http://www.spamhaus.org/statistics/countries.lasso
![Page 13: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/13.jpg)
Spam message source
Source means IP that connected to ICSA Labs
Where does the U.S. rank?
– First by far
»Spamhaus, Symantec
– First, but only by a hair
»Sophos
– Second
»Cisco 2009
– Not even top 5
»Panda Security
»ICSA Labs
From ICSA Labs Spam Data Center https://www.icsalabs.com/technology-program/anti-spam/spam-data-center
![Page 14: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/14.jpg)
Lots of Legitimate Email
Legitimate email separated into 2 categories
Newsletters
– Subscribe to press releases, announcements and newsletters
»Google Alerts, Bankrate.com, U.S. State Department, etc.
– Messages arrive at spam collector with unique RCPT
Person-to-person email
– Business related
»Meeting minutes, sales forecast, customer queries
– Non-business related
»After hours or weekend plans, family photos, etc.
– One or more recipients
– Occasional attachments
![Page 15: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/15.jpg)
Legitimate email generation framework
Message bodies from real email
– list postings, non-proprietary msgs, personal accounts
Assorted MIME types
– 40% text/plain, 40% text/html, 20% multipart/alternative
Repository of attachments
– 15% get attachment
Sender and Recipient addresses in DB table
– Users: Name, address, title
– Companies: MX host, domain, email address convention, SPF
Number of recipients probability-driven
– 80% single recipient, 20% up to 4
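The probability-driven choices above can be illustrated with a small sketch. It only picks the shape of one generated message (MIME type, attachment or not, recipient count); it does not build the message. The function name, the dictionary keys, and the assumption that the multi-recipient case is spread uniformly over 2 to 4 recipients are illustrative and not taken from the slides.

```python
import random

MIME_TYPES = ["text/plain", "text/html", "multipart/alternative"]
MIME_WEIGHTS = [40, 40, 20]            # percentages from the slide
ATTACHMENT_PROBABILITY = 0.15          # 15% get attachment
SINGLE_RECIPIENT_PROBABILITY = 0.80    # 80% single recipient
MAX_RECIPIENTS = 4                     # remaining 20%: up to 4


def plan_message(rng: random.Random) -> dict:
    """Choose the shape of one generated legitimate message."""
    mime_type = rng.choices(MIME_TYPES, weights=MIME_WEIGHTS, k=1)[0]
    has_attachment = rng.random() < ATTACHMENT_PROBABILITY
    if rng.random() < SINGLE_RECIPIENT_PROBABILITY:
        n_recipients = 1
    else:
        # Assumption: uniform over 2..MAX_RECIPIENTS for the multi-recipient case.
        n_recipients = rng.randint(2, MAX_RECIPIENTS)
    return {
        "mime_type": mime_type,
        "has_attachment": has_attachment,
        "n_recipients": n_recipients,
    }
```

Passing in a seeded `random.Random` instance keeps a generated corpus reproducible, which may help when a result is disputed.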
![Page 16: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/16.jpg)
Legitimate email generation framework (cont.)
Isn’t this what spammers are trying to do?
– Yes, but
It’s our MTA receiving messages
– Received header passes SPF check
– Other SMTP headers also valid
Not used for newsletter ham
No malicious content attachment
Product developers can appeal
– Results are available in real-time
![Page 17: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/17.jpg)
Spam Testing Methodology
Test bed overview
![Page 18: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/18.jpg)
![Page 19: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/19.jpg)
Spam Testing Methodology
Test bed overview
Message test set determination
![Page 20: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/20.jpg)
Anatomy of a test set
Message order driven by probabilities
– Main classification (90% spam / 10% ham)
– Secondary classification of ham (95% personal / 5% newsletter)
First decide how many messages in the set
Start with first message, pick classification
Then identify message file
Repeat
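The ordering procedure above can be sketched as a loop that draws a classification for each slot in the test set. This is an illustration under stated assumptions: the function name and the class labels are invented here, and the later step of identifying a concrete message file for each slot is left out.

```python
import random


def build_classification_order(n_messages: int, rng: random.Random,
                               p_spam: float = 0.90,
                               p_personal: float = 0.95) -> list:
    """Draw a classification for each position in the test set.

    p_spam is the chance a slot is spam (90% on the slide); p_personal is
    the chance a ham slot is personal rather than newsletter (95%).
    """
    order = []
    for _ in range(n_messages):
        if rng.random() < p_spam:
            order.append("spam")
        elif rng.random() < p_personal:
            order.append("ham-personal")
        else:
            order.append("ham-newsletter")
    return order
```

With these defaults roughly 9.5% of slots come out as personal ham and 0.5% as newsletter ham, so newsletters are rare in any single small test set.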
![Page 21: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/21.jpg)
Spam Testing Methodology
Test bed overview
Message test set determination
Evolution of the testing process
– Began with store-and-forward
– Transitioned to Live
![Page 22: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/22.jpg)
Store-and-Forward Testing (batch)
Wait for whole spam corpus from previous day to be analyzed
Generate corpus of legitimate messages
Assemble message test set
Test daily beginning at 0300
Every product sees same messages in same order
But faster products finish earlier
![Page 23: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/23.jpg)
Transitioned to Live Testing
Predetermine message set classification order
Proceed through list and
– Retrieve message from spam collector in real-time
or
– Generate legitimate personal message
Analyze it on-the-fly (only essential checks)
Initiate connection to every product at the same time for every message
Execute live test event twice daily (0300, 1700)
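The step of connecting to every product at the same time for each message can be sketched with a thread pool. This is a minimal illustration, not the ICSA Labs test harness: `deliver_to_all` and the `products` mapping are hypothetical, and each delivery callable stands in for whatever SMTP exchange a real harness would perform.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def deliver_to_all(message: bytes,
                   products: Dict[str, Callable[[bytes], str]]) -> Dict[str, str]:
    """Hand one message to every product concurrently and collect the outcomes.

    `products` maps a product name to a (hypothetical) delivery callable
    that returns that product's decision for the message.
    """
    # One worker per product so no delivery waits for another to finish.
    with ThreadPoolExecutor(max_workers=max(1, len(products))) as pool:
        futures = {name: pool.submit(deliver, message)
                   for name, deliver in products.items()}
        return {name: future.result() for name, future in futures.items()}
```

Calling this once per entry in the predetermined classification order means every product receives each message at about the same moment, unlike the batch approach where faster products finish earlier.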
![Page 24: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/24.jpg)
From ICSA Labs Spam Data Center https://www.icsalabs.com/technology-program/anti-spam/spam-data-center
![Page 25: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/25.jpg)
Difference Between Spam Detection Effectiveness May 2010
[Chart: percentage difference (>0 means live is lower), y-axis from -1.0 to 5.0, for 5/1/2010 through 5/29/2010 (x-axis)]
![Page 26: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/26.jpg)
Lessons learned
Measured Spam Effectiveness Differs
– Always better with stored corpus
– But, relative ranking of products was same
Suggests that delay allows propagation of signature/knowledge to device being tested
Misclassified messages included in batch test set
– 2nd Exposure effectiveness
– No correlation between age of message and length of delay
However, products sometimes forget
– A spam message blocked in live test is later delivered
![Page 27: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/27.jpg)
Comparison to VBSpam
Similarities
– Relay messages to products from single IP
– Include original src IP, etc. in Received header
– Require tested product to make a decision (not quarantine)
– Use “live” spam feed
– Disallow Whitelisting of senders
![Page 28: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/28.jpg)
Comparison to VBSpam
Differences

| | ICSA Labs | VBSpam |
|---|---|---|
| Message delivery rate | ~2300/hr | ~600/hr |
| Spam feed | On-site MTA | PHP, Abusix |
| Message classification | Pre-classified (before) | By consensus (after) |
| Frequency | Daily (11.5 hours/day) | Quarterly (24/7 for 3 wks) |
| Pre-DATA filtering? | IP in Received header | XCLIENT extension |
| Final Score | Report Effectiveness & FP | Combined measure |

And one more …
![Page 29: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/29.jpg)
There’s more than effectiveness and false positives
You’re kidding. Right?
Shouldn’t there be
– Authenticated access to administer the product over the network
– A way to configure the network settings
– A way to change or configure the policy being enforced
– Automatic spam protection updates
– Logging of
»password changes to an administrative account
»attempts by a remote user to authenticate (success/failure)
»message delivery decisions
– Sufficient and accurate documentation
List of criteria requirements developed with consortium input
Methodology includes test cases to verify each requirement in the criteria
![Page 30: Still Curious about AntiStill Curious about Anti-Spam …...–30~40% increase in spam from 2009-2010 »Cisco 2009 annual report 9 Daily Message Volume at ICSA Labs Spam Trap 1000000](https://reader034.vdocuments.net/reader034/viewer/2022042322/5f0c7b517e708231d4359ef8/html5/thumbnails/30.jpg)
Conclusion & Future Work
Creating a fair, accurate, unbiased test requires considerable expertise and development
Testing with stored spam corpus may overestimate the effectiveness of products
Investigate sensitivity to time of test
– Effectiveness better during business hours or at night?
– On weekdays or weekends?
Incorporate more spam feeds
– Project Honey Pot
– Verizon Cloud Services