Anomaly-based Spam Filtering - SECRYPT 2011
TRANSCRIPT
![Page 1: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/1.jpg)
Carlos Laorden
![Page 2: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/2.jpg)
WHAT YOU GOT, THEN? SPAM, EGG,
SPAM, SPAM, BACON AND
SPAM.
SPAM, SPAM, SPAM, BAKED BEANS AND
SPAM.
ANYTHING WITHOUT
SPAM?
I DON’T LIKE SPAM!!
UGH!
![Page 3: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/3.jpg)
Meet the real SPiced hAM
![Page 4: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/4.jpg)
Monty Python’s Flying Circus
![Page 5: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/5.jpg)
Something that repeats and repeats until it becomes annoying
![Page 6: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/6.jpg)
It is a
real problem for Information Security
![Page 7: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/7.jpg)
Billions in daily
productivity losses
![Page 8: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/8.jpg)
Infected computers
![Page 9: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/9.jpg)
Stolen credentials
![Page 10: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/10.jpg)
![Page 11: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/11.jpg)
We must
fight
![Page 12: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/12.jpg)
Anti-spam methods
Pre-sending
New
protocols
Post-sending
Increase sending costs
Increase risks for spammers
sender
content
content
![Page 13: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/13.jpg)
Usually
supervised approaches
![Page 14: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/14.jpg)
Significant
labelling work is needed
![Page 16: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/16.jpg)
But, is this
possible?
![Page 17: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/17.jpg)
I mean, is this
possible...
![Page 18: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/18.jpg)
![Page 19: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/19.jpg)
YES
![Page 20: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/20.jpg)
Anomaly Detection
![Page 21: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/21.jpg)
[Figure: e-mails from the SpamAssassin and Ling Spam corpora are shown as documents D1 to D11 in a space with term axes t1, t2 and t3. The slide also shows the scrambled words "this word has no interest".]
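The slide above places e-mails as points D1 to D11 on term axes t1, t2 and t3, which reads like a vector space model. A minimal sketch of that idea, assuming whitespace tokenisation and raw term counts (the slides do not say which tokeniser or weighting was used; `build_vocabulary`, `to_vector` and the sample messages are made up for illustration):

```python
from collections import Counter


def build_vocabulary(documents):
    """Return the sorted list of distinct lowercase tokens in all documents."""
    return sorted({token for doc in documents for token in doc.lower().split()})


def to_vector(document, vocabulary):
    """Map a document to raw term counts, one component per vocabulary term."""
    counts = Counter(document.lower().split())
    return [counts[term] for term in vocabulary]


# Made-up messages, not taken from SpamAssassin or Ling Spam.
docs = ["cheap pills cheap offer", "meeting agenda for monday"]
vocab = build_vocabulary(docs)
vectors = [to_vector(doc, vocab) for doc in docs]
```

Each position in a vector corresponds to one term axis, so two e-mails can be compared with any vector distance.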
![Page 22: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/22.jpg)
Anomaly detection
[Diagram: a distance d is measured for each unknown e-mail (?) and tested: d > threshold?]
![Page 23: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/23.jpg)
Manhattan distance
Euclidean distance
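Both measures can be written in a few lines. The sketch below uses the standard textbook definitions and assumes e-mails are already equal-length numeric vectors; the function names are illustrative, not taken from the talk:

```python
import math


def manhattan(x, y):
    """Sum of absolute component-wise differences (L1 distance)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    """Square root of the sum of squared differences (L2 distance)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


print(manhattan([0, 0], [3, 4]))  # 7
print(euclidean([0, 0], [3, 4]))  # 5.0
```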
![Page 24: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/24.jpg)
Anomaly detection
[Diagram: several distances d are measured from an unknown e-mail (?)]
![Page 25: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/25.jpg)
Minimum distance
Maximum distance
Mean distance
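An e-mail under inspection yields one distance per reference e-mail, and the three rules above reduce those distances to a single value. A small sketch, assuming the distances have already been computed (the `combine` helper and its rule names are hypothetical):

```python
from statistics import mean


def combine(distances, rule):
    """Reduce a list of distances to one value with the chosen rule."""
    rules = {"minimum": min, "maximum": max, "mean": mean}
    if not distances:
        raise ValueError("at least one distance is required")
    if rule not in rules:
        raise ValueError(f"unknown rule: {rule}")
    return rules[rule](distances)


# Made-up distances from one e-mail to three reference e-mails.
sample = [2.0, 4.0, 9.0]
print(combine(sample, "minimum"))  # 2.0
print(combine(sample, "maximum"))  # 9.0
print(combine(sample, "mean"))     # 5.0
```

The minimum rule only needs one close neighbour, while the maximum rule requires the e-mail to be close to every reference e-mail.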
![Page 26: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/26.jpg)
Minimum distance
Maximum distance
Mean distance
Manhattan distance
Euclidean distance
![Page 27: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/27.jpg)
10 different
thresholds
![Page 28: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/28.jpg)
Anomaly detection
[Diagram: each distance d is compared with the threshold: d < threshold or d > threshold]
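The decision step can be sketched as a single comparison. Two assumptions are made here that the slides do not spell out: the reference set holds legitimate e-mails, so a distance above the threshold is treated as an anomaly (spam), and the candidate thresholds are evenly spaced between two bounds. Both helper names are hypothetical.

```python
def is_spam(distance, threshold):
    """Assumption: distance is measured to legitimate mail, so far away means spam."""
    return distance > threshold


def candidate_thresholds(low, high, count=10):
    """Return `count` evenly spaced thresholds from low to high, inclusive."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


# Made-up range; in practice the bounds would come from observed distances.
for threshold in candidate_thresholds(0.0, 9.0):
    print(threshold, is_spam(5.0, threshold))
```

Sweeping the threshold trades precision against recall: a low threshold flags more e-mails as spam, a high one flags fewer.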
![Page 29: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/29.jpg)
![Page 30: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/30.jpg)
Minimum distance
Maximum distance
Mean distance
Manhattan distance
Euclidean distance
10 thresholds
![Page 31: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/31.jpg)
Results
![Page 32: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/32.jpg)
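| SpamAssassin | Manhattan Prec. | Manhattan Rec. | Manhattan F-Meas. | Euclidean Prec. | Euclidean Rec. | Euclidean F-Meas. |
|---|---|---|---|---|---|---|
| Mean | 91.03% | 92.85% | 91.93% | 76.14% | 97.77% | 85.61% |
| Maximum | 69.61% | 99.89% | 82.05% | 72.99% | 97.66% | 83.54% |
| Minimum | 95.40% | 93.86% | 94.62% | 92.10% | 94.00% | 93.04% |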
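The F-Meas. values are consistent with the usual harmonic mean of precision and recall. A small sketch, assuming those standard definitions (the function names are illustrative and the counts in the test-style calls would come from a confusion matrix, which the slides do not show):

```python
def precision(true_positives, false_positives):
    """Fraction of e-mails flagged as spam that really are spam."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives, false_negatives):
    """Fraction of actual spam e-mails that were flagged."""
    return true_positives / (true_positives + false_negatives)


def f_measure(prec, rec):
    """Harmonic mean of precision and recall."""
    return 2 * prec * rec / (prec + rec)


# Mean rule with Manhattan distance on SpamAssassin, values from the table.
print(round(f_measure(91.03, 92.85), 2))  # 91.93
```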
![Page 34: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/34.jpg)
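| Ling Spam | Manhattan Prec. | Manhattan Rec. | Manhattan F-Meas. | Euclidean Prec. | Euclidean Rec. | Euclidean F-Meas. |
|---|---|---|---|---|---|---|
| Mean | 79.18% | 73.54% | 76.26% | 92.82% | 91.58% | 92.20% |
| Maximum | 76.23% | 74.29% | 75.25% | 85.95% | 79.29% | 82.49% |
| Minimum | 65.82% | 74.38% | 69.84% | 87.51% | 93.13% | 90.23% |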
![Page 36: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/36.jpg)
Suitable for
coping with the volume
of unclassified spam e-mails
![Page 37: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/37.jpg)
![Page 38: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/38.jpg)
![Page 39: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/39.jpg)
Will we see
the END of spam?
![Page 40: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/40.jpg)
95%
![Page 41: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/41.jpg)
“Solution to spam”
Cut their billing systems?
![Page 42: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/42.jpg)
![Page 43: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/43.jpg)
![Page 44: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/44.jpg)
![Page 45: Anomaly-based Spam Filtering - SECRYPT 2011](https://reader034.vdocuments.net/reader034/viewer/2022051404/58efbf4d1a28abdf438b45e9/html5/thumbnails/45.jpg)