greystar: fast and accurate detection of sms spam numbers ...€¦ · sms communication as other...

17
Open access to the Proceedings of the 22nd USENIX Security Symposium is sponsored by USENIX This paper is included in the Proceedings of the 22nd USENIX Security Symposium. August 14–16, 2013 • Washington, D.C., USA ISBN 978-1-931971-03-4 Greystar: Fast and Accurate Detection of SMS Spam Numbers in Large Cellular Networks using Grey Phone Space Nan Jiang, University of Minnesota; Yu Jin and Ann Skudlark, AT&T Labs; Zhi-Li Zhang, University of Minnesota

Upload: others

Post on 30-Apr-2020

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

Open access to the Proceedings of the 22nd USENIX Security Symposium

is sponsored by USENIX

This paper is included in the Proceedings of the 22nd USENIX Security Symposium.August 14–16, 2013 • Washington, D.C., USA

ISBN 978-1-931971-03-4

Greystar: Fast and Accurate Detection of SMS Spam Numbers in Large Cellular Networks

using Grey Phone SpaceNan Jiang, University of Minnesota; Yu Jin and Ann Skudlark, AT&T Labs;

Zhi-Li Zhang, University of Minnesota

Page 2: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

USENIX Association 22nd USENIX Security Symposium 1

Greystar: Fast and Accurate Detection of SMS Spam Numbers in LargeCellular Networks using Grey Phone Space

Nan JiangUniversity of Minnesota

Yu JinAT&T Labs

Ann SkudlarkAT&T Labs

Zhi-Li ZhangUniversity of Minnesota

AbstractIn this paper, we present the design of Greystar, an inno-vative defense system for combating the growing SMSspam traffic in cellular networks. By exploiting the factthat most SMS spammers select targets randomly fromthe finite phone number space, Greystar monitors phonenumbers from the grey phone space (which are associ-ated with data only devices like laptop data cards andmachine-to-machine communication devices like elec-tricity meters) and employs a novel statistical model todetect spam numbers based on their footprints on thegrey phone space. Evaluation using five month SMScall detail records from a large US cellular carrier showsthat Greystar can detect thousands of spam numbers eachmonth with very few false alarms and 15% of the de-tected spam numbers have never been reported by spamrecipients. Moreover, Greystar is much faster in detect-ing SMS spam than existing victim spam reports, reduc-ing spam traffic by 75% during peak hours.

1 Introduction

The explosion of mobile devices in the past decade hasbrought with it an onslaught of unwanted SMS (ShortMessage Service) spam [1]. It has been reported thatthe number of spam messages in the US has risen 45%in 2011 to 4.5 billion messages [2]. In 2012, therewere 350K variants of SMS spam messages accountedfor globally [3] and more than 69% of the mobile usersclaimed to have received text spam [4]. The sheer vol-ume of spam messages not only inflicts an annoying userexperience, but also incur significant costs to both cel-lular service providers and customers alike. In contrastto email spam where the number of possible email ad-dresses is unlimited - and therefore the spammer gener-ally needs a seed list beforehand, SMS spammers canmore easily reach victims by, e.g., simply enumeratingall numbers from the finite phone number space. This,

combined with wide adoption of mobile phones, makesSMS a medium of choice among spammers. Further-more, the increasingly rich functionality provided bysmart mobile devices also enables spammers to carry outmore sophisticated attacks through both voice and datachannels, for example, using SMS spam to entice users tovisit certain websites for product advertisement or otherillicit activities.Because SMS spam inflicts financial loss to mobile

users and adverse impact to cellular network perfor-mance, the objective of defense techniques is to restrictspam numbers quickly before they reach too many vic-tims. To this end, instead of applying popular solutions incontrolling email spam (e.g., filtering based on sendingpatterns), which can cause a high false alarm rate, cellu-lar carriers often seek help from their customers to alertthem of emerging spamming activities. More specifi-cally, cellular carriers deploy reporting mechanism forspam victims to report received spam messages and thenexamine and restrict these reported spam numbers ac-cordingly. Such spam detection techniques using victimspam reports are very accurate, thanks to the human in-telligence added while submitting these reports. How-ever, these methods can suffer from significant delay dueto the low report rate and slow user responses, renderingthem inefficient in controlling SMS spam.

To address the issues in existing solutions, in this pa-per, we carry out extensive analysis of SMS spammingactivities using five months of SMS call detail recordscollected from a large cellular network in the US and theSMS spam messages reported from the spam recipientsto that cellular carrier. We find that a majority of spam-mers choose targets randomly from a few area codes orthe entire phone number space, and initiate spam trafficat high rates. To detect such aggressive random spam-mers, we advance a novel notion of grey phone space.Grey phone space comprises a collection of grey phonenumbers (or grey numbers in short). Grey numbers areassociated with two types of mobile devices: data only

Page 3: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

2 22nd USENIX Security Symposium USENIX Association

devices (e.g., many laptop data cards and data modems,etc.) and machine-to-machine (M2M) communicationdevices (e.g., utility meters and medical devices, etc.).These grey numbers usually do not participate actively inSMS communication as other mobile numbers do (e.g.,those associated with smartphones), they thereby forma grey territory that legitimate mobile users rarely enter.In the mean time, the wide dispersion of grey numbersmakes them hard to be evaded by spammers who choosetargets randomly.

On top of grey phone space, we propose the designof Greystar. Greystar employs a novel statistical modelto detect spam numbers based on their interactions withgrey numbers and other non-grey phone numbers. Weevaluate Greystar using five months of SMS call records.Experimental results indicate that Greystar is superior tothe existing SMS spam detection algorithms, which relyheavily on victim spam reports, in terms of both accuracyand detection speed. In particular, Greystar detected over34K spam numbers in five months while only generatingtwo false positives. In addition, more than 15% of the de-tected spam numbers have never been reported by mobileusers. Moreover, Greystar reacts fast to emerging spam-ming activities, with a median detection time of 1.2 hoursafter spamming activities occur. In 50% of the cases,Greystar is at least 1 day ahead of victim spam reports.The high accuracy and fast response time allow us to re-strict more spam numbers soon after spamming activitiesemerge, and hence to reduce a majority of the spam mes-sages in the network. We demonstrate through simula-tion on real network data that, after deploying Greystar,we can reduce 75% of the spam messages during peakhours. In this way, Greystar can greatly benefit the cellu-lar carriers by alleviating the load from aggressive SMSspam messages on network resources as well as limitingtheir adverse impact on legitimate mobile users.

The remainder of this paper is organized as follows.We introduce the SMS architecture and the datasets usedin our study in Section 2. We then motivate the design ofGreystar in Section 3. In Section 4 we study the SMS ac-tivities of spammers and legitimate users. The definitionof grey numbers is presented in Section 5. In Section 6,we explain in detail the design of Greystar. Evaluationresults are presented in Section 7. Section 8 discussesthe related work and Section 9 concludes the paper.

2 Background and Datasets

In this section, we briefly describe the cellular networkfocused in our study. We then introduce the datasets andour ground truth for identifying spam phone numbers.

2.1 SMS Architecture in UMTSThe cellular network under study utilizes primarilyUMTS (Universal Mobile Telecommunication System),a popular 3G mobile communication technology adoptedby many mobile carriers across the globe. Here we intro-duce the architecture for delivering SMS messages insideUMTS networks (for other aspects regarding UMTS net-works, e.g., mobile data channels, see [5]). Fig. 1 de-picts a schematic view of the architecture. When send-ing an SMS message, an end user equipment (UEA)directly communicates with a cell tower (or node-B),which forwards the message to a Radio Network Con-troller (RNC). The RNC then delivers the message to aMobile Switching Center (MSC) server, where the mes-sage enters the Signaling System 7 (SS7) network andis stored temporarily at a Short Message Service Center(SMSC). From the SMSC, the message will be routed tothe serving MSC of the recipient (UEB), then to the serv-ing RNC and Node-B, and finally reaches UEB. Sim-ilarly, messages originated from other carrier networks(e.g., from UEC) will also traverse the SS7 network andbypass the serving MSC before arriving at UEB

1.

Figure 1: SMS architecture in UMTS networks.

2.2 DatasetsIn this paper, we use two different datasets for our study.

SMS Call Detail Records (CDRs) are used for under-standing SMS user/spammer activities and evaluating theperformance of the proposed Greystar system. Theserecords were collected at the serving MSC’s of SMS re-cipients (see Fig. 1). This means that CDR records rep-resent SMS messages targeting registered mobile cus-tomers of the UMTS network under study2 and have been

1Note that similar SMS architecture is also adopted in other typesof 3G/4G cellular networks. Additionally, in this paper, we only focuson SMS through the voice control channel. Short message servicesthrough mobile data channels, such as iMessage, Tweets and MMS,etc., are out of the scope of this paper (though defenses for fightingemail spam can be applied to detect short message spam through datachannels, which we shall discuss in Section 8).

2SMS messages targeting mobile users in other carrier networks andlandline numbers are not seen at the serving MSCs and hence are not

2

Page 4: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

USENIX Association 22nd USENIX Security Symposium 3

successfully routed through the SS7 network. The CDRdataset spans 5 months from Jan 2012 to May 2012. Eachrecord contains the SMS receiving time, the originatingnumber, the terminating number and the InternationalMobile Equipment Identity (IMEI) for the device asso-ciated with the terminating number3. We note that CDRrecords do not contain text content of the original SMSmessages.

Victim spam reports contain spam messages reportedby spam recipients to the carrier. The said cellular car-rier deploys an SMS spam reporting service for its users:when a user receives an SMS text and deems it as a spammessage, s/he can forward the message to a spam reportnumber designated by the cellular service provider. Oncethe spam is forwarded, an acknowledgment message isreturned, which asks the user to reply with the spam-mer’s phone number (referred to as the spam numberhereafter). Once the above two-stage process is com-pleted within a predefined time interval, a spam reportis created, which includes the reporter’s phone number,the spam number, the reporting time and the text contentof the reported spam message. We employ six monthsof spam reports from Jan 2012 to June 2012 in order tocover spam numbers observed between Jan and May butare reported after May due to the delay of the spam re-ports (see Section 3.2).

We emphasize that no customer personal informationwas collected or used for our study. All customer identi-fies were anonymized before any analysis was conducted.In particular, for phone numbers, only the area code (i.e.,the first 3 digits of the 10 digit North American num-bers) was used and the remaining digits were hashed.Similarly we only retain the first 8-digit Type Alloca-tion Code (TAC) of the IMEI to identify device types andanonymize the remaining 8-digit to preserve customers’privacy. In addition, to adhere to the confidentiality un-der which we have access to the data, in places we onlypresent normalized views of our results while retainingthe scientifically relevant magnitudes.

2.3 Obtaining Ground TruthAlthough victim spam reports provide us with groundtruth for some spam numbers, they are by no means com-prehensive and can be noisy (see Section 3.2). Therefore,

included in CDR records.3IMEI’s are stored at MSC’s and are updated every time users con-

nect to the network. Although we have observed that spammers some-times modify the IMEIs of their spamming devices (e.g., through spe-cial equipment like SIM boxes), IMEI spoofing among legitimate usersis rare. Therefore we can reliably identify the types of user devicesbased on their corresponding IMEIs. Meanwhile, since all the CDRsare collected at MSCs, we can identify the original phone numbers thatinitiate the SMS messages. Hence our approach is not affected evenwhen spammers employ spoofing techniques to change their caller IDs.

in this paper, we employ a more reliable source of groundtruth. In particular, we request the fraud agents from thesaid UMTS carrier to manually verify spam number can-didates detected by us. These fraud agents are exposedto much richer (and more expensive) sources of informa-tion. For example, fraud agents can investigate the own-ership and the price plan information of the candidates,examine their SMS sending patterns and correlate themwith known spam numbers in terms of their network lo-cations and active times, etc. The final decision is madeconservatively by corroborating different evidence.

Admittedly, fraud agents can make mistakes duringtheir investigation. Meanwhile, their breadth may belimited by not being able to inspect all mobile numbersin the network. Nevertheless, fraud agents provide uswith the most authoritative ground truth available for ourstudy. It is worth mentioning that such investigation byfraud agents has been deployed independently for SMSspam number detection and restriction for more than oneyear and no false alarm has yet been observed (e.g., nouser complaint is observed so far regarding incorrectlyrestricted phone numbers). Therefore, in our study, wewill treat fraud agents as a black box authority, i.e., wesubmit a list of spam number candidates to fraud agentsand they return a list of confirmed spam numbers.

3 Objectives and Existing Solutions

In this section, we discuss the objectives of developingan effective defense against SMS spam by comparingthe difference between SMS spam and traditional emailspam. We then review the most widely adopted SMSspam detection method based on crowdsourcing victimspam reports and point out its inefficacy. In the end, wepresent the rationale of the proposed Greystar system.

3.1 SMS Spam Defense ObjectivesIn a conventional SMS spamming scenario, an SMSspammer (note that we refer to an SMS spammer as theperson who employs a set of spam numbers to launchSMS spam campaigns) first invests in a set of phonenumbers and special high-speed devices, such as 3Gmodems and SIM boxes [6]. Using these devices, s/hethen initiates unsolicited SMS messages to a large num-ber of mobile phone numbers. Akin to traditional emailspam, the objective of SMS spam is to advertise certaininformation to entice further actions from the messagerecipients, e.g., calling a fraud number or clicking on aURL link embedded in the message which points to amalicious site. However, SMS spamming activities ex-hibit unique characteristics which shift the focus of thedefense mechanisms and hence render inapplicable or

3

Page 5: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

4 22nd USENIX Security Symposium USENIX Association

inefficient existing solutions for defending against tradi-tional email spam.Email service providers usually detect and filter email

spam at their mail servers, to which they have full access.There they can build accurate spam filters by exploitingrich features in emails including the text content. Spamfilters at end user devices are also a common choice,where email clients (apps) filter spam while retrievingemails from remote mail servers. Though blacklist ofemail spammers are sometimes used to assist spam clas-sification [7–9], restricting email spam senders is usuallynot the main focus of the defense, since it requires closecollaboration between email providers and network car-riers. Moreover, it is observed that many spam emailsare originated from legitimate hosts due to botnet activ-ities [10], which makes restricting spam originators aninapplicable solution.

In comparison to emails which are generally stored onservers and wait for users to retrieve them, SMS mes-sages are delivered instantly to the recipients through theSS7 network. Along the path, SMS messages are onlycached temporarily at SMSC (only when the recipientsare offline), leaving little time for cellular carriers to re-act to them. The task becomes even more challenging es-pecially when the SMS traffic volume peaks during busyhours. Filtering SMS spam at end user devices (e.g., us-ing mobile apps) is also not applicable given many SMScapable devices (e.g., feature phones) do not support run-ning such apps. In addition, for a user with a pay-per-use SMS plan, she is already charged for the spam mes-sage once it arrives at her device. More importantly, evenwhen SMS spam filters are deployed at SMSC’s and enduser devices, SMS spammers can still inflict significantloss to the carrier and other mobile users. This is becausethe huge number of spam messages can lead to a signif-icant increase in the SMS traffic volume at the cell tow-ers serving the spam senders, possibly causing conges-tion and hence deteriorating voice/data usage experienceof nearby users. For example, we have found the SMStraffic volume at cell towers can easily get multiplied bymore than 10 times due to the activities of spammers.Therefore, the focus of the SMS spam defense is to con-trol spam numbers as soon as possible before they reacha large number of victims.An efficient SMS spam detection algorithm is hence

expected to react quickly to emerging spamming activi-ties. Meanwhile, the focus on restricting spam numbersplaces a strong emphasis on the accuracy of the algo-rithm. First, it requires a spam detection algorithm tolimit false alarms, because false alarms can lead to incor-rect restriction of legitimate users from accessing SMSservices. Second, it demands the algorithm detect asmany spam numbers as possible so as to minimize theimpact of SMS spam activities on the network. Such

high accuracy requirements are hard to achieve solelybased on the SMS sending patterns of the spammers.For example, it is difficult to separate spam campaignsfrom legitimate SMS campaigns, such as a school send-ing messages to its students to alert adverse weather con-ditions. These legitimate senders can exhibit character-istics that are common to SMS spammers4. Spammersmay also alter their sending patterns to mimic legitimateusers to avoid detection. As a result, cellular carriers of-ten seek the assistance from their customers to alert themof emerging SMS spam activities.

3.2 Spam Detection by CrowdsourcingVictim Spam Reports

The emphasis on high accuracy gives rise to the wideadoption of spam detection methods based on victimspam reports which were introduced in Section 2. Victimspam reports represent a more reliable and cleaner sourceof SMS spam samples, as all the spam messages con-tained in the reports have been vetted and classified bymobile users (using human intelligence). To further mit-igate the possible errors caused during the two-step re-porting process, cellular carriers often crowdsource spamreports from different users. For example, a simple yeteffective strategy is to identify a spam number after re-ceiving reports from K distinct users. Meanwhile, de-fense mechanisms based on victim spam reports are alsoof low cost, because only numbers reported by users needto be further analyzed. Due to this reason, spam reportsare usually a trigger for more sophisticated investigationon the senders, such as their sending patterns, serviceplans, etc..

Despite the high accuracy and low cost, detecting SMSspam based on spam reports is analogus to performingspam filtering at user devices. The major drawback isdetection delay, which we illustrate in Fig. 2 based onthe CDR data from January 2012. The red solid curvein Fig. 2 measures how long it takes for a spam num-ber to be reported after spam starts (a.k.a. report delay).We consider a spam number starts spamming when itfirst reaches at least 50 victims in an hour (see Section 4for discussion on spamming rates). From Fig. 2, we ob-serve that less than 3% of the spam numbers are reportedwithin 1 hour after spamming starts. More than 50% ofthe spam numbers are reported 1 day after. The reportdelay is mainly due to the extremely low report rate fromusers. In fact, less than 1 in 10,000 spam messages were

4Maintaining a whitelist of such legitimate intensive SMS users canbe challenging. First, we have little information to identify the whitelist if the users are outside the network. Second, even for the usersinside the network, the whitelist can still be dynamic, with new busi-nesses/organizations initiating/stopping SMS broadcasting services ev-ery day. More importantly, users are not obliged to report to the carrierwhen they intend to start such services.

4

Page 6: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

USENIX Association 22nd USENIX Security Symposium 5

0.0

0.2

0.4

0.6

0.8

1.0

Delay (hours in log scale)

CD

F

0.1 0.5 2 5 10 50 200

Report delayUser delay

Figure 2: Lags of user reports.

Spamming rate (# targets per hour) in log scale

1 −

CD

F

1 5 50

0.0

0.2

0.4

0.6

0.8

1.0

Spam numberLegitimate number

Figure 3: Spamming rate.

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Areacode RU

Ran

dom

spa

mm

ing

ratio

Figure 4: Target selection strategies.

reported during the five month observation period. Asidefrom causing a long detection delay, the low report ratealso leads to many missed detections (see Section 7).

In addition, even when a victim reports a spam mes-sage, how long it takes him/her is at the reporter’s dis-cretion. The blue dotted curve in Fig. 2 shows how fasta user reports a spam message after receiving it (userdelay). Note that each user can receive multiple spammessages (possibly with different text content) from thesame sender and hence can report the same sender multi-ple times. Thus, we define user delay as the time differ-ence between when a user reports a spam message andthe last time that the user receives spam from that par-ticular spam number before the report. We observe inFig. 2 , among users who report spam, half of the spammessages are reported more than 1 hour after they receivethe spam messages. Around 20% spam are reported evenafter a day. Due to such a long delay, spammers havealready inflicted significant loss to the network and itscustomers.

In addition to the problem of detection delay, the cur-rent two-stage reporting method is error-prone. We findaround 10% reporters fail to provide a valid spam num-ber at the second stage. Moreover, spam report basedmethods are vulnerable to attacks, as attackers can eas-ily game with the detection system by sending bogusreports to Denial-of-Service (DoS) legitimate numbers.All these drawbacks render spam detection using victimspam reports an insufficient solution.

3.3 Overview of GreystarRecognizing the drawbacks of existing victim reportbased solutions, we introduce the rationale behindGreystar. The objective of Greystar is to accurately de-tect SMS spam while at the same time being able to con-trol spam numbers as soon as possible before they reachtoo many victims. To this end, we advance a novel notionof grey phone numbers. These grey numbers usually do

not communicate with other mobile numbers using SMS,they thereby form a grey territory that legitimate mobileusers rarely enter. On the other hand, as we shall seein Section 4, it is difficult for spammers to avoid touch-ing these grey numbers due to the random target selec-tion strategies that they usually adopt. Greystar then pas-sively monitors the footprints of SMS senders on thesegrey numbers to detect impending spam activities target-ing a large number of mobile users.

Greystar addresses the problems in existing spam re-port based solutions as follows. First, the populationof grey numbers is much larger and widely distributed(see Section 5), providing us with more “spam alerts”to capture more spam numbers more quickly. Second,by passively monitoring SMS communication with greynumbers, we avoid the user delay and errors introducedwhen submitting spam reports. Last, Greystar detectsspammers based on their interactions with grey phonespace. This prevents malicious users from gaming theGreystar detection system and launching DoS attacksagainst other legitimate users.In the following, we first discuss related work in Sec-

tion 8. We then study the difference of spamming andlegitimate SMS activities in Section 4, which lays thefoundation of the Greystar system. In Section 5 we in-troduce our methodology for identifying grey numbers.We then present the design of Greystar in Section 6 andevaluate it in Section 7.

4 Analyzing SMS Activities of Spammersand Legitimate Users

We first formally define SMS spamming activities. Dur-ing a spamming process, a spammer selects (followinga certain strategy) a sequence of target phone numbers,X := {x1,x2, · · · ,xi, · · · } (1 ≤ i ≤ n), to send SMS mes-sages to over a time window T . Each target phone num-ber is a concatenation of two components, the 3-digit

5

Page 7: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

6 22nd USENIX Security Symposium USENIX Association

area code xai , which is location specific, and the 7-digit

subscriber number xsi . Note that we only examine US

phone numbers (which have 10 digits excluding the lead-ing country code “1”). Phone numbers of SMS sendersfrom other countries which follow the same North Amer-ican Numbering Plan (NANP) are removed before thestudy. All the statistics in this section are calculatedbased on a whole month data from January 2012. Tocompare the activities of spam numbers and legitimatenumbers, we obtain an equal amount of samples fromboth groups. In particular, the spam numbers are identi-fied from victim spam reports and the legitimate numbersare randomly sampled from the remaining SMS sendersappearing in the month-long CDR data set. Both samplesof phone numbers are checked by fraud agents before theanalysis to remove false positives and false negatives.

4.1 SMS Sending RatesWe first compare the SMS sending rates of known spamnumbers and legitimate numbers.We measure the send-ing rate at the granularity of hours, i.e., the average num-ber of unique recipients a phone number communicateswith hourly.The CCDF curves of the sending rates areshown in Fig. 3.

From Fig. 3, spam numbers have a much higher SMSsending rate than legitimate numbers. This is not surpris-ing given the purpose of spamming is to reach as manyvictims as possible within a short time period. In par-ticular, more than 95% of spam numbers have a sendingrate above 5 and more than 70% spam numbers exhibita sending rate above 50. In contrast, more than 97%of the legtimate numbers have a sending rate below 5.As we can see in Section 6, by enforcing a threshold onthe sending rate, we can filter out most of the legitimatenumbers without missing many spam numbers.

Due to their high spamming rates, at the node-Bs thatspam numbers are connected to, we find that the sheervolume of spamming traffic is astonishing. Spammingtraffic can exceed normal SMS traffic by more than 10times. Even at RNCs, which serve multiple node-Bs,traffic from spamming can account for 80% to 90% of to-tal SMS traffic at times. Such a high traffic volume fromspammers can exert excessive loads on the network, af-fecting legitimate SMS traffic. Furthermore, since SMSmessages are carried over the voice control channel, ex-cessive SMS traffic can deplete the network resource,and thus can potentially cause dropped calls and othernetwork performance degradation. Meanwhile, the in-creasing malware app instances that propagate throughthe SMS channel also emphasize the importance of re-stricting SMS spam activities in cellular networks.

We note that, although most legitimate numbers sendSMS at low rates (e.g., below 50), due to the large pop-

ulation size of the legitimate numbers, there are stillmany of them with high sending rates indistinguishablefrom those of spam numbers. Investigation shows thatthey belong to organizations which use the SMS serviceto disseminate information to their stakeholders, e.g.,churches, schools, restaurants, etc. How to distinguishthese legitimate intensive SMS senders from SMS spam-mers is the main focus of our Greystar system.

4.2 Spammer Target Selection StrategyWe next study how spammers select spamming targets.We characterize their target selection strategies at twolevels, i.e., how spammers choose area codes and howthey select phone numbers within each area code.We define the metric area code relative uncertainty

(rua) to measure whether a spammer favors phone num-bers within certain area codes. The rua is defined as:

rua(X) :=H(Xa)

Hmax(Xa)=

−∑q∈Q P(q) logP(q)

log|Q|,

where P(q) represents the proportion of target phonenumbers with the same area code q and |Q| is the totalnumber of area codes in the US. Intuitively, a large rua(e.g., greater than 0.7) indicates that the spammer uni-formly chooses targets across all the area codes. In con-trast, a small rua means the targets of the spammer comefrom only a few area codes.We next define a metric random spamming ratio to

measure how spammers select targets within each areacode. Let Pa be the proportion of active phone num-bers5 within area code a. For a particular spammingtarget sequence Xa of a spam number, if the spammerrandomly chooses targets, the proportion of active phonenumbers in Xa should be close to Pa. Otherwise, we be-lieve the spammer has some prior knowledge (e.g., withan obtained target list) to select specific phone numbersto spam. Based on this idea, we carry out a one sided Bi-nomial hypothesis test for each spammer and each areacode to see if the corresponding target selection strat-egy is random within that area code. The random spam-ming ratio is then defined as the proportion of area codeswithin which a spammer selects targets randomly (i.e.,the test fails to reject the randomness hypothesis with P-value=0.05). Note that, for each spam number, only areacodes with more than 100 victims are tested to ensure thevalidity of the test.

5The active phone numbers are identified as all registered phonenumbers inside the carrier’s billing database who have unexpired ser-vice plans. We find that the active numbers are uniform across all areacodes, possibly due to frequent phone number recycling within carriernetworks (e.g., phone numbers originally used by landlines are reas-signed to mobile phones) and users switching between cellular carrierswhile retaining the same phone numbers.

6

Page 8: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

USENIX Association 22nd USENIX Security Symposium 7

Page 9: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

8 22nd USENIX Security Symposium USENIX Association

few numbers which have sent no more than 1 SMS mes-sage during the one month period. For a majority ofthese numbers, all the messages they have received arespam (as indicated by the fact that most probability massis squeezed to a small region close to 1). This impliesthat these SMS inactive numbers are good indicators ofspamming activities, i.e., SMS senders who communi-cate with them are more likely to be spammers.

5 SMS Grey Phone Number Space

In order to utilize these SMS inactive numbers for spamdetection, we want to first answer the following ques-tions. Why do these numbers have a low volume of SMSactivity? Is there an inexpensive way to identify a sta-ble set of such numbers for building the detection sys-tem? To answer these questions, we carry out an in-depth analysis of SMS inactive users. We then definegrey phone space and propose a method for identifyingthe grey phone space using CDR records. In the end,we study properties of grey phone space and show thepotential of using it to detect spamming activities.

5.1 Investigating Service PlansCellular carriers often provide their customers with a richset of features to build their personal service plans. Usersare free to choose the best combination of features tobalance their needs and the cost. For example, a fre-quent voice caller often opts in an unlimited voice planand a user who watches online videos a lot can choosea data plan with a larger data cap. Therefore, serviceplans encode demographic properties of the associatedusers. We hence study the correlations between differentservice plan features and SMS activeness to understandthese SMS inactive users.More specifically, we extract all the service plans as-

sociated with the legitimate user samples, which includefeatures related to voice, data and SMS services. We cal-culate the Pearson correlation coefficients of the SMS ac-tiveness and individual plan features (treated as binaryvariables). The features are then ranked according to thecorrelation values. We summarize the top 5 features thatare positively and negatively correlated with SMS active-ness in Table 1.

Top 5 negatively correlated Top 5 positively correlatedText restricted Monthly unlimited voice/text

Voice restricted Messaging unlimitedText msg pay per use Rollover family plan

Voice/data prepaid Unlimited SMS/MMSLarge cap data plans Small cap data plans

Table 1: Corr. of activeness and plan features.

The top 5 features with negative correlations are inthe first column of Table 1. Many of these SMS in-active users are enrolled in the pay-per-use SMS plan,a common economical choice for users who rarely ac-cess SMS services. Interestingly, a large number of SMSinactive users have restrictions on their voice/text plansand have been simultaneously enrolled in large cap dataplans. Such restrictions only apply for mobile users withdata only devices, such as tablets and laptop data cards,etc. In contrast, the top 5 features with positive corre-lations are summarized in the second column. Most ofSMS active users have unlimited SMS plans, a favorablechoice of frequent SMS communicators. Many of themhave also enrolled in small cap data plans and unlimitedMMS plans, which are dedicated for smartphone users.

Though service plans demonstrate clear distinctionsbetween SMS inactive and active users, relying on ser-vice plans to identify SMS inactive users is not effec-tive in practice due to two reasons. First, service planschange frequently, especially when users upgrade theirdevices. Second, query service plan information per-sistently during run time can be very expensive. Fortu-nately, our analysis above also reveals that service plansare strongly correlated with the device types, e.g., dataonly device users are less active compared to smartphoneusers. Can we use device types as a proxy to identifySMS inactive users instead? We shall explore such pos-sibilities in the following section.

SMS towards data only devices. Like phones, laptopsand other data only devices are also equipped with SIMcards and hence, once connected to the network, are ableto receive SMS messages. We therefore can capture CDRrecords to these devices at MSCs. However, manufactur-ers often restrict text usage on these devices by maskingthe APIs related to SMS functions. Meanwhile, at thebilling stage, text messages to these data only devices(with a text restricted plan) are not charged by the carrier.There are exceptions such as laptops enrolled in regulartext messaging plans, however, such cases are rare basedon our observations.

5.2 Identifying Grey Phone SpaceThe device associated with each phone number can befound in the CDR data based on the first eight-digit TACof the IMEI. We use the most updated TAC to devicemapping from the UMTS carrier in January 2013 andhave identified 27 mobile device types (defined by thecarrier) which we summarize in Table 2. We note thatfiner grained analysis at individual device level is alsofeasible. However, we find that, except for the vehicletracking devices which we shall see soon, devices withineach category have strong similarity in their SMS active-ness distributions. Hence we gain little by defining grey

8

Page 10: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

USENIX Association 22nd USENIX Security Symposium 9

Number of active TNs (log and normalized)

Gre

y TN

per

cent

age

(nor

mal

ized)

a 5a 20a 50a

b2b

3b4b

Figure 8: Grey number distribution.

Pct. of grey nbrs touched (log and normalized)

CD

F

a 5a 20a 100a 500a

0.0

0.2

0.4

0.6

0.8

1.0

Spam nbrUserUser|grey

Figure 9: Grey ratio.

Prop. of grey numbers accessed (normalized)

Den

sity

0 5a 10a 15a

020

4060

8010

0

Spam numberLegitimate number

Figure 10: Distr. of θ and θ ∗.

numbers at the device level.

Type ExamplesData-only

Laptop data cards, tablets, netbooks, eReaders, 3Gdata modems, etc.

M2M Security alarms, telematics, vehicle tracking de-vices, point-of-sale terminals, medical devices, etc.

Phone Smartphones, feature phones, quick messagingphones, PDAs, etc.

Table 2: Device categories and examples.

Fig. 7 shows the CDF distributions of SMS active-ness of phone numbers associated with different devicetypes. We observe three clusters of CDF curves. Thefirst one consists of curves concentrating at the top-leftcorner, representing devices with very low SMS active-ness. This cluster covers all data only devices and a ma-jority of machine-to-machine devices (see [11] for morediscussions of M2M devices). The second cluster lies inthe middle of the plot, which includes all phone devices.The third cluster contains only one M2M device type,which covers all vehicle tracking devices. Interestingly,the curve of such devices shows a bi-modal shape, wheresome devices communicate frequently using SMS whileother devices mainly stay inactive. Based on Fig. 7, wedefine grey numbers as the ones that are associated withdevices in the first cluster, i.e., data only devices andM2M devices excluding the vehicle tracking device cate-gory. The collection of all grey number are referred to asthe grey phone space. The grey numbers are representa-tives of a subset of SMS inactive users7. Meanwhile, thegrey phone space defined in this way is stable because it

7We use devices in the first cluster as our definitions of grey space,however, as we have seen in Fig. 7, even within the grey number cate-gories there are still (a very few) numbers that are highly active in SMScommunication. The proposed beta-binomial classification model (dis-cussed in detail in Section 6) will take into account this fact. Intuitively,the model detects a spam number only when it is observed to have sig-nificant interaction with the grey space. Given a majority of the greynumbers that are SMS inactive, the chance that a phone number is mis-

is tied to mobile devices instead of specific phone num-bers, whose behaviors can change over time (e.g., whena user upgrades the device). Furthermore, grey numberscan be identified directly based on the IMEIs in the CDRdata with little cost, as opposed to querying and main-taining service plan information for individual users.

5.3 Characterizing Grey Phone SpaceWe next study the distribution of grey numbers and showhow grey phone space can help us detect spamming ac-tivities.Fig. 8 shows the size of each area code in the phone

space (the x-axis, in terms of the number of active phonenumbers) and the proportion of grey phone numbers outof all active phone numbers in that area code (the y-axis).The correlation coefficient of two dimensions is close to0, indicating that grey numbers exist in both densely andsparsely populated areas. The wide distribution of greynumbers ensures a better chance of detecting spam num-bers equipped with random spamming strategies. To il-lustrate this point, we calculate the proportion of greynumbers out of all the numbers accessed by spam num-bers (red solid curve) and legitimate users (blue dottedcurve). We observe that a predominant portion of legit-imate users never touch grey phone space. In fact, lessthan 1% of the users have ever accessed grey numbers inthe 1 month observation period. In addition, we show thesame distribution for legitimate users (who have sent toat least 50 recipients in a month) conditioned on havingtouched at least one grey number. Compared to the spamnumbers which tend to access more grey numbers (redsolid curve), these legitimate users communicate withmuch fewer grey numbers. In most cases, the accessof grey numbers is triggered by users replying to spamnumbers who usually use M2M devices to launch spam.

classified as a spam number due to its interaction with these outliers inthe grey space is very small.

9

Page 11: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

10 22nd USENIX Security Symposium USENIX Association

5.4 Discussion: Greyspace vs. Darkspace

In addition to the grey phone space, the “dark” phonespace (i.e., formed by unassigned phone numbers) canalso be a choice for detecting spam activities usingthe same technique proposed in this paper. Analogousconcepts of grey IP addresses and dark IP addressesfor detecting anomalous activities have been exploredin [12,13]. However, unlike IP addresses which are oftenassigned to organizations in blocks (i.e., sharing the sameIP prefix), the phone number space is shared by differ-ent cellular service providers, landline service providersand even (IP) TV providers. Even if some phone num-bers are assigned in blocks initially to a certain provider,the frequent phone number assignment changes causedby new user subscription, old user termination, recy-cling of phone numbers and phone number porting in/outbetween different providers will ultimately result in theshared ownership of the phone number space as we haveseen today. For example, different cellular and landlineproviders can have phone numbers under the same legit-imate area code. It is difficult to tell which phone num-ber belongs to which provider without inquiring the rightprovider.This poses significant challenges when we want to

identify dark (unassigned) phone numbers. As darkphone numbers can be anywhere in the phone numberspace (within legitimate area codes) and can belong toany provider, it is rather difficult to determine a darknumber, at least from the perspective of a single provider.For instance, just because a phone number is not assignedto any user/device belonging to a particular provider, itdoes not necessarily mean that such a number is dark. Inother words, accurate detection of dark numbers requiresthe collaboration of all the owners of the phone num-ber space, which is an intractable task. Meanwhile, suchdark number repository needs to be updated frequentlyto reflect the changes of phone number assignments.In comparison, grey numbers can be defined easily

with respect to a particular provider: these are phonenumbers assigned to devices belonging to customers ofthat provider where there are usually less SMS activitiesoriginated from these numbers (devices). Meanwhile,whether a number is grey is readily available to us (basedon the existing the IMEI numbers inside CDR records)without any extra work.

6 System Design

In this section, we first present an overview of Greystar.We then introduce the detection model and how wechoose parameters for the model.

6.1 System OverviewThe logic of Greystar is illustrated in Alg. 1, which runsperiodically at a predefined frequency. In our experi-ment, we run Greystar hourly. Greystar employs a timewindow of W (e.g., W equals 24 hours in our studies).The footprint of each SMS originating number s, e.g.,the sets of grey and non-grey numbers accessed by s (de-noted as Gs and Ns, respectively), are identified from theCDR data within W . After that, a filtering process isconducted which asserts two requirements on originat-ing numbers to be classified, i.e., in the past 24 hours:i) the sender is active enough (which has sent messagesto no less than M = 50 recipients. Recall the high send-ing rates of known spam numbers in Fig. 3); and ii) thesender has touched at least one grey number. These twocriteria, especially the second one, can help significantlyreduce the candidates to be classified in the follow-upstep. In fact, we find that, on average, less than 0.1%of users send SMS to grey numbers in each day. Moreimportantly, these users cover a majority of active SMSspammers in the network as we shall see in Section 7.As a consequence, this filtering step can noticeably re-duce the system load as well as potential false alarms.

Algorithm 1 Greystar algorithm.1: Input: CDR records D from the past W = 24 hours, M=50;2: Output: Spam number candidates C;3: From D, extract all SMS senders Orig;4: for each s ∈ Orig do5: Extract the CDR records associated with s: Ds ⊂ D;6: From Ds, identify the grey numbers Gs and non-grey

numbers Ns accessed by s;7: if |Gs|+ |Ns| ≥ M and |Gs| > 0 then8: if detect spamnbr(Gs, Ns)=1 then9: C := C∪{s};

10: end if11: end if12: end for

Once a sender passes the filtering process, the functiondetect spamnbr is called to classify the sender into eithera spam number or a legitimate number based on Gs andNs associated with that sender. In this paper, we proposea novel Beta-Binomial model for building the classifier,which we explain in detail next.

6.2 Classifier DesignWe assume a random SMS spammer selects spammingtargets following a two-step process. First, the spammerchooses a specific target phone number block. Second,the spammer uniformly chooses target phone numbersfrom that block. Let θ denote the density of grey num-bers in the target block and X := {xi},1 ≤ i ≤ n be the

10

Page 12: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

USENIX Association 22nd USENIX Security Symposium 11

sequence of target phone numbers selected. Meanwhile,let k be the number of grey numbers in X . The target se-lection process can then be formulated as the followinggenerative process.

1. Choose a target block with grey number density θ ;

2. Choose xi ∼ Bernoulli(θ ), 1 ≤ i ≤ n;

We note that θ varies as a spammer chooses differ-ent phone number blocks. The choice of phone num-ber blocks is arbitrary. For example, A spammer canchoose a large phone block across multiple area codes ora small one consisting of only a fraction of phone num-bers within one area code. Therefore, θ itself can beconsidered as a random variable. We assume θ follows aBeta distribution8, i.e., θ ∼ Beta(α,β ), with a probabil-ity density function as:

P(θ |α,β ) =Γ(α + β )

Γ(α)Γ(β )θ α(1−θ )β

,

where Γ is the gamma function. Therefore, the randomvariable k follows a Beta-Binomial distribution:

P(k|n,α,β ) =

(

nk

)

Γ(k + α)Γ(n− k + β )

Γ(n + α + β )

Γ(α + β )

Γ(α)Γ(β )

The target selection process of legitimate users can beexpressed using the same process. Because legitimateusers tend to communicate less with grey numbers, theircorresponding θ ∗’s are usually much smaller. Let α∗ andβ ∗ be the parametrization of the Beta distribution asso-ciated with θ ∗. For a phone number that has accessed ntargets, out of which k are grey numbers, we classify itas a spam number (i.e., detect spamnbr returns 1) if

P(spammer|k,n)

P(legitimate|k,n)=

P(k|n,α,β )P(spammer)P(k|n,α∗,β ∗)P(legitimate)

> 1,

where the first equation is derived using the Bayes theo-rem. It is equivalent to

P(k|n,α,β )

P(k|n,α∗,β ∗)>

P(legitimate)P(spammer)

= η

In practice, it is usually unclear how many spammers arein the network, therefore, to estimate η directly is chal-lenging. We instead choose η through experiments.

8In Bayesian inference, the Beta distribution is the conjugate priorprobability distribution for the Bernoulli and binomial distributions. In-stead of using the Bernoulli model, we can model the second stage ofthe target selection process as sampling from a multinomial distribu-tion corresponding to different device types. In this case, the conjugateprior distribution of the multinomial parameters is the Dirichlet distri-bution. However, our preliminary experiments show little performancegain from applying the more sophisticated model in comparison to theincreased computation cost.

6.3 Parameter Selection

There are five parameters to be estimated in the classifier,α, β , α∗, β ∗ and η . We use the data from January 2012 todetermine these parameters. To obtain ground truth, wesubmit to the fraud agents a list of all the SMS sendersthat i) have sent to more than 50 recipients in a 24 hourtime window; and ii) at least one of the recipients is grey(recall the filtering criteria in Algorithm 1). Fraud agentscarry out investigation on these numbers for us and labelspam numbers in the list. We then divide the January datainto two subsets, the first two weeks of data for fittingthe Beta-binomial models (i.e., to determine the first fourparameters) and the rest of data is reserved for testing theclassifier to estimate η .

In particular, using the training data set, we estimatethe parameters for two Beta-binomial models using max-imum likelihood estimation. With the estimated pa-rameters, we illustrate the probability density functionθ ∼ Beta(α,β ) and θ ∗ ∼ Beta(α∗, β ∗) in Fig. 10. Thedensity functions agree with our previous observationsin Fig. 9. The mass of the probability function corre-sponding to the legitimate users concentrates on a narrowregion close to 0, implying that legitimate users commu-nicate much less with grey numbers than non-grey num-bers. In contrast, the density associated with spam num-bers widely spreads out, indicating more grey numbersare touched by spam numbers due to their random targetselection strategies.We evaluate the accuracy of the classifier given differ-

ent choices of η on the test data set and the ReceiverOperating Characteristic (ROC) curve is displayed inFig. 11. The x-axis represents the false alarm rate (orthe false positive rate) and the y-axis stands for the truedetection rate (or the true positive rate). From Fig. 11,with a certain η , Greystar can detect more than 85%spam numbers without producing any false alarm. Wewill choose this η value in the rest of our experiments9.

7 Greystar Evaluation

In this section, we conduct an extensive evaluation ofGreystar using five months of CDR data and compare itwith the methods based on victim spam reports in termsof accuracy, detection delay and the effectiveness in re-ducing spam traffic in the network.

9Note that the exact parameter values used in Greystar are propri-etary and we are not able to release them in the paper. We have alsotested the choice of η using different partitioning of the training/testdata. The η remains stable across experiments.

11

Page 13: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

12 22nd USENIX Security Symposium USENIX Association

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

False positive rate

True

pos

itive

rate

Figure 11: ROC curve (false positiverate vs. true positive rate.

Jan Feb Mar Apr May

Num

ber o

f TN

s0

2000

4000

6000

8000

1000

0

ConfirmedMissedAdditional

Figure 12: Accuracy evaluation (incomparison to victim spam reports).

1 2 5 10 20 50 100 500

0.0

0.2

0.4

0.6

0.8

1.0

Delay (hours in log scale)

CD

F

Report 1+Report 3+

Figure 13: Detection speed comparedto spam report based methods.

7.1 Accuracy Evaluation

To estimate the accuracy and the false alarm rate, weagain consult with the fraud agents to check the num-bers from Greystar detection results. False negatives (ormissed detections), on the other hand, are more difficultto identify. Given the huge number of negative examplesclassified, we are unable to have all of them examined bythe fraud agents to identify all missed detections becauseof the high manual investigation cost. As an alternativesolution, we compare Greystar detection results with vic-tim spam reports to obtain a lower bound estimate of themissed detections.

More formally, let Sg denote the detection results fromGreystar and Sc be the spam numbers contained in thevictim spam reports received during the same time pe-riod. We define missed detections of Greystar as Sc −Sg.In addition, we define additional detections of Greystaras Sg − Sc to measure the value brought by Greystar tothe existing spam defense solution. The monthly accu-racy evaluation results are displayed in Fig. 12.

The blue bars in Fig. 12 illustrate the spam numbersvalidated by fraud agents in each month. Greystar is ableto detect thousands of spam numbers per month. The as-cending trend of detected spam numbers coincides withthe increase of victim spam reports in the five-month ob-servation window. This implies that Greystar is able tokeep up with the increase of spam activities. In additionto the large number of true detections, Greystar is highlyaccurate given only two potential false alarms are identi-fied by fraud agents in 5 months. Interestingly, these twonumbers are associated with tenured smartphone userswho suddenly behave abnormally and initiate SMS mes-sages to many recipients whom they have never commu-nicated with in the past. We suspect these users havebeen infected by SMS spamming malware that launchspam campaigns from the users’ devices without theirconsent. To identify SMS spamming malware and hence

Time (hours) in a week

# sp

am m

sgs

(nor

mal

ized)

02a

4a6a

8a10

a12

a

0 50 100 150

Total Report 1+ Greystar

Figure 14: Number of spam messages after restriction.

removing such false alarms will be our future work.In comparison to the victim spam reports, Greystar

detects over 1000 addition spam numbers that were notreported by spam victims while missing less than 500monthly. Meanwhile, although a majority of the spamnumbers detected by Greystar are also reported by spamvictims, Greystar can detect these numbers much fasterthan methods based on victim reports, and consequentlycan suppress more spam messages in the network. Weillustrate this point in the next section.

7.2 Detection Speed and Benefits to Cellu-lar Carriers

We note that, to reduce noise, cellular carriers often relyon multiple spam reports (e.g., K reports) from differentvictims to confirm a spam number. We refer to such acrowdsourcing method as the K+ algorithm. To evaluatethe speed of Greystar, we compare it with two versions ofthe K+ algorithms, namely, 1+ and 3+. Comparing with

12

Page 14: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

USENIX Association 22nd USENIX Security Symposium 13

1+ supplies us with the lower bound of the time differ-ence and comparison with 3+ illustrates the real benefitbrought by Greystar to practical spam defense solutions.More specifically, we measure how many hours Greystardetects a spam number ahead of 1+ and 3+, respectively.Fig. 13 shows the CDF curves of the comparison results,where we highlight the location on the x-axis correspond-ing to 24 hours with a green vertical line. We observe thatGreystar is much faster than K+ algorithms. For exam-ple, Greystar is one day ahead of 1+ in 50% of the casesand is one day before 3+ in more than 90% of the times.We find that, on average, it takes less than 1.2 hours

for Greystar to detect a spam number after it starts spam-ming (i.e., starts sending messages to more than 50 vic-tims in an hour). The fast response time of Greystar isaccredited to the much larger population of grey num-bers, from which Greystar can gather evidence to detectmore spam numbers more quickly. In addition, collect-ing evidence passively from grey numbers eliminates thedelay during the human reporting process (recall Fig. 2).Therefore, Greystar is characterized with a much fasterdetection speed than the K+ algorithm. Such a gain inthe detection speed can lead to more successful reductionof spam traffic in the network. We illustrate this pointnext.

For simplicity, we assume a spam number can be in-stantly restricted after being detected. We run simulationon a one week dataset (the first week of January 2012)and calculate the number of spam messages appearingin each hour assuming a particular spam detection algo-rithm is deployed exclusively in the network. The resultsare illustrated in Fig. 14. The total spam messages arecontributed by known spam numbers observed in thatweek. We observe that Greystar can successfully sup-press the majority of spam messages. During peak hourswhen the total number of spam messages exceeds 600K,only around 150K remains after Greystar is deployed. Inother words, Greystar leads to an overall reduction of75% of spam messages during peak hours. In compar-ison, 1+ only guarantees a spam reduction of 50% dueto long detection delay. We note that, due to the noisein the spam reports, cellular providers often employ K+(K ≥ 3) instead of 1+ to avoid false alarms. In this case,the benefit from Greystar is even more substantial.

7.3 Analysis of Missed DetectionsIn this section, we investigate the missed detections(false negatives) from Greystar, i.e., the spam numbercandidates that were not detected by Greystar but havebeen reported by spam victims. There are around 500such numbers in each month and totally around 27Kmissed detections. We note that we focus only on a sub-set of the candidates who are customers of the cellular

network under study, for whom we have access to a muchricher set of information sources to carry out the inves-tigation. We believe the conclusions from analyzing thissubset of candidates also apply for other candidates out-side the network.

We classify these candidates into three groups basedon the volume of the associated CDR records.

No volume. We do not observe any CDR record for19.5% of the numbers. We inquiry the SMS billingrecords for these numbers and find that many of theminitiate a vast amount of SMS traffic to foreign countries,such as Canada and Jamaica, etc., and hence no CDRrecord has been collected to trigger Greystar detection.

Low volume. We find around 27% of the missed de-tections have accessed less than 50 recipients during theobservation period. We study the text content inside thevictim spam reports to understand the root cause of thesemissed detections. The most popular text content areparty advertisements and promotions from local restau-rants. Users are likely to have registered with these mer-chants in the past and hence received ads from them. Forthe rest of the numbers, we find many send out spammessages to advertise mobile apps and premium SMSservices. From the users’ comments posted on online fo-rums and social media sites [14, 15], we find two of theadvertised apps are messenger/dating apps which haveissues with their default personal settings. Without man-ual correction, these apps, once initiated, will send outfriend requests to a few random users of the apps. Spammessages from the remaining numbers are also likely tobe sent out without users’ consent, especially the onesthat broadcast premium SMS services. We suspect theyare caused by apps abusing permissions or even behav-iors of malware apps. For example, one app advertisedby spam is reported to contain malware that sent SMStext to the contact list on the infected device, where thetext contains a URL for downloading that malware.

High volume. The rest of the phone numbers send SMSto a large number of recipients. From the reported spamtext, we find 7.1% of them belong to legitimate advertiserwho broadcast to registerred customers and are somehowreported by the recipients. For the rest of numbers, wefind their spam topics are quite different from those ofthe detected ones. In particular, 11% of these numbersare associated with adult sites or hotlines, in comparisonto only 0.06% among the detected numbers. Meanwhile,17.6% of them advertise local shopping deals, as op-posed to only 2.1% among the detected ones. Such dif-ference suggests that these spam victims somehow gaveout their phone numbers to spammers, e.g., while visitingmalicious sites to register services or to purchase prod-ucts. In addition, we extract the voice call history associ-ated with these high volume candidates. Interestingly, we

13

Page 15: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

14 22nd USENIX Security Symposium USENIX Association

find that about 4% of these numbers have initiated phonecalls to many terminating numbers in the past. We sus-pect that these spammers employ auto-dialers to harvestactive phone numbers (i.e., the ones that have answeredthe calls) from the phone number space. With the listof active phone numbers, spammers can send spam moreeffectively and avoid detection in the mean time.

Admittedly, there are spam numbers in these threecategories that are missed by Greystar because they areequipped with a target number list obtained through auto-dialing or social engineering techniques (for example,accurate target lists can potentially be obtained by ap-plying techniques discussed in [16]). SMS traffic fromthese users is not differentiable from that of the legiti-mate users. However, we emphasize that these misseddetections only account for less than 9% of all the spamnumbers detected and they will not have a significant im-pact on the efficacy of Greystar for reducing the overallspam traffic. In fact, we find that, on average, the misseddetections sent 37% less spam messages in comparisonto the spam numbers detected by Greystar. On the otherhand, we do see the needs of combining Greystar andother methods to build a more robust defense solution.For example, many malicious activities can be better de-tected by correlating different channels (e.g., voice, SMSand data). Meanwhile, cellular carriers can collaboratewith mobile marketplace to detect and control suspiciousapps that can potentially initiate spam.

8 Related Work

The demographic features and network behaviors ofSMS spammers were analyzed in [6]. [16] investigatedthe security impact of SMS messages and discussed thepotential of denying voice service by sending SMS tolarge and accurate phone hitlists at a high rate. Mean-while, [16] also discussed several ways of harvesting ac-tive phone numbers, which can potentially be employedby SMS spammers to generate accurate target numberlists to launch spam campaign more efficiently and toevade detection. Similar short message services carriedby the data channel were also studied. For example, [17]characterized spam campaigns from “wall” messages be-tween Facebook users. [18–21] analyzed Twitter spam.[22, 23] studied talkback spam on weblogs. Meanwhile,akin to SMS spammers, the behaviors of email spam-mers were characterized in [24–27]. In comparison, wenot only study the strategies of SMS spammers but alsopropose an effective spam detection solution based onour analysis.

In addition to the victim spam reports mentioned ear-lier, network behaviors of spammers, e.g., sending pat-terns, have been used in SMS spam detection, suchas [28]. Similar network statistics based methods de-

signed for email spam detection can also be applied foridentifying SMS spam, such as [29–32]. However, thesemethods often suffer from large false positive rates, be-cause many legitimate customers can exhibit SMS send-ing patterns similar to those of spammers. In contrast,Greystar utilizes a novel concept of grey phone spaceto detect spam numbers, which yields an extremely lowfalse alarm rate.

Some systems have been developed in the form ofsmartphone apps to classify spam messages on user mo-bile devices [33–35]. However, not all mobile devicessupport executing such apps. Furthermore, from a user’sperspective, this method is a late defense as the spammessage has already arrived on his/her device and theuser may already be charged for the spam message.Moreover, the high volume of spam messages that havealready traversed the cellular network may have resultedin congestion and other adverse network performanceimpacts. Greystar is deployed inside the carrier networkand hence do not have these drawbacks. As we have seenin Section 7, Greystar can quickly detect spam numbersonce they start spamming and hence significantly reducespam traffic volume in the network.

Similar to our work, many works have leveraged un-wanted traffic for anomaly detection, such as Internetdark space [13, 36], grey space [12], honeynet [37, 38]and failed DNS traffic [39], etc. We are the first to ad-vance the notion of grey phone space and propose a novelstatistical method for identifying SMS spam using greyphone space.

9 Conclusion and Future Work

In this paper, we presented the design of Greystar, an in-novative system for fast and accurate detection of SMSspam numbers. Greystar monitors a set of grey phonenumbers, which signify impending spam activities tar-geting a large number of mobile users, and employs anadvanced statistical model for detecting spam numbersaccording to their interactions with grey phone numbers.Using five months of SMS call detail records collectedfrom a large cellular network in the US, we conductedextensive evaluation of Greystar in terms of the detec-tion accuracy and speed, and demonstrated the great po-tential of Greystar for reducing SMS spam traffic in thenetwork.

Our future work will focus on applying Greystar to de-tect other suspicious activities in cellular networks, suchas telemarketing campaigns. Meanwhile, we will corre-late Greystar detection results with cellular data traffic todetect malware engaged in such spamming activities.

14

Page 16: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

USENIX Association 22nd USENIX Security Symposium 15

Acknowledgments

The work was supported in part by the NSF grants CNS-1017647 and CNS-1117536, the DTRA grant HDTRA1-09-1-0050. We thank Peter Coulter, Cheri Kerstetter andColin Goodall for their useful discussions and construc-tive comments. Finally, we thank our shepherd, PatrickTraynor, for his many suggestions on improving the pa-per.

References

[1] Federal communications commission. Spam:unwanted text messages and email, 2012.http://www.fcc.gov/guides/spam-unwanted-text-messages-and-email.

[2] Mobile spam texts hit 4.5 billion.http://www.businessweek.com/news/2012-04-30/mobile-spam-texts-hit-4-dot-5-billion-raising-consumer-ire.

[3] C. Baldwin. 350,000 different typesof spam sms messages were tar-geted at mobile users in 2012, 2013.http://www.computerweekly.com/news/2240178681/350000-different-types-of-spam-SMS-messages-were-targeted-at-mobile-users-in-2012.

[4] 69% of mobile phone users get text spam, 2012.http://abcnews.go.com/blogs/technology/2012/08/69-of-mobile-phone-users-get-text-spam/.

[5] Y. Jin, N. Duffield, A. Gerber, P. Haffner, W.-L.Hsu, G. Jacobson, S. Sen, S. Venkataraman, andZ.-L. Zhang. Making sense of customer tickets incellular networks. In Proc. of the 30th IEEE In-ternational Conference on Computer Communica-tions, 2011.

[6] I. Murynets and R. Jover. Crime scene investiga-tion: Sms spam data analysis. In Proc. of the 12thACM Internet Measurement Conference, 2012.

[7] S. Sinha, M. Bailey, and F. Jahanian. ImprovingSPAM blacklisting through dynamic thresholdingand speculative aggregation. In Proc. of the 17thAnnual Network and Distributed System SecuritySymposium, 2010.

[8] J. Jung and E. Sit. An empirical study of spam traf-fic and the use of DNS black lists. In Proc. of the4th ACM Internet Measurement Conference, 2004.

[9] A. Ramachandran, N. Feamster, and D. Dagon. Re-vealing botnet membership using dnsbl counter-intelligence. In Proc. of the 2nd Workshop on

Steps to Reducing Unwanted Traffic on the Inter-net, 2006.

[10] Y. Xie, F. Yu, K. Achan, R. Panigrahy, G. Hulten,and I. Osipkov. Spamming botnets: signatures andcharacteristics. In Proc. of the 2008 ACM SIG-COMM Annual Conference, 2008.

[11] M. Shafiq, L. Ji, A. Liu, J. Pang, and J. Wang.A first look at cellular machine-to-machine traffic:large scale measurement and characterization. InProc. of the 2012 ACM International Conferenceon Measurement and Modeling of Computer Sys-tems, 2012.

[12] Y. Jin, G. Simon, K. Xu, Z.-L. Zhang, and V. Ku-mar. Gray’s anatomy: dissecting scanning activitiesusing ip gray space analysis. In Proc. of the 2ndWorkshop on Tackling Computer Systems Problemswith Machine Learning Techniques, 2007.

[13] R. Pang, V. Yegneswaran, P. Barford, V. Paxson,and L. Peterson. Characteristics of internet back-ground radiation. In Proc. of the 4th ACM Internetmeasurement conference, 2004.

[14] Sms watchdog. http://www.smswatchdog.com.

[15] 800notes - Directory of unknown callers.http://www.800notes.com.

[16] W. Enck, P. Traynor, P. McDaniel, and T. La Porta.Exploiting open functionality in sms-capable cellu-lar networks. In Proc. of the 12th ACM Conferenceon Computer and Communications Security, 2005.

[17] H. Gao, J. Hu, C. Wilson, Z. Li, Y. Chen, andB. Zhao. Detecting and characterizing social spamcampaigns. In Proc. of the 10th ACM Internet Mea-surement Conference, 2010.

[18] S. Ghosh, B. Viswanath, F. Kooti, N. Sharma,G. Korlam, F. Benevenuto, N. Ganguly, andK. Gummadi. Understanding and combating linkfarming in the twitter social network. In Proc. ofthe 21st International World Wide Web Conference,2012.

[19] K. Thomas, C. Grier, V. Paxson, and D. Song. Sus-pended accounts in retrospect: an analysis of Twit-ter spam. In Proc. of the 11th ACM Internet Mea-surement Conference, 2011.

[20] C. Yang, R. Harkreader, J. Zhang, S. Shin, andG. Gu. Analyzing spammers’ social networks forfun and profit: a case study of cyber criminalecosystem on twitter. In Proc. of the 21st Inter-national World Wide Web Conference, 2012.

15

Page 17: Greystar: Fast and Accurate Detection of SMS Spam Numbers ...€¦ · SMS communication as other mobile numbers do (e.g., those associated with smartphones), they thereby form a grey

16 22nd USENIX Security Symposium USENIX Association

[21] C. Grier, K. Thomas, V. Paxson, and M. Zhang.@spam: the underground on 140 characters or less.In Proc. of the 17th ACM Conference on Computerand Communications Security, 2010.

[22] E. Bursztein, P. Lam, and J. Mitchell. Track-back spam abuse and prevention. In Proc. of the2009 ACM workshop on Cloud computing security,2009.

[23] E. Bursztein, B. Gourdin, and J. Mitchell. Reclaim-ing the blogosphere talkback a secure linkback pro-tocol for weblogs. In Proc. of the 16th Euro-pean Symposium on Research in Computer Secu-rity, 2011.

[24] C. Kreibich, C. Kanich, K. Levchenko, B. Enright,G. Voelker, V. Paxson, and S. Savage. On thespam campaign trail. In Proc. of the 1st USENIXWorkshop on Large-Scale Exploits and EmergentThreats, 2008.

[25] C. Kreibich, C. Kanich, K. Levchenko, B. Enright,G. Voelker, V. Paxson, and S. Savage. Spamcraft:An inside look at spam campaign orchestration. InProc. of the 2nd USENIX Workshop on Large-ScaleExploits and Emergent Threats, 2009.

[26] C. Kanich, C. Kreibich, K. Levchenko, B. Enright,G. Voelker, V. Paxson, and S. Savage. Spamalytics:An empirical analysis of spam marketing conver-sion. Communications of the ACM, 52(9):99–107,2009.

[27] A. Pathak, F. Qian, C. Hu, M. Mao, and S. Ranjan.Botnet spam campaigns can be long lasting: evi-dence, implications, and analysis. In Proc. of the2009 ACM International Conference on Measure-ment and Modeling of Computer Systems, 2009.

[28] Q. Xu, E. Xiang, Q. Yang, J. Du, and J. Zhong. Smsspam detection using noncontent features. Intelli-gent Systems, IEEE, 27(6):44 –51, 2012.

[29] T. Ouyang, S. Ray, M. Rabinovich, and M. Allman.Can network characteristics detect spam effectivelyin a stand-alone enterprise? In Proc. of the 12thPassive and Active Measurement conference, 2011.

[30] M. Sirivianos, K. Kim, and X. Yang. Introducingsocial trust to collaborative spam mitigation. InProc. of the 30th IEEE International Conference onComputer Communications, 2011.

[31] S. Hao, N. Syed, N. Feamster, A. Gray, andS. Krasser. Detecting spammers with snare: spatio-temporal network-level automatic reputation en-gine. In Proc. of the 18th USENIX Security Sym-posium, 2009.

[32] A. Pitsillidis, K. Levchenko, C. Kreibich,C. Kanich, G.M. Voelker, V. Paxson, N. Weaver,and S. Savage. Botnet judo: Fighting spam withitself. In Proc. of the 17th Annual Network andDistributed System Security Symposium, 2010.

[33] K. Yadav, P. Kumaraguru, A. Goyal, A. Gupta,and V. Naik. Smsassassin: crowdsourcing drivenmobile-based system for sms spam filtering. InProc. of the 12th Workshop on Mobile ComputingSystems and Applications, 2011.

[34] G. Cormack, J. Hidalgo, and E. Sanz. Feature en-gineering for mobile (sms) spam filtering. In Proc.of the 30th international ACM SIGIR conference,2007.

[35] H. Tan, N. Goharian, and M. Sherr. $100,000 PrizeJackpot. Call now! Identifying the pertinent fea-tures of SMS spam. In Proc. of the 35th AnnualACM SIGIR Conference, 2012.

[36] E. Wustrow, M. Karir, M. Bailey, F. Jahanian, andG. Huston. Internet background radiation revis-ited. In Proc. of the 10th ACM Internet measure-ment conference, 2010.

[37] The honeynet project, 2012.http://project.honeynet.org/.

[38] G. Dunlap, S. King, S. Cinar, M. Basrai, andP. Chen. Revirt: enabling intrusion analysis throughvirtual-machine logging and replay. In Proc. of the2nd USENIX Symposium on Operating Systems De-sign and Implementation, 2002.

[39] N. Jiang, J. Cao, Y. Jin, L. Li, and Z.-L. Zhang.Identifying suspicious activities through dns failuregraph analysis. In Proc. of the 8th IEEE Interna-tional Conference on Network Protocols, 2010.

16