[ieee multimedia technology (ic-bnmt 2010) - beijing, china (2010.10.26-2010.10.28)] 2010 3rd ieee...

Proceedings of IC-BNMT2010

A SELF LEARNING MODEL FOR DETECTING SIP MALFORMED MESSAGE ATTACKS

Sohail Aziz, Mehroz Gul

Computer Science Department, National University of Computer and Emerging Sciences, Islamabad, Pakistan sohail_ [email protected], [email protected]

Abstract

This paper analyses the vulnerabilities that exist in the SIP protocol and how attackers can exploit them to attack SIP-based networks, i.e. VoIP and IMS (IP Multimedia Subsystem). An attack tool is developed to exploit those vulnerabilities, and a two-gram self-learning solution is proposed to protect SIP-based networks from these attacks.

Keywords: SIP malformed messages, self learning, SIP fuzzing, malformed message detection, two-gram detection model, SIP attack.

1 Introduction

Fixed and mobile networks have gone through a big transition in the last 20 years. Over the years, different standards were introduced to serve users better, and efforts have been made to lower the cost while improving the efficiency of communication. Now the invention of VoIP and the standardization of the IP Multimedia Subsystem (IMS) are changing the definition of communication. VoIP lowers the cost with improved voice quality, while IMS promises to provide all multimedia services independent of the transport medium.

However, these advanced technologies also open a new horizon for attackers. Being open to the Internet, both VoIP and IMS are vulnerable to a large number of application-layer attacks. Session Initiation Protocol (SIP) is the primary protocol for multimedia communications in VoIP and IMS. However, the defined SIP grammar leaves many loopholes which attackers can exploit to attack the SIP infrastructure.

In this paper we propose a "self-learning" solution to detect anomalous SIP message attacks. Our model learns the pattern of a normal SIP message from a benign traffic dataset and detects anomalous SIP messages by computing their deviation from the model of normality. The rest of the paper is arranged as follows. Section 2 describes the problem statement. We then discuss the attack tool used for creating malformed messages, describe the test bed and detection framework, and conclude the paper with the results section.

978-1-4244-6769-3/10/$26.00 ©2010 IEEE

2 Problem statement

SIP is an application-layer signalling protocol for creating, modifying, and terminating multimedia sessions with one or more participants [1]. It is the primary control protocol used in VoIP and IMS for multimedia sessions. Both VoIP and IMS are open to the Internet and hence give attackers an opportunity to bring down SIP servers by exploiting the vulnerabilities that exist in the SIP protocol. SIP is a text-based protocol like HTTP and hence is vulnerable to "malformed" message attacks. The term "malformed" refers to any SIP message which does not conform to the defined protocol standard. A malformed message attack can have severe effects on SIP servers. Like a flood attack, it can ultimately cause a Denial of Service (DoS) by crashing a SIP server, or it can otherwise cause significant application delays. The possible effects of a malformed SIP message attack are described below.

2.1 Denial of Service (DoS)

A malformed SIP attack can cause DoS by crashing the SIP parser or taking the SIP server to an undefined state. A server is considered to have crashed when it stops performing its expected functionality and stops providing service to the user. Denial of Service (DoS) is the primary effect of session tear-down. Specific-user DoS or wholesale DoS can occur, depending upon the target. A side effect of session tear-down is that the proxy may not be aware of the calls being torn down and will not have proper call records [2].

2.2 Application delays

Application delays are experienced by legitimate users when the server's resources (CPU, bandwidth, memory) are consumed in processing useless tasks. Significant application delays can be caused by sending anomalous SIP messages which contain unexpected message formats and/or values. This can lead the SIP server to an undefined state or at least keep it busy processing anomalous messages.

2.3 Privacy compromise

The security of user information is a major concern of SIP security. A malformed SIP attack such as SQL injection can cause major damage to important database records. Such attacks target the database layer of the application and can cause manipulation, addition, deletion and table drops in the database. This kind of activity can result in severe information loss that greatly diminishes the usability of SIP.

3 Attack tool

Many SIP fuzzers are publicly available; however, they only provide static dumps of malformed SIP messages. These tools also cannot be used for mimic attacks. We have therefore developed our own attack tool. Figure 1 shows the basic modules of the attack tool. The anomalous messages created by our attack tool are a superset of those produced by the publicly available fuzzers. Its functions can be classified as follows:

3.1 SIP message grammar

We first discuss the different loopholes that exist in the SIP grammar and then explain how these can be exploited to create anomalous SIP messages.

Below are some ABNF (Augmented Backus-Naur Form) grammar definitions of different SIP fields.

Request-Line = Method SP Request-URI SP SIP-Version CRLF

Method = INVITEm / ACKm / OPTIONSm / BYEm / CANCELm / REGISTERm / extension-method

transport-param = "transport=" ( "udp" / "tcp" / "sctp" / "tls" / other-transport )

other-transport = token

token = 1*(alphanum / "-" / "." / "!" / "%" / "*" / "_" / "+" / "`" / "'" / "~" )

Consider the Request-Line definition. Request-Line is defined by a method, and Method is defined as all legal SIP methods such as INVITE, ACK, OPTIONS, etc., plus an extension method which is undefined and can be exploited. Likewise, other-transport in transport-param is left open. All these undefined fields are kept for future use, but this openness leaves the SIP protocol open to malformed message attacks.
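The openness described above can be illustrated with a minimal Python sketch. It assumes, as in RFC 3261, that extension-method is defined as a token; the function name `classify_method` and its three labels are hypothetical and are not part of the authors' tool.

```python
import re

# RFC 3261 "token": one or more of alphanum - . ! % * _ + ` ' ~
TOKEN = re.compile(r"[A-Za-z0-9\-.!%*_+`'~]+")

# Methods named explicitly in the grammar excerpt.
KNOWN_METHODS = {"INVITE", "ACK", "OPTIONS", "BYE", "CANCEL", "REGISTER"}


def classify_method(method: str) -> str:
    """Return 'known', 'extension' or 'invalid' for a Request-Line method."""
    if method in KNOWN_METHODS:
        return "known"
    if TOKEN.fullmatch(method):
        # Grammatically legal, but its meaning is undefined.
        return "extension"
    return "invalid"


for candidate in ("INVITE", "Abcdefghijklmnopqrstuvwxyz", "%s%d%x%l" * 3, "IN VITE"):
    print(repr(candidate), "->", classify_method(candidate))
```

A parser that accepts every "extension" result without further checks is exposed to exactly the kind of arbitrary method strings discussed here.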


3.2 Malformed generation process

Malformed messages are generated by mutating different string sequences at all possible positions in a SIP message. A legal Request-Line of a SIP message is shown below.

INVITE sip:fast@opensip.com SIP/2.0

However, the following are also valid SIP request lines according to the SIP grammar mentioned above.

Abcdefghijklmnopqrstuvwxyz sip:[email protected] SIP/2.0

%s%d%x%l%s%d%x%l%s%d%x%l sip:[email protected] SIP/2.0

\275\267\288\290\245 sip:[email protected] SIP/2.0

These valid request lines can cause severe problems for the SIP parser in a server and/or SIP client. A malformed message can be one of the following types.

1) Formatted string sequences - Formatted strings have a special meaning in parsers and compilers. Large sequences of formatted strings can cause a buffer overflow in the SIP server.

2) ANSI escape sequences - ANSI escape characters, e.g. \b and \r, also have a special meaning in the computer system. Large string sequences of such characters can cause a buffer overflow and an undefined state in servers.

3) UTF-8 sequences - UTF characters are used to represent many different languages and also special symbols. However, some UTF-8 sequences are declared invalid. If such invalid UTF-8 sequences are mutated into different positions in a SIP message, they can cause problems for the SIP parser, including buffer overflow, undefined state and server crash.

4) Space/Null replication - Spaces and nulls are valid characters in a SIP message. However, their presence can be exploited because they act as token characters. Large space/null sequences can cause buffer overflow, undefined state, infinite loops and even server crash.

5) ASCII character replication - Many ASCII characters act as tokens in a SIP message.

From: "SUNRISE" <sip:[email protected]:5065>;tag=as2cd43bI8

A sample From field of a SIP message is shown above. Many ASCII characters are used as token characters, among them { : , < , > , @ , ; }. Token characters are processed differently and are usually used as conditions in compilers and parsers. If these characters are replicated at the positions where they occur, they can keep the parser busy in an infinite loop or drive it to an undefined state.

The fuzz types defined above are mutated into a normal SIP message. Formatted string sequences, ANSI escape sequences, UTF-8 sequences, space replication and null replication are performed by inserting these sequences at every possible index of a normal SIP message, i.e.

$$\sum_{i=0}^{n} \mathrm{Insert}(S_t, P_i), \qquad S_t \in \{\text{Formatted strings}, \text{Null}, \text{Space}, \text{UTF-8}, \text{Ansi-Escape}, \text{Ascii}\}$$

where $n$ is the SIP message length, $S_t$ is one of the fuzz types defined above and $P_i$ is the i-th index of the SIP message. ASCII mutation, however, is done differently. If x is any ASCII character, it is mutated only where it originally occurs:

$$\sum_{i=0}^{255} \sum_{j=0}^{n} \mathrm{Insert}(X_i, P_j)$$

where $n$ is the total number of occurrences of the ASCII character $X_i$ and $P_j$ is the position of the j-th occurrence of that character. The following is an example of the colon (:) mutation.

From:::::::::::::::::: "SUNRISE" <sip::::::::::::[email protected]::::::::::::::5065>;tag=as2cd43bl8
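The two mutation rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the replication count and the `alice@example.com` address are hypothetical, and the authors' attack tool may work differently.

```python
from typing import Iterator


def insert_everywhere(message: str, fuzz: str) -> Iterator[str]:
    """Yield one mutant per index i in 0..n, with `fuzz` inserted at index i."""
    for i in range(len(message) + 1):
        yield message[:i] + fuzz + message[i:]


def replicate_at_occurrences(message: str, char: str, times: int) -> str:
    """Replicate `char` `times` times at every position where it already occurs."""
    return message.replace(char, char * times)


# Hypothetical header; the address is a placeholder, not taken from the paper.
header = 'From: "SUNRISE" <sip:alice@example.com:5065>;tag=as2cd43b18'

format_mutants = list(insert_everywhere(header, "%s%d%x%l"))
colon_mutant = replicate_at_occurrences(header, ":", 12)

print(len(format_mutants), "format-string mutants")
print(colon_mutant)
```

The first function covers formatted-string, escape, UTF-8, space and null insertion; the second mirrors the colon example, where a delimiter is only replicated where it originally occurs.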

4 Self learning intrusion detection system

This section explains the proposed "self-learning" system for anomaly detection in SIP messages. The system is "self-learning" in that it is able to retrain itself automatically in order to adapt to changes in SIP message content.

Figure 1. Self learning model

The system first goes through a learning phase in which it learns the pattern of normal SIP messages of various kinds. Once it has passed through the learning phase, it has enough information in the self database to detect whether a newly arrived message is normal or anomalous. Figure 1 shows the modular structure of the intrusion detection system. The feature extractor extracts the features from the SIP message content and passes them to the learning module. The learning module stores these features in the self database. Once the training phase ends, the anomaly detector uses the features stored in the self database to compute the deviation of a newly arrived message's features, and based on this deviation from the normal model it declares the message normal or anomalous. The methods used for feature extraction and anomaly detection are described below.

4.1 Feature extraction

Two different methodologies are used for feature extraction: one-gram and two-gram.

4.1.1 One-gram

One-gram means moving a sliding window of size one over the message content and recording the frequency of each ASCII character. Figure 2 shows the vector space of all ASCII characters from 0 to 255 obtained at the end of the complete message.

Figure 2. One gram (1-gram frequency model)

4.1.2 Two-gram

Two-gram is similar to one-gram except that the window size is two. Again, a sliding window of size two is moved over the SIP message content one character at a time, and the frequencies of ASCII pairs are recorded. At the end of the complete message scan, we have a vector space of ASCII character pairs as shown below.

Figure 3. Two gram (2-gram frequency model)
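The sliding-window counting behind both feature types can be sketched as follows. The helper name `ngram_counts` is an assumption made for illustration; the sample string is the Via fragment used elsewhere in the paper.

```python
from collections import Counter


def ngram_counts(message: bytes, n: int) -> Counter:
    """Slide a window of size n over the message, one byte at a time,
    and count how often each n-byte sequence occurs."""
    return Counter(message[i:i + n] for i in range(len(message) - n + 1))


sample = b"Via: SIP/2.0/UDP"
one_gram = ngram_counts(sample, 1)   # keys are single bytes
two_gram = ngram_counts(sample, 2)   # keys are overlapping byte pairs

print(one_gram.most_common(3))
print(two_gram.most_common(3))
```

A dense vector of 256 (one-gram) or 256 x 256 (two-gram) entries can be derived from these counters; the sparse form is used here only to keep the sketch short.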

4.2 Anomaly detection

The anomaly detector uses the following distance formulas to compute the distance between the features in the self database and those of a newly arrived message.

4.2.1 Mahalanobis distance

Mahalanobis distance is most commonly used as a multivariate outlier statistic. It is computed by the following formula.

$$D^2 = (x - \mu)' \, \Sigma^{-1} (x - \mu)$$

with $\Sigma$ the covariance matrix of the distribution. $D$ is called the Mahalanobis distance of the point $x$ to the mean $\mu$ of the distribution. In Figure 4, both A and B are at the same Mahalanobis distance from the centre O.

Figure 4. Mahalanobis distance
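As a small worked illustration of why two points at different Euclidean distances can share one Mahalanobis distance, the sketch below evaluates the formula for a two-dimensional case. The covariance values and the points are made up for the example, and the function is restricted to 2x2 matrices to stay dependency-free.

```python
import math


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance D of point x from `mean` under a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]]
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    d_squared = (dx * (inv[0][0] * dx + inv[0][1] * dy)
                 + dy * (inv[1][0] * dx + inv[1][1] * dy))
    return math.sqrt(d_squared)


mean = (0.0, 0.0)
cov = ((4.0, 0.0), (0.0, 1.0))   # the first feature varies more than the second
point_a = (2.0, 0.0)             # Euclidean distance 2 from the mean
point_b = (0.0, 1.0)             # Euclidean distance 1 from the mean

print(mahalanobis_2d(point_a, mean, cov), mahalanobis_2d(point_b, mean, cov))
```

Because the first feature has four times the variance of the second, a deviation of 2 along it counts the same as a deviation of 1 along the second.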

4.2.2 Character distance

This is the second method used for calculating the distance between the normal model and an anomalous message. The total frequencies of all ASCII characters (in one-gram) and pairs of ASCII characters (in two-gram) are recorded in the learning phase. The distance between the ASCII characters of a newly arrived message and those of the learned model is computed by the following formula.

$$D = F(X) - \max(F(X)) + \mathrm{stdev}(F(X))$$

where $\max(F(X))$ is the maximum frequency of the character X in the self database and $\mathrm{stdev}(F(X))$ is the standard deviation of the frequency of X.
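A minimal sketch of the character distance follows, taking the formula literally as printed. Several points are assumptions rather than statements from the paper: F(X) is read as the count of X in the newly arrived message, the maximum and standard deviation are taken over per-message counts from training, the population standard deviation is used, and the threshold value is invented for the example.

```python
from statistics import pstdev


def character_distance(observed: int, training_counts: list) -> float:
    """D = F(X) - Max(F(X)) + stdev(F(X)), taken literally as printed."""
    return observed - max(training_counts) + pstdev(training_counts)


# Hypothetical per-message counts of ':' seen during training.
colon_counts = [4, 5, 6]
THRESHOLD = 3.0  # hypothetical cut-off, not a value from the paper

for observed in (5, 20):
    d = character_distance(observed, colon_counts)
    verdict = "anomalous" if d > THRESHOLD else "normal"
    print(f"count={observed} distance={d:.3f} -> {verdict}")
```

Because the distance can be computed as soon as one character's count is known, a detector built this way does not have to wait for the histogram of the whole message.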

4.3 Strategy one: one-gram

4.3.1 Feature extraction

The primary feature for this strategy is the frequency count of individual ASCII characters in a SIP message. First, a SIP message is classified according to its message type, e.g. INVITE, CANCEL, BYE, etc., and then the frequency of each individual ASCII character is recorded, as shown in Figure 5. The example below shows how a string from a SIP message is mapped to a vector space.

Via: SIP/2.0/UDP

Figure 5. Feature extraction (1-gram frequency model)

At the end of this phase, we have a number of histograms for the different SIP message types, each representing the normal model of the ASCII characters in that particular message type.
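The per-type learning step can be sketched as below. The names `message_type` and `learn`, the dictionary layout of the self database and the example messages are all hypothetical; the sketch also only handles requests, since a response start line begins with the SIP version rather than a method.

```python
from collections import Counter, defaultdict


def message_type(sip_message: str) -> str:
    """Use the first token of the start line (e.g. INVITE, BYE) as the type."""
    first_line = sip_message.split("\r\n", 1)[0]
    return first_line.split(" ", 1)[0]


def learn(messages):
    """Return {message type: list of per-message character histograms}."""
    self_db = defaultdict(list)
    for msg in messages:
        self_db[message_type(msg)].append(Counter(msg))
    return self_db


# Made-up benign messages, shortened to two lines each.
training = [
    "INVITE sip:bob@example.com SIP/2.0\r\nVia: SIP/2.0/UDP host\r\n\r\n",
    "BYE sip:bob@example.com SIP/2.0\r\nVia: SIP/2.0/UDP host\r\n\r\n",
    "INVITE sip:carol@example.com SIP/2.0\r\nVia: SIP/2.0/UDP host\r\n\r\n",
]
db = learn(training)
print({kind: len(histograms) for kind, histograms in db.items()})
```

Keeping one list of histograms per message type lets the detector compare an incoming INVITE only against what was learned from other INVITEs.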

4.3.2 Anomaly detection

1) Method-I (Character Distance) - In this approach, the distances between the individual ASCII characters of a SIP message and those in the self database are computed using the character distance formula. A SIP message is declared anomalous if an individual character's distance exceeds a specific threshold value.

Figure 6. Character distance (panels: 'B' occurrence in normal message; 'B' in anomalous message)

This diagram shows the domain of normality that is defined during training, with the different distance values D1, D2, D3 and D4. Nodes A, B and C denote the occurrences of the ASCII characters.

2) Method-II (Mahalanobis Distance) - In this approach, the Mahalanobis distance of each individual message from the self database is calculated. A SIP message is filtered on the basis of how much it deviates from the normal model. As some characters occur more frequently than others, different weights are assigned to different characters. ASCII characters '0' - '47', '58' - '64', '91' - '96' and '123' - '255' are assigned tuned weights based on how frequently they occur in normal messages. This methodology is very efficient for detecting minor fuzzing, because it widens the gap between the distances of normal and malicious packets, which improves anomaly detection.

Figure 7. Mahalanobis distance (panels: before weight implementation; after weight implementation)

In the above diagrams, nodes A and B define the Mahalanobis distance domain for normal packets in the training phase, with distances D1 and D2. Node M is a malicious packet that contains very minor fuzzing of special characters, with distance D3. This makes it difficult to identify as malicious, so the application would let it pass to the SIP server as a normal packet. To solve this problem we introduce a methodology of assigning weights to the special characters, which increases the distance of M to D4. In Figure 7 we can see that after applying the weights the malicious packet stands out as an outlier and can easily be detected as a malicious packet.
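The effect of weighting special characters can be sketched with a simplified distance. This is not the full Mahalanobis computation: it assumes a diagonal covariance (per-character variances only), and the weight of 5.0, the smoothing constant and the sample messages are hypothetical, since the paper does not list its tuned weights.

```python
from collections import Counter
from statistics import mean, pvariance

# Code ranges the paper singles out as special characters.
SPECIAL_RANGES = ((0, 47), (58, 64), (91, 96), (123, 255))
SPECIAL_WEIGHT = 5.0   # hypothetical; the paper does not give its weights
SMOOTHING = 1.0        # hypothetical; avoids division by a zero variance


def is_special(code: int) -> bool:
    return any(lo <= code <= hi for lo, hi in SPECIAL_RANGES)


def weighted_distance(message: bytes, training: list, weighted: bool) -> float:
    """Squared distance of a message's 1-gram counts from the training mean,
    using per-character variances only (a diagonal simplification)."""
    histograms = [Counter(m) for m in training]
    observed = Counter(message)
    total = 0.0
    for code in range(256):
        counts = [h[code] for h in histograms]
        term = (observed[code] - mean(counts)) ** 2 / (pvariance(counts) + SMOOTHING)
        if weighted and is_special(code):
            term *= SPECIAL_WEIGHT
        total += term
    return total


training = [b"Via: SIP/2.0/UDP a", b"Via: SIP/2.0/UDP b"]
fuzzed = b"Via::: SIP/2.0/UDP a"   # minor fuzzing: two extra colons

print(weighted_distance(fuzzed, training, weighted=False))
print(weighted_distance(fuzzed, training, weighted=True))
```

With the weight applied, the contribution of the replicated colon grows while ordinary letters are unaffected, which is the separation between D3 and D4 that the paragraph describes.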

4.4 Strategy two: two-gram

4.4.1 Feature extraction

The 1-gram model did not give the required results, which pushed us towards the 2-gram model. The calculation for the 2-gram model is the same as for the 1-gram model. The only difference is that in all operations we take two characters as a single entity for the frequency count (and other operations) instead of a single ASCII character (as in the 1-gram model). The example below shows how a string from a SIP message is mapped to a vector space.

Via: SIP/2.0/UDP

Figure 8. Feature extraction (2-gram frequency model)

4.4.2 Anomaly detection

1) Method-I (Character Distance) - In this scheme, the distance of each pair of characters (2-gram) from those in the self database is calculated using the character distance formula. It differs little from the one-gram scheme, except that the number of characters is now two instead of one.

2) Method-II (Mahalanobis Distance) - As in the one-gram scheme, the distance of the newly arrived message from the self database is calculated, but this time the number of characters is two instead of one. Note, however, that calculating the Mahalanobis distance requires the histogram of the complete message, and only then can a decision be made whether the message is anomalous or normal.

5 Results

In this section we discuss the detection accuracy and false positive analysis of the above-mentioned techniques for SIP malformed messages. The goal of these approaches was to save the CPU time spent on processing useless malformed messages while not rejecting any normal SIP message.

5.1 Strategy one: one-gram

Figure 9 shows the true positive rate versus the false positive rate for both one-gram anomaly detection schemes, i.e. Character Distance and Mahalanobis Distance. The x-axis represents the false positive rate while the y-axis shows the true positive rate, ranging from 0 to 100 percent. Initially the true positive rate for Mahalanobis Distance increases rapidly and reaches 86 percent, but after that it stays there, and varying the threshold factor only increases the false alarms. In contrast, the Character Distance scheme gives only a 44 percent detection rate at its lowest false alarm rate of 3 percent, while varying the threshold factor increases the false alarms. It can be concluded that Mahalanobis Distance gives a better true positive and false alarm rate than Character Distance. However, 86 percent detection with 3 percent false positives is still not a practical solution, because in diverse, real-time, fragile SIP systems the 14 percent of anomalous messages that go undetected can cause severe application delays or even complete service failure. Three percent false positives means that 30 legitimate requests out of 1000 would be rejected, which is simply intolerable for a commercial organization. To improve these results many techniques were applied, e.g. different characters were assigned different weights based on their frequency in normal messages, but this did not bring a considerable increase in the true positive rate. We then moved on to the two-gram model for better detection and fewer false alarms.

Figure 9. ROC, strategy one, one-gram (x-axis: false positive; y-axis: true positive; curves: 1-gram Character Distance, 1-gram Mahalanobis Distance)
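For readers who want to reproduce this kind of curve, the sketch below shows how true positive and false positive rates can be derived from distance scores at several thresholds. The scores and labels are toy values invented purely for illustration; they are not the paper's data, and the function name `roc_points` is hypothetical.

```python
def roc_points(scores, labels, thresholds):
    """Return (threshold, false positive rate, true positive rate) triples.

    labels: True for malformed, False for benign. A message is flagged
    when its distance score exceeds the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if y and s > t)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s > t)
        points.append((t, fp / negatives, tp / positives))
    return points


# Toy distance scores, invented purely for illustration.
scores = [0.2, 0.5, 0.9, 1.4, 2.5, 3.1, 4.0, 6.2]
labels = [False, False, False, False, True, False, True, True]

for t, fpr, tpr in roc_points(scores, labels, thresholds=[1.0, 3.0, 5.0]):
    print(f"threshold={t}: FPR={fpr:.2f} TPR={tpr:.2f}")
```

Raising the threshold lowers both rates, which is the trade-off the varying threshold factor explores in the results above.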

5.2 Strategy two: two-gram


Figure 10 shows the ROC curves for both two-gram anomaly detection schemes. It is apparent from the graph that two-gram gives better results in terms of true positives and false positives than one-gram. This was expected, because in these schemes not only is the characters' frequency recorded, but the principle of locality is also exploited by recording every single pair of the message's characters. However, in contrast to one-gram, Character Distance anomaly detection gives better results than Mahalanobis Distance here. The detection rate is 99.9 percent with 0.0028 percent false alarms for Character Distance, while Mahalanobis Distance gives almost the same true positive rate but with 12 percent false alarms. Mahalanobis Distance anomaly detection also needs to go through the whole message to compute the deviation, while Character Distance does not. This makes Character Distance more efficient in terms of detection speed as well. The performance of this scheme is discussed in the next section.

Figure 10. ROC, 2-gram (x-axis: false positive; curves: 2-gram Character Distance, 2-gram Mahalanobis Distance)

6 Performance evaluation


Different anomaly detection schemes have been discussed above. In this section we discuss the processing overhead of the two-gram Character Distance anomaly detection scheme, as it gives the best trade-off between true positive rate and false alarm rate. Self-learning anomaly detection models mostly depend on the training dataset: the number of training messages is directly proportional to the detection rate and inversely proportional to the detection speed. However, the detection speed of our 2-gram anomaly detection model does not depend on the number of training messages. Once the model has learned from the given number of benign messages, it gives an almost constant detection time of 0.3184 milliseconds for any anomalous message. Figure 11 shows the detection rate for a varying number of training messages. The x-axis shows the number of messages while the y-axis shows the detection rate for the two-gram Character Distance anomaly detection scheme. It can be observed that the detection rate increases continuously as the number of training messages increases. It starts from 99.95 percent at 50 training messages and reaches 99.9 percent at 350 training messages.

Figure 11. Two-gram character distance: training size vs detection rate (x-axis: number of training messages; y-axis: true positive rate)

Figure 12 shows the training time for a varying number of training messages. The times shown here were measured on a system with a dual-core 2 GHz processor and 2 GB of RAM, running Microsoft Windows Vista. It takes almost 28 milliseconds to process a SIP message in the training phase, while once the training phase ends it takes an average of 0.3184 milliseconds to detect a variety of malformed messages.

Figure 12. Two-gram character distance: training size vs time (x-axis: number of training messages; curve: training time)

References

[1] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, E. Schooler, Session Initiation Protocol, RFC 3261, June 2002.

[2] Mark Collier, Basic Vulnerability Issues for SIP Security, mark. [email protected].

[3] "The IMS: IP Multimedia Concepts and Services" by Miikka Poikselka, Aki Niemi, Hisham Khartabil, Georg Mayer (John Wiley & Sons), 2006.

[4] RFC 3261, www.ietf.org/rfc/rfc3261.

[5] ref [Sip _ Security030051] session tear down effect

[6] Ehlert, S. and Zhang, G. and Geneiatakis, D. and Kambourakis, G. and Dagiuklas, T. and Markl, J. and Sisalem, D., Two layer Denial of Service prevention on SIP VoIP infrastructures, Computer Communications, 2008.

[7] Geneiatakis, D. and Dagiuklas, T. and Lambrinoudakis, C. and Kambourakis, G. and Gritzalis, S., "Novel protecting mechanism for SIP-based infrastructure against malformed message attacks: Performance evaluation study", Proc. of the 5th International Conference on Communication Systems, Networks and Digital Signal Processing (CSNDSP'06), 2006.

[8] Geneiatakis, D. and Kambourakis, G. and Lambrinoudakis, C. and Dagiuklas, T. and Gritzalis, S., "A framework for protecting a SIP based infrastructure against malformed message attacks", Computer Networks, vol. 51, no. 10, pp. 2580-2593, 2007.

[9] www.ee.oulu.fi/research/ouspg/protos/testing/c07/sip

[10] http://sourceforge.net/projects/voiper/

[11] http://www.infiltrated.net/asteroid/
