skype morph

12
SkypeMorph: Protocol Obfuscation for Tor Bridges Hooman Mohajeri Moghaddam Baiyu Li Mohammad Derakhshani Ian Goldberg Cheriton School of Computer Science University of Waterloo Waterloo, ON, Canada {hmohajer,b5li,mderakhs,iang}@cs.uwaterloo.ca ABSTRACT The Tor network is designed to provide users with low- latency anonymous communications. Tor clients build cir- cuits with publicly listed relays to anonymously reach their destinations. However, since the relays are publicly listed, they can be easily blocked by censoring adversaries. Con- sequently, the Tor project envisioned the possibility of un- listed entry points to the Tor network, commonly known as bridges. We address the issue of preventing censors from detecting the bridges by observing the communications be- tween them and nodes in their network. We propose a model in which the client obfuscates its mes- sages to the bridge in a widely used protocol over the Inter- net. We investigate using Skype video calls as our target protocol and our goal is to make it difficult for the censor- ing adversary to distinguish between the obfuscated bridge connections and actual Skype calls using statistical compar- isons. We have implemented our model as a proof-of-concept pluggable transport for Tor, which is available under an open-source licence. Using this implementation we observed the obfuscated bridge communications and compared it with those of Skype calls and presented the results. Keywords Tor, bridges, Skype, pluggable transport, steganography, protocol obfuscation, censorship circumvention 1. INTRODUCTION Tor [21] is a low-latency anonymous communication over- lay network. In order to use Tor, clients contact publicly known directory servers, a fraction of the Tor network re- sponsible for tracking the topology of the network and node states. Directory servers allow clients to obtain a list of volunteer-operated relay nodes (also known as“onion routers”). The client then chooses some of these relays using the Tor software and establishes a circuit through these nodes to its desired destination. Clients’ traffic is then routed through the Tor network over their circuits, hiding users’ identities and activities. The Tor network not only provides anonymity, but also censorship resistance. To access a website censored in a user’s home country, the user simply connects to the Tor network and requests the blocked content to be delivered to him. However, since a list of Tor relays can be retrieved from publicly known directory servers, blocking all Tor con- nections can be simply done by blocking access to all Tor Figure 1: This graph from metrics.torproject.org shows the population of returning users accessing the Tor network without using bridges from Oct 2009 to March 2010, from China. It shows that, after Jan 2010, the Tor network is not accessible from China without using bridges. [32] relays based on their IP addresses. There have been many attempts to block access to Tor by regional and state-level ISPs. For instance, Figure 1 shows the blocking of the whole Tor network by the Great Firewall of China as of January 2010. In order to counteract this problem, the Tor project sug- gested using bridges — unlisted relays used as entry points to the Tor network. Since bridges are not publicly listed, the censoring authority cannot easily discover their IP ad- dresses. Although bridges are more resilient to censorship, McLachlan and Hopper [36] showed that it is still possible to identify them, as they accept incoming connections un- conditionally. To solve this problem, BridgeSPA [43] places some restrictions on how bridges should accept incoming connections. As the censorship techniques improve, however, more so- phisticated methods are being deployed to discover and block bridges. There have been reports of probes performed by hosts located in China, aimed quite directly at Tor bridges [46, 47]. The investigation revealed that after a Tor client within China connected to a US-based bridge, the same bridge re- ceived a series of Tor connection initiation messages from dif- ferent hosts within China and after a while the client’s con- nection to the bridge was lost. We have recently witnessed state-level SSL blocking [40] and blocking of Tor connec- tions in Iran based on the expiry time of the SSL certificate 1

Upload: soul420

Post on 26-Oct-2014

152 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Skype Morph

SkypeMorph: Protocol Obfuscation for Tor Bridges

Hooman Mohajeri Moghaddam Baiyu LiMohammad Derakhshani Ian Goldberg

Cheriton School of Computer ScienceUniversity of WaterlooWaterloo, ON, Canada

{hmohajer,b5li,mderakhs,iang}@cs.uwaterloo.ca

ABSTRACTThe Tor network is designed to provide users with low-latency anonymous communications. Tor clients build cir-cuits with publicly listed relays to anonymously reach theirdestinations. However, since the relays are publicly listed,they can be easily blocked by censoring adversaries. Con-sequently, the Tor project envisioned the possibility of un-listed entry points to the Tor network, commonly known asbridges. We address the issue of preventing censors fromdetecting the bridges by observing the communications be-tween them and nodes in their network.

We propose a model in which the client obfuscates its mes-sages to the bridge in a widely used protocol over the Inter-net. We investigate using Skype video calls as our targetprotocol and our goal is to make it difficult for the censor-ing adversary to distinguish between the obfuscated bridgeconnections and actual Skype calls using statistical compar-isons.

We have implemented our model as a proof-of-conceptpluggable transport for Tor, which is available under anopen-source licence. Using this implementation we observedthe obfuscated bridge communications and compared it withthose of Skype calls and presented the results.

KeywordsTor, bridges, Skype, pluggable transport, steganography,protocol obfuscation, censorship circumvention

1. INTRODUCTIONTor [21] is a low-latency anonymous communication over-

lay network. In order to use Tor, clients contact publiclyknown directory servers, a fraction of the Tor network re-sponsible for tracking the topology of the network and nodestates. Directory servers allow clients to obtain a list ofvolunteer-operated relay nodes (also known as“onion routers”).The client then chooses some of these relays using the Torsoftware and establishes a circuit through these nodes to itsdesired destination. Clients’ traffic is then routed throughthe Tor network over their circuits, hiding users’ identitiesand activities.

The Tor network not only provides anonymity, but alsocensorship resistance. To access a website censored in auser’s home country, the user simply connects to the Tornetwork and requests the blocked content to be delivered tohim. However, since a list of Tor relays can be retrievedfrom publicly known directory servers, blocking all Tor con-nections can be simply done by blocking access to all Tor

Figure 1: This graph from metrics.torproject.org shows thepopulation of returning users accessing the Tor networkwithout using bridges from Oct 2009 to March 2010, fromChina. It shows that, after Jan 2010, the Tor network is notaccessible from China without using bridges. [32]

relays based on their IP addresses. There have been manyattempts to block access to Tor by regional and state-levelISPs. For instance, Figure 1 shows the blocking of the wholeTor network by the Great Firewall of China as of January2010.

In order to counteract this problem, the Tor project sug-gested using bridges — unlisted relays used as entry pointsto the Tor network. Since bridges are not publicly listed,the censoring authority cannot easily discover their IP ad-dresses. Although bridges are more resilient to censorship,McLachlan and Hopper [36] showed that it is still possibleto identify them, as they accept incoming connections un-conditionally. To solve this problem, BridgeSPA [43] placessome restrictions on how bridges should accept incomingconnections.

As the censorship techniques improve, however, more so-phisticated methods are being deployed to discover and blockbridges. There have been reports of probes performed byhosts located in China, aimed quite directly at Tor bridges [46,47]. The investigation revealed that after a Tor client withinChina connected to a US-based bridge, the same bridge re-ceived a series of Tor connection initiation messages from dif-ferent hosts within China and after a while the client’s con-nection to the bridge was lost. We have recently witnessedstate-level SSL blocking [40] and blocking of Tor connec-tions in Iran based on the expiry time of the SSL certificate

1

Page 2: Skype Morph

generated by the Tor software [20].As the censorship arms race is shifting toward the char-

acteristics of connections, Appelbaum and Mathewson pro-posed a framework for developing protocol-level obfuscationplugins for Tor called pluggable transports [11]. These trans-ports appear to the Tor client to be SOCKS proxies; anydata that the Tor client would ordinarily send to a bridgeis sent to the pluggable transport SOCKS proxy instead,which delivers it to the bridge in an obfuscated way. Devel-opers can use this framework to build their own transportsthat can hide Tor traffic in other protocols. On one side, thetransport obfuscates Tor messages in a different form of traf-fic, e.g., HTTP, and on the other side it translates the HTTPtraffic back into Tor traffic. Pluggable transports provide aneasy way to resist client-to-bridge censorship and the ulti-mate goal is that censoring ISPs that are inspecting packetsbased on their characteristics will be unable to discover theTor traffic obfuscated by a transport.

So far the only available pluggable transport is“obfsproxy”[34]which passes all traffic through a stream cipher. We extendthis previous work to address its limitation of not outputtinginnocuous-looking traffic; our method greatly reduces thechances of obfuscated bridge connections being detected bypowerful censors. We also note that simply adding the tar-get protocol headers to packets would not be successful whenfacing deep packet inspection (DPI) methods [31]. Dusi etal. [22] also suggested that obfuscating inside an encryptedtunnel might not be enough to withstand statistical classi-fiers since some features of encrypted tunnels such as packetsizes and inter-arrival times of packets can still be distin-guished.

For the purpose of our experiment we chose the Skype [6]protocol as our target communication for several reasons.First, Skype enables users to make free, unlimited and en-crypted voice and video calls over the Internet which hasled to its huge popularity [3] and therefore the amount ofSkype traffic in today’s Internet is relatively high. Second,Skype video calls transfer a reasonable amount of data ina short period of time, making it a desirable form of targettraffic since it will not introduce too much of a bottleneckto the Tor connection. Third, Skype communications are allencrypted [2], so it provides an encrypted channel for theTor traffic.

1.1 Our ContributionsWe explore methods for Tor protocol obfuscation and in-

troduce SkypeMorph, a system designed to encapsulate Tortraffic into a connection that resembles Skype video traffic.We provide the following contributions:

• Tor traffic obfuscation: SkypeMorph makes the com-munication between the bridge and the client appearto be a Skype video call, which is our target proto-col. Protocol obfuscation is greatly needed when fac-ing large-scale censorship mechanisms, such as deeppacket inspection.

• Innocuous-looking traffic: A client who wishes toaccess a SkypeMorph bridge runs our software along-side his usual Tor client and instructs his Tor clientto use the SkypeMorph software as a transport. Uponstartup, SkypeMorph first attempts a Skype login pro-cess and then establishes a Skype call to the intendeddestination; i.e., the bridge. Once the bridge receives

the call, the client innocuously drops the call and usesthe channel to send the obfuscated Tor messages. Wegive comparisons between the output of SkypeMorphand actual video calls of Skype and we conclude thatfor the censoring adversary it would be difficult to dif-ferentiate between the two. Consequently, a censorwould be required to block a great portion of legit-imate connections in order to prevent access to theobfuscated Tor messages.

• UDP-based implementation: Since Skype mainlyuses UDP as the transport protocol, we also use UDP.The choice of UDP as the transport protocol will alsobe useful when Tor datagram designs [37] are rolledout.

• Improved traffic shaping: Traffic Morphing as pro-posed by Wright et al. [49] is based on the premise ofefficiently morphing one class of traffic into another.However, the authors neglected one key element of en-crypted channels, namely the inter-packet delay be-tween consecutive packets, from their design scope.SkypeMorph extends the previous work to fully repro-duce the characteristics of Skype calls.

• Comparison between traffic shaping methods:We compare different modes of implementing trafficshaping and describe how each of them performs interms of network overhead. In particular, we exploretwo methods, namely naıve traffic shaping and ourenhanced version of Traffic Morphing, and comparethem.

• Proof-of-concept implementation: An open-sourceproof-of-concept implementation of SkypeMorph is avail-able online at:

http://crysp.uwaterloo.ca/software/

The outline of the remainder of the paper is as follows.In Section 2 we discuss related work and in Section 3 weformalize our threat model and design goals. Section 4 cov-ers some background and we present our architecture andimplementation in Sections 5 and 6. We present our resultsin Section 7, discuss possible future work in Section 8, andconclude in Section 9.

2. RELATED WORK

2.1 Information Hiding and SteganographyHiding information within subliminal channels has been

studied extensively in the last three decades. Simmons [42]stated the problem for the first time and proposed a solutionbased on digital signatures. Currently the topic is studiedunder the term steganography or the art of concealed writing,and it has recently attracted a lot of attention in digitalcommunications [39].

Employing a steganographic technique, one needs to con-sider two major factors: the security and efficiency of themethod. Hopper et al. [24] proposed a construction thatcan formally be proven to be secure and can be applied toa broad range of channels. The OneBlock stegosystem de-scribed in their work, however, needs an expected number ofsamples from the channel that is exponential in the numberof bits transmitted.

2

Page 3: Skype Morph

2.2 Image SteganographyHiding information within pictures is a classic form of

steganography. Least Significant Bit (LSB) based imagesteganographic techniques [26] — using a small fraction ofthe information in each pixel in a cover image to send theactual data — is a common method and Chandramouli etal. [18] showed the upper bounds for the capacity of suchchannels. There are also some existing open-source stegano-graphic tools available, such as outguess [4] and steghide [8].Collage [17] is a recently developed system that uses user-

generated content on social-networking or image-sharing web-sites such as Facebook and Flickr to embed hidden messagesinto cover traffic, making it difficult for a censor to blockthe contents. Collage has a layered architecture: a “mes-sage vector layer” for hiding content in cover traffic (usingoutguess internally as their steganographic tool) and a“ren-dezvous mechanism” for signalling. The authors claim theoverhead imposed by Collage is reasonable for sending smallmessages, such as Web browsing and sending email.

2.3 Voice over IP and Video StreamingWright et al. [50] studied the effectiveness of security mech-

anisms currently applied to VoIP systems. They were ableto identify the spoken language in encrypted VoIP callswhich were encoded using bandwidth-saving Variable BitRate (VBR) coders. They did so by building a classifierwhich could achieve a high accuracy in detecting the spokenlanguage when a length-preserving encryption scheme wasused and they concluded that the lengths of messages leaka lot of information. Further experiments showed that it ispossible to uncover frequently used phrases [48] or unmaskparts of the conversation [45], when the same encryptionmethod is employed for confidentiality.

The statistical properties these attacks exploit are lessprevalent in streaming video data, rather than the audiodata they consider; therefore, those algorithms should notbe able to distinguish our Tor traffic disguised as Skype videotraffic from real Skype video traffic. Nonetheless, we con-sider the question of matching SkypeMorph traffic to thehigher-order statistics of Skype video traffic to fully resem-ble Skype communication to a censor.

Previous work has shown some success in identifying whethera target video is being watched, using information leakage ofelectromagnetic interference (EMI) signatures in electronicdevices [23] or revealing which videos in a database are beingviewed in a household by throughput analysis [41]. However,those methods require the purported video to be selectedfrom a set known in advance. SkypeMorph, on the otherhand, attempts to disguise its traffic as a real-time videochat, which would not be in such as set.

VoIP services have also been used for message hiding. Forexample, Traffic Morphing [49] exploits the packet size dis-tribution of VoIP conversations to transmit hidden messages(we will return to this method in section 4). Another exam-ple of steganographic communications over voice channelsis TranSteg [35], in which the authors try to re-encode thevoice stream in a call with a different codec, resulting insmaller payload size. Therefore, the remaining free spacecan be used for sending the hidden messages. The short-coming of this method is that the most of the bandwidthis allocated to the actual voice conversation, leaving only alimited space for steganograms.

2.4 Steganography over Encrypted ChannelsAlthough steganographic models similar to those men-

tioned above are powerful, they impose relatively large over-heads on our channel. Therefore, we used a combination ofmethods suggested for encrypted communications [22, 49].We argue that on an encrypted communication such as thoseof Skype calls, every message appears to be random (sincewe expect the encryption scheme to output a randomly dis-tributed bit string), thus exploiting the channel history isnot required for cover traffic and we can perform signifi-cantly better than the OneBlock stegosystem, Collage, orTranSteg. The only important characteristics of encryptedchannels, as suggested by previous works, are packet sizesand inter-arrival times of consecutive packets [14, 33, 22].Hence, a protocol obfuscation layer only needs to reproducethese features for an encrypted channel.

3. THREAT MODEL AND DESIGN GOALSWe assume that the user is trying to access the Internet

through Tor, while his activities are being monitored by astate-level ISP or authority, namely “the censor”, who cancapture, block or alter the user’s communications based onpre-defined rules and heuristics. Therefore, we consider ad-versarial models similar to anti-censorship solutions such asTelex [51], Cirripede [25] and Decoy Routing [28]. In par-ticular, the censor is able to block access to Tor’s publiclylisted routers, and to detect certain patterns in Tor’s trafficand hence block them [20]. This is also true for other ser-vices or protocols for which the censoring authority is able toobtain the specification. Examples include protocols in thepublic domain, e.g., HTTP, and services whose provider canbe forced or willing to reveal their implementation details.

However, we assume that the censoring authority is notwilling to block the Internet entirely, nor a large fractionof the Internet traffic. The censoring authority is also un-willing to completely block popular services, such as VoIPprotocols. Thus, the filtering is based on a “black list” ofrestricted domains and IP addresses, accompanied by a listof behavioural heuristics that may suggest a user’s attemptto circumvent censorship; for example, a TCP SYN packetfollowing a UDP packet to the same host may indicate aspecial type of proxy using port knocking [29]. Bissias etal. showed how such heuristics can be employed to detectcertain traffic patterns in an encrypted channel [14].

This assumption is realistic since usually the cost of over-blocking is not negligible, so if the censor used a small“whitelist” of allowed content and hosts, then every new website orhost on the Internet would need to sign up with the censorin order to be accessible by nodes within its control. This isa quite cumbersome task and seems unreasonable.

Also, we assume encrypted communications, including Skypecalls, are not blocked unless the censor has evidence that theuser is trying to evade the censorship. Moreover, we assumethat the censor does not have access to information aboutparticular bridges, including their IP addresses and SkypeIDs; otherwise it can readily block the bridge based on thisinformation. (We will discuss in Section 8 how a bridgeusing SkypeMorph can easily change its IP address if it isdetected by the censor.) SkypeMorph users, however, canobtain this information from out-of-band channels, includ-ing email, word-of-mouth, or social networking websites. Wealso note that it is possible to have multiple Skype calls use

3

Page 4: Skype Morph

a single IP address but with different ports, to allow usersbehind NAT to make simultaneous calls using a shared IPaddress.

In our model, we are trying to facilitate connections tobridges outside the jurisdiction of the censor where it hasno control over the network nodes. However, the censorcan set up its own SkypeMorph bridges and distribute theirinformation.

In general, SkypeMorph aims to build a layer of protocolobfuscation for Tor bridges,1 with the following goals:

• Hard to identify: SkypeMorph outputs encryptedtraffic that resembles Skype video calls. The details ofhow we try to minimize the chances of being detectedby the censor are discussed in Section 4.

• Hard to block: Since the outputs of SkypeMorphgreatly resemble Skype video calls, in order to blockSkypeMorph, the censor would need to block Skypecalls altogether, which we assume it is unwilling to do.

• Plausible deniability: The only way to prove thata node is actually using SkypeMorph software is tobreak into a user’s machine or to coerce him to divulgehis information. Otherwise, communicating throughSkypeMorph should look like a normal Skype videochat.

4. BACKGROUND

4.1 SkypeSkype [6] is a proprietary “voice over IP” (VoIP) service

that provides voice and video communications, file transfer,and chat services. With millions of users and billions of min-utes of voice and video conversations, Skype is undoubtedlyone of the most popular VoIP services available [3].

Protection mechanisms and code obfuscation techniquesused in the Skype software have made it difficult to learnabout its internals, so there is no open-source variant of theSkype application. However, there have been attempts to re-verse engineer and analyze the application [12, 13]. The find-ings from these attempts and our own experiments, along-side some insights from the Skype developers have estab-lished the following facts:

• Skype encrypts messages using the AES cipher anduses RSA-based certificates for authentication [2, 13].Also our experiments showed that Skype utilizes someform of message authentication and would not acceptaltered messages. Thus an eavesdropper is neither ableto access the content of a packet nor can he alter themin such a way that is not detectable. All that is possi-ble to such an attacker is selective packet dropping ordenial of service.

• There are three types of nodes in the Skype network:server nodes, which handle users’ authentication whenthey sign in, normal nodes, which can be seen as peersin the P2P network, and supernodes, which are thosepeers with higher bandwidth; supernodes can facili-tate indirect communication of peers behind firewallsor NAT [13, 15, 44].

1Note that our technique can be applied to Tor’s public ORs,but since blocking public ORs based on their IP address isa trivial task, we choose to present it for bridges.

• Skype calls are operated in a peer-to-peer architectureand users connect directly to each other during a call,unless some of the participants cannot establish directconnections. This is where supernodes come into thepicture to facilitate the call.

• In our experiments with Skype we noticed that when aSkype call takes place there are some TCP connectionswhich are mainly used for signalling. These TCP con-nections remained open even after the call is dropped.The Skype client listens to a customizable UDP portfor incoming data, but when UDP communication isnot possible, it falls back to TCP [12, 13].

• Skype has a variety of voice and video codecs and se-lects among them according to bandwidth, networkspeed and several other factors [15, 16].

The facts that Skype traffic is encrypted and very popularmakes it a good candidate for the underlying target trafficfor our purpose. We will explore this more in Section 5.

The choice of Skype video, as opposed to voice, calls as thetarget protocol in SkypeMorph is motivated by the fact thatin voice calls, usually at any time only one party is speakingand thus we would need to consider this “half-duplex” effectin our output stream. However, this is not the case in videocalls since both parties send almost the same amount ofdata at any given time during a video conversation, makingthe implementation of SkypeMorph easier, and not requiringthe client or bridge to withhold data until it is its turn to“speak”.

4.1.1 Bandwidth ControlNetwork congestion control and bandwidth throttling is

a major concern in online applications. In order to be ableto accommodate for changes in the network status, a trafficcontrol mechanism is essential and Skype uses a congestiondetection mechanism to back off whenever it is no longer pos-sible to communicate at the current rate. Skype voice callswere shown to have a few number of possible bitrates [16],however, as shown in Figure 2, Skype video calls seem toenjoy much more flexibility in terms of bandwidth usage2.Figure 2a suggests that by limiting the available bandwidth,Skype’s bandwidth usage drops significantly at each step toa level far below the available rate (this phenomena is morenoticeable in higher rates) and then it builds up again toachieve the maximum rate possible. Also Skype is able todetect whether it is possible to send at a higher rate, asdepicted in Figure 2b. Therefore, by a similar rate lim-iting technique, SkypeMorph is able to transfer data at areasonable rate which at the same time complies with thebandwidth specified by the bridge operator.

4.2 Naïve Traffic ShapingTo achieve a similar statistical distribution of packet sizes

in the output of our system to that of target process, abasic approach would be to simply draw samples from thepacket size distribution of the target process and send theresulting size on the wire. Thus, if there is not enough dataavailable from Tor, we have to send packets without usefulinformation and this imposes some additional overhead on

2The degree of flexibility in bandwidth usage of Skype videocalls is due to availability of different frame rates and videocodecs.

4

Page 5: Skype Morph

0 500 1000 1500 2000 2500 3000 3500 40000

2

4

6

8

10

12

14

16

18x 10

4

Time (Seconds)

Bandw

idth

(B

yte

s/S

econd)

Bandwidth Limit

Skype Bandwidth Usage

(a) Decreasing bandwidth available to Skype

0 500 1000 1500 2000 2500 3000 3500 40000

2

4

6

8

10

12

14

16

18x 10

4

Time (Seconds)

Bandw

idth

(B

yte

s/S

econd)

Bandwidth Limit

Skype Bandwidth Usage

(b) Increasing bandwidth available to Skype

Figure 2: Bandwidth usage by Skype under different net-work situations. Figure 2a shows the drops in the Skypetransfer rate while decreasing the network bandwidth andFigure 2b shows the increase in the rate.

the network. An alternate approach is to consider the in-coming packet sizes from the source distribution, which ishow Traffic Morphing deals with the problem.

4.3 Traffic MorphingWe briefly mention how the original Traffic Morphing [49]

method works. Traffic Morphing attempts to counter an ad-versary who is trying to distinguish traffic from the sourceprocess from that of the target process through statisticalmeans. As previously discussed, the only statistical tracesthat the attacker might be able to collect from encryptedtraffic are packet sizes and timing attributes. Traffic Mor-phing aims at obfuscating the packet size distribution byassuming that probability distributions of the source anddestination processes are available.

Assume that the probability distribution of the sourceprocess is denoted by the vector X = [x1, . . . , xn]T , wherexi is the probability of the ith packet size. Similarly Y =[y1, . . . , yn]T denotes the target process packet size distribu-tion. Traffic Morphing finds matrix A for which we haveY = AX such that the number of additional bytes neededto be transmitted is minimal. Using this technique requiressome considerations, for example dealing with larger samplespaces or overspecified constraints, which are discussed inthe original paper.

Even though the underlying premise of Traffic Morphingis that if the source process generates a sufficiently largenumber of packets, the output of the morphing will converge

in distribution to that of the target, it only considers packetsizes in the encrypted traffic. We extend this technique,below, by introducing inter-packet timing to it as well.

4.4 Higher-Order StatisticsAlthough reproducing Skype packet size and inter-packet

delay distributions is a step towards defeating censoring fire-walls, DPI tools can take advantage of higher-order statisticsin our encrypted channel to distinguish it from a Skype videocall. We observed that there are second and third orderstatistics, discussed in the appendix, in the Skype traces. Weensure that SkypeMorph respects those higher-order statis-tics in the packets and timings it outputs. An alternative forpreserving all the characteristics of the Skype video call is touse the output of the audio and video encoder shipped withthe Skype software to generate the statistics. If a live videosource is available, we can run the encoder on it while theTor bridge connection is in progress, and use the resultingpacket sizes and inter-packet delays output by the encoderdirectly (replacing the encrypted packet contents with ourown encrypted Tor data, of course). In this way, we can en-sure that our traffic accurately mimics that of a real Skypevideo call.

5. SKYPEMORPH ARCHITECTURESkype, like any other instant messaging or voice and video

calling/conferencing application, performs an authenticationstep before it allows a user to join the network. A user needsto sign up with the Skype website and obtain a usernameand password for authentication. The user then inputs thesecredentials to the Skype software to use them in the authen-tication process. After the user authenticates himself to thenetwork, he is able to make calls or send messages. Due tothe proprietary nature of the Skype protocol, it is unclearhow the login process is initiated and proceeds. The sameis true for the call setup phase. Therefore, we have usedthe runtime executable and collection of APIs provided bySkype in a package called SkypeKit [7]. The SkypeKit APIsenable programmers to log in to the Skype network and havealmost the same functionality as the Skype application, in-cluding making voice and video calls and sending files andtext messages. It is therefore highly suitable for our causeand allows us to perform the login and call initiation pro-cesses. The basic setup is discussed next and details of ourimplementation will appear in Section 6.

5.1 Setup

• Step 1: The bridge, which we denote by S, selectsa UDP port number, PS and uses the SkypeKit APIto log in to the Skype network, with a predefined setof credentials.3 After successfully logging in to Skype,the bridge will listen for incoming calls. The bridgemakes its Skype ID available to clients in much thesame way that bridges today make their IP addresses

3Skype allows multiple logins, so it might seem reasonableto share the same username and password for every bridge.However, in that case all the messages sent to a certainSkype ID will be received by all the bridges currently loggedin with that ID, which is an undesirable setting. We there-fore require that every bridge has its own exclusive creden-tials, which are made available to the SkypeMorph bridgesoftware on startup.

5

Page 6: Skype Morph

and port numbers available — using Tor’s BridgeDB [9]service, for example.

• Step 2: The client, denoted by C, picks a UDP portnumber, PC and uses the same method to log in to theSkype network, using its own credentials.

• Step 3: The client generates a public key PKC . Afterthat, it checks to see whether the bridge is online andsends a Skype text message of the form PKC : IPC :PC , to the bridge, where IPC is the IP address of theclient.

• Step 4: Upon receiving the message from the client,the bridge generates a public key PKS and sends thefollowing text message PKS : IPS : PS back to theclient.

• Step 5: The bridge and client each compute a sharedsecret using the public keys they obtained and theclient sends a hash of resulting key to the bridge.

• Step 6: The bridge then checks the received hash andif it matches the hash of its own secret key, it sends amessage containing “OKAY”.

• Step 7: If step 6 is successful and the client receivesOKAY, it initiates a Skype video call to the bridge. Oth-erwise, it falls back to step 3 after a timeout.

• Step 8: The client keeps ringing for some randomamount time, then drops the call.

• Step 9: When the bridge notices the call is droppedit listens for incoming SkypeMorph messages on portPS . The bridge selects another UDP port for the Skyperuntime for it to listen for other incoming connections.

• Step 10: Afterwards, the client uses the shared keyand the UDP port obtained in previous steps to senddata.

Note that having the bridge switch to a different UDP portfor the next client connection should not arouse suspicionsince, as discussed in Section 3, this is how normal Skypecalls to multiple users behind NAT would appear to thecensor.

5.2 Traffic ShapingTor sends all of its traffic over TLS. We do not change this;

rather, we just treat the TLS data as opaque, and send theTLS data over our own encrypted channel, masquerading itas Skype video. This means that the data is encrypted byTor (multiple times), by TLS, and also by SkypeMorph.

After the above connection setup, it is possible to send there-encrypted TLS messages through the established channel.As discussed in previous sections, in order to maximize theresemblance to real Skype traffic, we modify the output ofour application to closely match that of Skype. The mod-ification is done on the packet sizes and inter-arrival timesof consecutive packets. For the packet sizes, two scenar-ios are considered: In the first scenario, we obtain our re-sulting packet sizes using the naıve traffic shaping methoddiscussed in Section 4.2. We use the higher-order statisticsmentioned above to produce joint probability distributionsfor the next inter-packet delay and packet size, given the

SkypeMorphSkypeMorph

Tor

Traffi

c

Tor Tra

ffic

Skype Video Traffic

Figure 3: Overview of the SkypeMorph Architecture. Thehistograms show the distribution of packet sizes in Tor (atthe bottom) and Skype video (at the top).

values outputted previously. We sample from this condi-tional distribution to produce the delay and size of the nextpacket.

For the second scenario, the Traffic Morphing method ofSection 4.3 is used; however, this method only supports first-order statistics.

The overview of the SkypeMorph architecture is shownin Figure 3. The red arrows represent the Tor traffic. Onone side the Tor traffic is passed to SkypeMorph, where thetraffic shaping mechanisms morph the traffic to resemble aSkype call on the wire. On the other side the Tor traffic isreconstructed.

6. IMPLEMENTATIONIn this section we describe our prototype implementation

of SkypeMorph on Linux, which is implemented in C andC++ with the boost libraries [1]. The prototype is built intotwo executable files called smclient and smserver, whichrealize the client C and the bridge S of the previous section,respectively. We describe in the following the two phases inthe execution of our prototype, namely, the setup and thetraffic-shaping phases.

6.1 Setup PhaseAs in the previous section, both the client and the bridge

log into Skype using the SkypeKit API, exchange publickeys, and then start their Skype video conversation. As wementioned in Section 4, there are various TCP connectionsaccompanied with the Skype video conversation, which stayactive even after the conversation is finished. In order to re-tain these TCP connections, our prototype implementationperforms the following tricks:

• A TCP transparent proxy component is built into ourprototype, which transparently relays all TCP connec-tions the Skype runtime outputs and receives. We takeadvantage of the TPROXY extension in iptables, whichis available in the Linux kernel since version 2.6.28.Corresponding iptables rules are created when ourprototype starts execution.

• As a consequence of retaining the Skype TCP con-nections, the Skype runtime has to stay active during

6

Page 7: Skype Morph

the SkypeMorph session. Hence the UDP port PC orPS will still be assigned to the Skype runtime whenSkypeMorph starts to tunnel Tor traffic; therefore,SkypeMorph has to operate on another UDP port. Toovercome this, the Skype runtime on the client op-erates on another UDP port, say P ′

C , and smclient

creates an iptables rule RC with SNAT target to al-ter the source port of all Skype UDP packets from P ′

C

to PC . When smclient starts to tunnel Tor traffic,it first deletes rule RC , and it then operates on UDPport PC . On the bridge side, the Skype runtime oper-ates on UDP port PS , and smserver runs on anotherUDP port P ′

S . When smserver starts to communi-cate with smclient, it first creates an iptables ruleRS that redirects traffic towards UDP port PS to P ′

S ;it then runs on P ′

S . Note that SkypeMorph starts itstunneling task only when the Skype video call is fin-ished; thus the iptables rules RC and RS affect onlythe client Skype runtime and the bridge SkypeMorph.This prevents the censor from noticing port changesbetween the genuine Skype video call traffic and theSkypeMorph traffic.

For the cryptographic features, we use the curve25519-

donna [30] library to generate elliptic-curve Diffie-Hellmankeys shared between smclient and smserver. Each Skype-Morph instance derives four keys: two for outgoing and in-coming message encryption, and another two keys for out-going and incoming message authentication purposes.

6.2 Traffic-shaping PhaseIn the traffic-shaping phase, an smclient and smserver

pair can be viewed together as a SOCKS proxy that relaysstreams between a Tor client and a Tor bridge. Between sm-

client and smserver, bytes in Tor streams are exchanged insegments by a simple reliable transmission mechanism overencrypted UDP communication, and they are identified bysequence numbers. Reliable transmission is supported by ac-knowledgments over sequence numbers. The cryptographyfunctions are provided by CyaSSL [19], a lightweight SSLlibrary also used in SkypeKit. We give more details below.

SkypeMorph UDP Packet LayoutFirst, we present the layout of the SkypeMorph UDP

packets transmitted between smclient and smserver. Weset the maximum size of a single packet to be 1460 bytes toavoid packet fragmentation, because 1500 bytes, includingthe IP header, is a common MTU over the Internet. Thus,besides a fixed 8-byte UDP header, each packet containsup to 1452 bytes of SkypeMorph data. The first 8 bytesare an HMAC-SHA256-64 message authentication code forthe remaining bytes in the packet. We use the 256-bit AEScounter mode stream cipher algorithm to encrypt the restof the packet, which, prior to encryption, is formatted intofive fields:

• type: This 1-byte field denotes the purpose of thepacket. Currently there are two types: regular dataand a termination message; the latter is used to in-form the packet receiver to terminate the communica-tion session.

• len: This is a 16-bit unsigned integer denoting the sizeof the contained Tor stream segment. This allows thepacket receiver to discard the padding data.

• seq: This field contains the sequence number, a 32-bitunsigned integer, of the first byte of the contained Torstream segment.

• ack: This field contains the ack number, a 32-bit un-signed integer used to identify those bytes that havebeen properly received.

• msg: This field is of length up to 1425 bytes and con-tains a Tor stream segment (of length len, above) andthe padding data (taking up the rest of the packet).

We output the MAC, followed by the random AES initialcounter, and then the encrypted payload, as seen in Figure 4.

Figure 4: SkypeMorph UDP packet body layout, where thesize (in bytes) is under the name for each field. The shadedparts are encrypted using 256-bit AES counter mode. Allbytes after the mac field are included in the HMAC-SHA256-64 computation.

Traffic Shaping OracleWe next discuss the traffic shaping oracle component,

which controls the sizes and timings of each successive UDPpacket to be sent. We implement both the naıve and theTraffic Morphing methods in the oracle to compare them.The goal of the oracle is to provide traffic shaping parame-ters.

When using the naıve method, the oracle first reads thenth-order distributions of the packet sizes and inter-packetdelays of Skype traffic, as noted in the appendix. We cur-rently have gathered data for up to n = 3, but nothing inprinciple prevents us from gathering more. For each query,the oracle remembers the last n answers x1, . . . , xn, wherexn is the last packet size output, and the xi alternate be-tween packet sizes and inter-packet delay times. It thenselects the nth-order distribution X of inter-packet delays,conditioned on the values of x1, . . . , xn, and randomly drawsan inter-packet delay te from X. Next the size of the packetse is outputted similarly from the distribution X ′ of sizes,where X ′ depends on x2, . . . , xn, se. The oracle responds tothe query with the pair (se, te).

For the traffic morphing method, the oracle first readsdistributions of the packet sizes of the Tor traffic, the inter-packet delays of the Skype traffic, and a pre-computed mor-phing matrix. We use the morpher library from the Mor-pher Project [27] to compute the expected packet size se.The Traffic Morphing morpher library does not take timingsinto account. It expects to receive a packet input to it, andto send out that packet immediately, possibly padded to anew size to emulate the target distribution. As such, thepacket timing distribution of the output of Traffic Morph-ing is identical to its input distribution, which is not whatwe want. We need to decouple the arriving packets from thesent packets, so arriving data is placed into a buffer, and weadopt the technique of the naıve method to sample from thepacket timing distribution to yield te. The oracle randomlyselects a packet size so from the Tor traffic packet size distri-bution, and calls the morpher library to compute the output

7

Page 8: Skype Morph

packet size se. The pair (se, te) is then the answer to thequery.

PacketizerThe communication between smclient and smserver is

handled by a packetizer, whose structure is shown in Fig-ure 5. The purpose of the packetizer is to relay Tor streamswith UDP packets such that the traffic exposed to the censoris indistinguishable from that of Skype video calls.

Figure 5: Structure of the packetizer.

The data stream received from Tor over the pluggabletransport SOCKS connection is first buffered in a sendingbuffer, and then retrieved in segments corresponding to thesizes produced by the traffic shaping oracle. On the otherend, the received Tor stream segments are rearranged inorder in a receiving buffer according to their sequence num-bers, and then form the incoming Tor stream.

The packetizer creates two threads, t_send and t_recv,such that:

• Thread t_send first queries the oracle for the expectedpacket size se and delay te. It then checks the sendingbuffer to determine if any re-transmission is needed,and it locates and reads up to se bytes from the send-ing buffer. Currently re-transmission is triggered whenthree duplicated ack numbers are received, which is anapproach found in most TCP implementations. Thenan encrypted UDP packet of size se is created with anynecessary random padding bytes. Next, t_send sleepsfor time te and then sends the packet out.

• Thread t_recv is blocked until a new UDP packet is re-ceived. It decrypts the packet to get a Tor stream seg-ment, which is then pushed into the receiving buffer.The receiving buffer returns the sequence number seqr

of the last byte that is ready to be committed to theTCP stream. Similar to TCP, seqr+1 is used as the acknumber to be sent in the next outgoing UDP packet.Any in-order segments that have been received are de-livered to Tor over the pluggable transport SOCKSconnection and are removed from the receiving buffer.

As we observed from Skype video call traffic, when thenetwork bandwidth is limited, the distributions of packetsizes and inter-packet delays change accordingly. To mimicthis behaviour, our prototype first determines bandwidth

changes by measuring the number r of re-transmissions oc-curring per second. Based on r, the oracle selects the mostrelevant distributions and computes the traffic shaping pa-rameters. The dependence on r can be tuned through ex-periments to match the most relavent distributions.

7. EXPERIMENTSWe performed our experiment in two parts. First we cap-

tured network traces of the Skype application to form abetter understanding of how it operates. Our experimen-tal testbed for this part consisted of several hosts runningdifferent operating systems, including Microsoft Windows,Linux and mobile devices. Using these traces we were ableto obtain an empirical distribution of packet sizes and inter-packet arrival times for Skype video calls, which were usedas input to the SkypeMorph traffic-shaping oracle for draw-ing samples and generating the morphing matrix. Next, weused the SkypeMorph proxy for browsing and downloadingover a bridge set up on a node in the same local network asthe Tor client.

Figure 6 shows the cumulative distributions of packet sizesand inter-arrival delays between consecutive packets for Skype-Morph (both with the naıve and enhanced Traffic Morph-ing traffic shapers) versus the original Tor distribution andthe Skype video call distribution we obtained in the firstpart of the experiment. The graphs depict how closely theSkypeMorph output follows that of the Skype video calls,both in packet sizes and inter-packet delays; indeed, allthree lines overlap almost perfectly. Also, the Kolmogorov-Smirnov test [38] for both the packet sizes and inter-packetdelays shows no statistically significant difference betweenthe Skype video and SkypeMorph distributions, using ei-ther the naıve traffic-shaping or the traffic morphing meth-ods. This is of course expected, as the traffic shaping oracleis designed to match the Skype distribution. In addition,using the naıve traffic-shaping method, we also match thehigher-order Skype traffic distributions. The distribution ofregular Tor traffic, however, is considerably different, andthis shows the utility of our method. The original TrafficMorphing [49] technique does not take into account timings,so its distributions would match Skype video for the packetsizes, but regular Tor traffic for the inter-packet timings.

In order to evaluate the performance of SkypeMorph, wetried downloading the g++ Debian package from a mirror lo-cated in South America4 by directly connecting to a bridgeoperated by us at a rate of 40–50 KB/s. Using the samebridge, we downloaded the file over SkypeMorph, using eachof the naıve traffic shaping and enhanced Traffic Morph-ing methods, and compared the average download speed,network bandwidth used, and overhead percentage. We re-peated the experiment 25 times for each method. The resultsare given in Table 1.

The overhead given is the percentage that the total net-work bandwidth (including TCP/IP or UDP/IP headers,retransmissions, padding, TLS, etc.) exceeds the size of thefile downloaded. Although the very high variance makes ithard to see just by comparing the summary statistics in thetable, the raw data shows that normal bridge traffic con-sistently incurs a 12% overhead, due to overheads incurredby Tor, TLS, and TCP/IP. The overhead of SkypeMorph

4http://ftp.br.debian.org/debian/pool/main/g/gcc-4.4/g++-4.4_4.4.5-8_i386.deb

8

Page 9: Skype Morph

0 200 400 600 800 1000 1200 14000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Em

piric

al C

DF

Packet size (bytes)

Skype Video Calls

SkypeMorph (Naive Method)

SkypeMorph (Traffic Morphing)

Tor Traffic

(a) Packet size distribution

0 0.005 0.01 0.015 0.02 0.0250

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Em

piric

al C

DF

Inter−packet delay (seconds)

Skype Video Calls

SkypeMorph (Naive Method)

SkypeMorph (Traffic Morphing)

Tor Traffic

(b) Inter-packet delays distribution

Figure 6: Experimental cumulative distributions for Skype video, SkypeMorph, and Tor. We show packet size and inter-packetdelay distributions in (a) and (b) respectively.

Table 1: Download speed (goodput), network bandwidth used, and overhead imposed by (a) Normal Tor-over-TCP, (b)SkypeMorph-over-UDP with naıve traffic shaping, and (c) SkypeMorph-over-UDP with our enhanced Traffic Morphing.

Normal SkypeMorph SkypeMorphbridge (Naıve Shaping) (Traffic Morphing)

Goodput 200± 100 KB/s 33.9± 0.8 KB/s 34± 1 KB/sNetwork bandwidth used 200± 100 KB/s 43.4± 0.8 KB/s 43.2± 0.8 KB/s

Overhead 12%± 1% 28%± 2% 28%± 3%

is a little more than twice that; we incur the extra cost ofsending padding when not enough data is available to fillthe packet size informed by the traffic shaping oracle. Wesee that the naıve traffic shaping method and the enhancedTraffic Morphing method perform very similarly; indeed, theKolmogorov-Smirnov test reports that there is no statisti-cally significant difference between the results of those twomethods (p > 0.5). Do note, however, that this overhead in-cludes no silent periods, i.e., times for which we have no Tortraffic in our buffer, and so everything sent on the wire ispadding. Taking these silent periods into account, the over-head is increased by the current bandwidth usage (in thisexperiment, about 43 KB/s). This is the same behaviour asan ordinary Skype video call; data is transmitted at an ap-proximately constant rate, whether or not the participantsare actually communicating.

8. DISCUSSION AND FUTURE WORKOverhead. As seen in the previous section, we found

that when inter-packet timing is introduced to the TrafficMorphing technique, the traffic becomes less distinguishablefrom Skype traffic (the target distribution), but it also be-comes less effective in reducing the overhead. The overheadin SkypeMorph is highly dependent on how much Tor trafficis available to the proxy. SkypeMorph will always send thesame amount of data as a real Skype video connection would;if there is not that much useful Tor traffic to send, the rest ispadding. Hence, if we experience many silent periods—whenthe proxy’s sending buffer is empty—the overhead grows dueto the padding sent by the proxy.

Mobile Bridges. A side advantage of SkypeMorph isthat bridges can easily change their IP addresses and ports,without having to re-distribute contact information to clientsor the BridgeDB. With SkypeMorph, all a client needs toknow to contact a bridge is its Skype ID. This makes itharder for censors to block bridges, even once they are found.

Skype Protocol. SkypeKit allows peers to exchangestreaming data through the Skype network. However, thedata sent to the Skype network might be relayed by othernodes in the network and this can impose an overhead onthe Skype network, which is not desired. Therefore, we de-liberately chose not to use this feature of SkypeKit. Skype-Morph data is sent directly from the client to the bridge; itis disguised as Skype data, but it is not sent over the Skypenetwork.

Attacks on SkypeMorph. In order to be able to blocka SkypeMorph bridge, the censor either needs to totally banSkype communications, or it has to verify the existence ofSkypeMorph on a remote Skype node. This is only pos-sible if the censor already knows the IP address or SkypeID of the bridge, which we have already excluded from ourthreat model. Also note that although we are not tryingto prevent threats that may arise if the content of a Skype-Morph handshake is disclosed by Skype, it is still possibleto use steganographic methods to hide the handshake ininnocuous-looking messages.

The censoring authority can as well run its own Skype-Morph bridges and distribute its descriptor to users. Eventhough this is possible, users’ privacy is not threatened be-cause of this, as they are still selecting their own relays and

9

Page 10: Skype Morph

connecting to the Tor network. Therefore, the censor canonly detect that a user is connected to its own instance ofSkypeMorph.

SkypeMorph and Other Protocols. Our current Skype-Morph implementation is able to imitate arbitrary encryptedprotocols over UDP. The target protocol, Skype in our case,can be replaced by any encrypted protocol that uses UDP aslong as distributions of packet sizes and inter-arrival timesare available. The source protocol, Tor, can also be replacedby an arbitrary TCP protocol. Note that if Traffic Morphingis going to be used, the morphing matrix needs to be recal-culated for every pair of source and target protocols basedon their distributions. Moreover, the current formulation ofTraffic Morphing is not amenable to higher-order statistics.However, if naıve traffic shaping is used, the system is actu-ally completely independent of the source protocol and it ispossible to mimic higher-order statistics.

Skype Public API. Currently we are using the SkypeKitSDK for our implementation of SkypeMorph. This requiresusers to purchase the package from the Skype developerswebsite and obtain Skype certificates for the keys used bythe SDK. As an alternative, the Skype Public API can beused to communicate with a running Skype application [5].It provides functionalities similar to those of SkypeKit andmakes it possible to use the original Skype software to per-form the handshake over the Skype network without requir-ing users and bridges to purchase additional software.

SkypeMorph Software. Our current proof-of-conceptimplementation targets the Linux operating system. Wewould naturally like to extend the work to support Win-dows and other platforms. The only obstacle is the portredirection described in Section 6.1. This can be handled ei-ther with native firewalling support, or possibly by runningLinux-based software on a Tor-aware home router [10].

Also as discussed in Section 4.4, we plan to experimentwith using the audio and video encoders shipped with theSkype in order to more easily match Skype’s packet size andtiming patterns perfectly.

9. CONCLUSIONSWe have presented SkypeMorph, a pluggable transport

for Tor that disguises client-to-bridge connections as Skypevideo traffic. We present two methods to morph Tor streamsinto traffic with indistinguishable packet sizes and timingsto Skype video; the first method uses naıve traffic shaping toemulate the target distribution, independent of the sourcedistribution. The second method takes the source distribu-tion into account, enhancing Wright et al.’s Traffic Morph-ing [49] to also account for packet timings. The two methodshave statistically similar performance, but the naıve trafficshaping method is much easier to implement, is unaffectedby a changing source distribution, and can match the higher-order patterns in Skype traffic. While our methods are ef-fective at matching the desired distributions, they come atsome cost in extra bandwidth used between the client andthe bridge—but no more so than if an actual Skype videocall were in progress. Our software is freely available, and iseasily adaptable to other encrypted UDP-based target pro-tocols.

AcknowledgementsThe authors wish to thank George Kadianakis for his helpful

discussions and his suggestions on how we can improve ourwork. We also thank NSERC, Mprime, and The Tor Projectfor funding this work.

10. REFERENCES[1] Boost C++ Libraries. http://www.boost.org/.

[Online; accessed April 2012].

[2] Does Skype use encryption? https://support.skype.

com/en/faq/FA31/Does-Skype-use-encryption.[Online; accessed April 2012].

[3] Number of Skype users in 2010.http://about.skype.com/press/2011/05/

microsoft_to_acquire_skype.html. [Online; accessedApril 2012].

[4] OutGuess. http://www.outguess.org/. [Online;accessed April 2012].

[5] Skype Desktop API Reference Manual. http://developer.skype.com/public-api-reference.[Online; accessed April 2012].

[6] Skype Software. http://skype.com. [Online; accessedApril 2012].

[7] SkypeKit.http://developer.skype.com/public/skypekit.[Online; accessed April 2012].

[8] Steghide. http://steghide.sourceforge.net/.[Online; accessed April 2012].

[9] Tor BridgeDB. https://gitweb.torproject.org/bridgedb.git/tree.[Online; accessed April 2012].

[10] Torouter. https://trac.torproject.org/projects/tor/wiki/TheOnionRouter/Torouter, 2011. [Online,accessed May 2012].

[11] J. Appelbaum and N. Mathewson. Pluggabletransports for circumvention. https://gitweb.torproject.org/torspec.git/blob/HEAD:

/proposals/180-pluggable-transport.txt. [Online;accessed April 2012].

[12] S. A. Baset and H. G. Schulzrinne. An Analysis of theSkype Peer-to-Peer Internet Telephony Protocol. InINFOCOM 2006. 25th IEEE International Conferenceon Computer Communications. Proceedings, pages1–11, 2006.

[13] P. Biondi and F. Desclaux. Silver Needle in the Skype.BlackHat Europe, http://www.blackhat.com/presentations/bh-europe-06/

bh-eu-06-biondi/bh-eu-06-biondi-up.pdf, March2006. [Online; accessed April 2012].

[14] G. Bissias, M. Liberatore, D. Jensen, and B. Levine.Privacy Vulnerabilities in Encrypted HTTP Streams.In Privacy Enhancing Technologies, pages 1–11. 2006.

[15] D. Bonfiglio, M. Mellia, M. Meo, N. Ritacca, andD. Rossi. Tracking Down Skype Traffic. In INFOCOM2008. 27th IEEE Conference on ComputerCommunications., pages 261 –265, 2008.

[16] D. Bonfiglio, M. Mellia, M. Meo, D. Rossi, andP. Tofanelli. Revealing Skype Traffic: WhenRandomness Plays with You. In Proceedings of the2007 conference on Applications, technologies,architectures, and protocols for computercommunications, SIGCOMM ’07, pages 37–48, 2007.

[17] S. Burnett, N. Feamster, and S. Vempala. Chippingaway at censorship firewalls with user-generated

10

Page 11: Skype Morph

content. In Proceedings of the 19th USENIXconference on Security, USENIX Security’10, 2010.

[18] R. Chandramouli and N. Memon. Analysis of LSBbased image steganography techniques. In Proceedingsof International Conference on Image Processing,volume 3, pages 1019 –1022, 2001.

[19] CyaSSL. Embedded SSL Library for Applications,Devices, and the Cloud.http://www.yassl.com/yaSSL/Home.html.

[20] R. Dingledine. Iran blocks Tor; Tor releases same-dayfix. https://blog.torproject.org/blog/iran-blocks-tor-tor-releases-same-day-fix,Sept. 2011. [Online; accessed April 2012].

[21] R. Dingledine, N. Mathewson, and P. Syverson. Tor:The Second-Generation Onion Router. In Proceedingsof the 13th conference on USENIX SecuritySymposium, SSYM’04, pages 303–320, 2004.

[22] M. Dusi, M. Crotti, F. Gringoli, and L. Salgarelli.Tunnel Hunter: Detecting application-layer tunnelswith statistical fingerprinting. Computer Networks,53(1):81 – 97, 2009.

[23] M. Enev, S. Gupta, T. Kohno, and S. N. Patel.Televisions, video privacy, and powerlineelectromagnetic interference. In Proceedings of the18th ACM conference on Computer andcommunications security, CCS ’11, pages 537–550,New York, NY, USA, 2011. ACM.

[24] N. Hopper, L. von Ahn, and J. Langford. ProvablySecure Steganography. IEEE Transactions onComputers, 58(5):662 –676, 2009.

[25] A. Houmansadr, G. T. Nguyen, M. Caesar, andN. Borisov. Cirripede: Circumvention Infrastructureusing Router Redirection with Plausible Deniability.In Proceedings of the 18th ACM conference onComputer and communications security, CCS ’11,pages 187–200, 2011.

[26] N. F. Johnson, Z. Duric, and S. Jajodia. Informationhiding: Steganography and watermarking—attacks andcountermeasures. Kluwer Academic Publishers, 2000.

[27] G. Kadianakis. Morpher.https://gitorious.org/morpher/morpher.

[28] J. Karlin, D. Ellard, A. W. Jackson, C. E. Jones,G. Lauer, D. P. Mankins, and W. T. Strayer. DecoyRouting: Toward Unblockable InternetCommunication. In Proceedings of the USENIXWorkshop on Free and Open Communications on theInternet (FOCI 2011), August 2011.

[29] M. Krzywinski. Port knocking: Networkauthentication across closed ports. SysAdminMagazine, 12(6):12–17, 2003.

[30] A. Langley. curve25519-donna.http://code.google.com/p/curve25519-donna/.[Online; accessed April 2012].

[31] C. S. Leberknight, M. Chiang, H. V. Poor, andF. Wong. A Taxonomy of Internet Censorship andAnti-Censorship. http://www.princeton.edu/~chiangm/anticensorship.pdf. [Online; accessedApril 2012].

[32] A. Lewman. China blocking Tor: Round Two.https://blog.torproject.org/blog/

china-blocking-tor-round-two, Mar. 2010. [Online;accessed April 2012].

[33] M. Liberatore and B. N. Levine. Inferring the sourceof encrypted HTTP connections. In 13th ACMconference on Computer and communications security,pages 255–263, 2006.

[34] N. Mathewson. The Tor Project, A simple obfuscatingproxy.https://gitweb.torproject.org/obfsproxy.git.[Online; accessed April 2012].

[35] W. Mazurczyk, P. Szaga, and K. Szczypiorski. UsingTranscoding for Hidden Communication in IPTelephony. ArXiv e-prints, Nov. 2011.

[36] J. McLachlan and N. Hopper. On the risks of serving

whenever you surf: Vulnerabilities in ToraAZsblocking resistance design. In Proceedings of the 8thACM workshop on Privacy in the electronic society,WPES ’09, pages 31–40, 2009.

[37] S. J. Murdoch. Moving Tor to a datagram transport.https://blog.torproject.org/blog/

moving-tor-datagram-transport. [Online; accessedApril 2012].

[38] NIST/SEMATECH. e-Handbook of StatisticalMethods. http://www.itl.nist.gov/div898/handbook/index.htm.[Online; accessed April 2012].

[39] F. Petitcolas, R. Anderson, and M. Kuhn. InformationHiding - A Survey. Proceedings of the IEEE, specialissue on protection of multimedia content, pages 1062–1078, 1999.

[40] I. M. Program. Persian cyberspace report: internetblackouts across Iran;. http://iranmediaresearch.com/en/blog/101/12/02/09/840. [Online; accessedApril 2012].

[41] T. S. Saponas, J. Lester, C. Hartung, S. Agarwal, andT. Kohno. Devices that tell on you: privacy trends inconsumer ubiquitous computing. In Proceedings of16th USENIX Security Symposium on USENIXSecurity Symposium, SS’07, pages 5:1–5:16, Berkeley,CA, USA, 2007. USENIX Association.

[42] G. J. Simmons. The Prisoners Problem and theSubliminal Channel. In Advances in Cryptology:CRYPTO 1983, pages 51–67. Springer, 1984.

[43] R. Smits, D. Jain, S. Pidcock, I. Goldberg, andU. Hengartner. BridgeSPA: improving Tor bridgeswith single packet authorization. In Proceedings of the10th annual ACM workshop on Privacy in theElectronic Society, WPES ’11, pages 93–102, 2011.

[44] K. Suh, D. R. Figueiredo, J. Kurose, and D. Towsley.Characterizing and Detecting Skype-Relayed Traffic.In INFOCOM 2006. Proceedings of 25th IEEEInternational Conference on ComputerCommunications., 2006.

[45] A. White, A. Matthews, K. Snow, and F. Monrose.Phonotactic Reconstruction of Encrypted VoIPConversations: Hookt on fon-iks. In IEEE Symposiumon Security and Privacy (SP), pages 3 –18, 2011.

[46] T. Wilde. Knock Knock Knockin’ on Bridges’ Doors.https://blog.torproject.org/blog/

knock-knock-knockin-bridges-doors, Jan. 2012.[Online; accessed April 2012].

[47] P. Winter and S. Lindskog. How China Is BlockingTor. Technical report, Karlstad University, 2012.

11

Page 12: Skype Morph

0 0.005 0.01 0.015 0.02 0.0250

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Delay (Seconds)

Em

piric

al C

DF

(a) Second-order statistics (size - delay)

0 200 400 600 800 1000 1200 14000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Packet Size (Bytes)

Em

piric

al C

DF

(b) Second-order statistics (delay - size)

Figure 7: Second-order statistics of a Skype video call.

[48] C. Wright, L. Ballard, S. Coull, F. Monrose, andG. Masson. Spot Me if You Can: Uncovering SpokenPhrases in Encrypted VoIP Conversations. In IEEESymposium on Security and Privacy (SP), pages 35–49, 2008.

[49] C. Wright, S. Coull, and F. Monrose. TrafficMorphing: An efficient defense against statisticaltraffic analysis. In Proceedings of the Network andDistributed Security Symposium - NDSS ’09. IEEE,February 2009.

[50] C. V. Wright, L. Ballard, F. Monrose, and G. M.Masson. Language Identification of Encrypted VoIPTraffic: Alejandra y Roberto or Alice and Bob? InProceedings of 16th USENIX Security Symposium,2007.

[51] E. Wustrow, S. Wolchok, I. Goldberg, and J. A.Halderman. Telex: Anticensorship in the NetworkInfrastructure. In Proceedings of the 20th USENIXSecurity Symposium, 2011.

APPENDIXIn this section we examine the higher-order statistics inSkype video calls.

• Second-order Statistics: We explored the space ofsecond-order statistics in Skype calls by measuring or-dered pairs (si, ti) where si is the size of a packet senton the wire and ti is the delay until the next packet.

0 200 400 600 800 1000 1200 14000

0.2

0.4

0.6

0.8

1

Packet Size (Bytes)

Em

piric

al C

DF

(a) Third-order statistics (size - delay - size)

0 0.005 0.01 0.015 0.02 0.0250

0.2

0.4

0.6

0.8

1

Delay (Millisecond)E

mpiric

al C

DF

(b) Third-order statistics (delay - size - delay)

Figure 8: Third-order statistics of a Skype video call, for afixed delay in Figure 8a and a fixed size in Figure 8b.

For all possible packet sizes the empirical shown in Fig-ure 7a.5

The same experiment was repeated with all (ti, si) pairswith ti being the delay and si the packet size sent on thewire after ti milliseconds. As Figure 7b shows, smallerdelays tend to result in larger packet sizes followingthem.

• Third-order Statistics: For third-order statistics weconsidered tuples (si, ti, s

′i), where si and s′i are consec-

utive packet sizes and ti is the delay between the twopackets. Thus, for all possible s′i we formed the dis-tribution P [s′i|(si, ti)] and compared them. Figure 8ashows this distribution for a fixed ti, demonstratingthat there are third-order statistics not explained bythe second-order statistics.

Similarly, we also considered tuples of the form (ti, si, t′i).

Figure 8b shows the distribution of P [t′i|(ti, si)] for afixed si, again revealing nontrivial third-order statis-tics.

• Higher-order Statistics: Extending this method tohigher orders is a straightforward task and dependingon the precision needed, we can find these statistics upto the desired order. This allows us to fully mimic thetraffic characteristics of a Skype call.

5As the space of all possible tuples was huge, we binned thepacket sizes and delays to make the figure easier to read.

12