ad-hoc secure two-party computation on mobile devices

17
This paper is included in the Proceedings of the 23rd USENIX Security Symposium. August 20–22, 2014 • San Diego, CA ISBN 978-1-931971-15-7 Open access to the Proceedings of the 23rd USENIX Security Symposium is sponsored by USENIX Ad-Hoc Secure Two-Party Computation on Mobile Devices using Hardware Tokens Daniel Demmler, Thomas Schneider, and Michael Zohner, Technische Universität Darmstadt https://www.usenix.org/conference/usenixsecurity14/technical-sessions/presentation/demmler

Upload: others

Post on 19-Dec-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

This paper is included in the Proceedings of the 23rd USENIX Security Symposium.

August 20–22, 2014 • San Diego, CA

ISBN 978-1-931971-15-7

Open access to the Proceedings of the 23rd USENIX Security Symposium

is sponsored by USENIX

Ad-Hoc Secure Two-Party Computation on Mobile Devices using Hardware Tokens

Daniel Demmler, Thomas Schneider, and Michael Zohner, Technische Universität Darmstadt

https://www.usenix.org/conference/usenixsecurity14/technical-sessions/presentation/demmler

USENIX Association 23rd USENIX Security Symposium 893

Ad-Hoc Secure Two-Party Computation onMobile Devices using Hardware Tokens

Daniel Demmler, Thomas Schneider, and Michael Zohner

Technische Universitat Darmstadt, Germany{daniel.demmler,thomas.schneider,michael.zohner}@ec-spride.de

AbstractSecure two-party computation allows two mutually dis-trusting parties to jointly compute an arbitrary functionon their private inputs without revealing anything but theresult. An interesting target for deploying secure compu-tation protocols are mobile devices as they contain a lotof sensitive user data. However, their resource restrictionmakes the deployment of secure computation protocols achallenging task.

In this work, we optimize and implement thesecure computation protocol by Goldreich-Micali-Wigderson (GMW) on mobile phones. To increase per-formance, we extend the protocol by a trusted hardwaretoken (i.e., a smartcard). The trusted hardware token al-lows to pre-compute most of the workload in an initial-ization phase, which is executed locally on one deviceand can be pre-computed independently of the later com-munication partner. We develop and analyze a proof-of-concept implementation of generic secure two-partycomputation on Android smart phones making use ofa microSD smartcard. Our use cases include privateset intersection for finding shared contacts and privatescheduling of a meeting with location preferences. Forprivate set intersection, our token-aided implementationon mobile phones is up to two orders of magnitude fasterthan previous generic secure two-party computation pro-tocols on mobile phones and even as fast as previouswork on desktop computers.

1 Introduction

Secure two-party computation allows two parties to pro-cess their sensitive data in such a way that its privacyis protected. In the late eighties, Yao’s garbled cir-cuits protocol [Yao86] and the protocol of Goldreich-Micali-Wigderson (GMW) [GMW87] showed the fea-sibility of secure computation. However, secure com-putation was considered to be mostly of theoretical in-

terest until the Fairplay framework [MNPS04] demon-strated that it is indeed practical. Since then, many op-timizations have been proposed and several frameworkshave implemented Yao’s garbled circuits protocol (e.g.,FastGC [HEKM11]) and the GMW protocol (e.g., theframework of [CHK+12]) on desktop PCs.

Motivated by the advances of secure computationon desktop PCs, researchers have started to investigatewhether secure computation can also be performed inthe mobile domain. Mobile devices, in particular smart-phones, are an excellent environment for secure compu-tation, since they accompany users in their daily lives andtypically hold contact information, calendars, and pho-tos. Users also store sensitive data, such as passwordsor banking information on their devices. Moreover, typ-ical smartphones are equipped with a multitude of sen-sors that collect a lot of sensitive information about theirusers’ contexts. Therefore, it is of special importance toprotect the privacy of data handled in the mobile domain.

In contrast to desktop PCs, mobile devices are ratherlimited in computational power, available memory, com-munication capabilities, and most notably battery life.Although mobile phones have seen an increase in pro-cessing speed over the past years, they are still about oneorder of magnitude slower than typical desktop comput-ers when evaluating cryptographic primitives (cf. §5.4).These differences are due to the CPU architectures hav-ing a more restrictive instruction set and being opti-mized for low power consumption rather than perfor-mance, since mobile devices are battery-powered andlack active cooling. Moreover, the limited size of themain memory requires the programmer to carefully han-dle data objects in order to avoid costly garbage col-lections on Java-based Android smartphones. Networkconnections of mobile devices are almost exclusively es-tablished via wireless connections that have lower band-width and higher, often varying latency compared towired connections. Tasks that are computationally in-

894 23rd USENIX Security Symposium USENIX Association

tensive or require long send/receive operations should beavoided when a mobile device is running on battery, assuch tasks quickly drain the battery charge and therebyreduce the phone’s standby time. Instead, such oper-ations could be pre-computed when the mobile deviceis connected to a power source, which usually happensovernight. These limitations pose a big challenge for effi-cient secure computation and cause generic secure com-putation protocols to be several hundred times slower onmobile devices than on desktop PCs [HCE11], even inthe semi-honest adversary model.

To enable secure two-party computation in the mobiledomain, solutions have been developed that outsourcesecure computation to the cloud, e.g., [KMR12, Hua12,CMTB13]. However, recent events have shown thatcloud service providers can be forced to give away datato third parties that are not necessarily trusted, such asforeign government agencies. Even if the employed pro-tocols ensure that the cloud provider learns no informa-tion about the users’ sensitive data, he can still learn andhence be forced to reveal meta-information such as thefrequency of access, communication partners involved,the computed function, or the size of the inputs. More-over, these server-aided approaches require the mobiledevice to be connected to the Internet which might not bepossible in every situation or may cause additional costs.

An alternative solution, which we also use in thiswork, is to outsource expensive operations to a trustedhardware token that has very limited computational re-sources and is locally held by one of the communica-tion partners.1 Such hardware tokens are increasinglybeing adopted in practice, e.g., trusted platform modules(TPMs). Their adoption is particularly noteworthy onmobile devices in the form of smartcards that are the ba-sis for subscriber identity modules (SIM cards), as wellas for mobile payment or ticketing systems. A first ap-proach for outsourcing Yao’s garbled circuits protocol tosuch a trusted hardware token was proposed in [JKSS10].However, this protocol requires the function to be knownin advance and uses costly symmetric cryptographic op-erations during the online phase. We give an alternativesolution that removes these drawbacks.

1.1 Outline and Our Contributions

In this work, we introduce a scheme for token-aided ad-hoc generic secure two-party computation on mobile de-vices based on the GMW protocol. After introducingpreliminaries (§2) we detail our setting and trust assump-tions that are similar to the ones in a TPM scenario (§3).We outline how a trusted hardware token can be used

1This locality is also a security feature, as external adversaries eitherneed to corrupt the token before it is shipped to the user or later getphysical access to break into it.

to shift major parts of the workload into an initializa-tion phase that can be pre-computed on the token, inde-pendently of the later communication partner (§4), e.g.,while the mobile device is charging. We thereby obtaina token-aided scheme that is well-suited for efficient anddecentralized (ad-hoc) secure computation in the mobiledomain. We implement and evaluate our scheme (§5)and demonstrate its performance using typical securecomputation applications for mobile devices, such as se-curely scheduling a meeting with location preferencesand privacy-preserving set intersection (§6). We com-pare our scheme to related work (§7) and conclude andpresent directions for future work (§8). More detailed,our contributions are as follows.

Token-Aided Ad-Hoc Secure Two-Party Computa-tion on Mobile Devices (§4) We develop a token-aidedsecure computation protocol which offloads the mainworkload of the GMW protocol to a pre-computationphase by introducing a secure hardware token T , held byone party A (cf. §3). T is issued by a trusted third partyand provides correlated randomness [Hua12, Chap. 6]to both parties that is later used in the secure computa-tion protocol. To prepare the secure computation, theother party B obtains seeds for his part of the correlatedrandomness from T via an encrypted channel. To fur-ther increase flexibility, we describe how to make thepre-computation independent of the size of the evaluatedfunction | f |, at the cost of a t · log2 | f | factor communi-cation overhead between T and B, where t is the sym-metric security parameter. In contrast to Yao-based ap-proaches [MNPS04,JKSS10,HCE11,HEK12] and previ-ous realizations of the GMW protocol [CHK+12, SZ13,ALSZ13], our protocol offers several benefits as summa-rized in Tab. 1 (cf. §4.5 for details).

Table 1: Comparison with related work.

Property Yao Token Yao GMW Ours[HCE11] [JKSS10] [CHK+12] §4

f unknown ininit phase

� � � �

ad-hoc com-munication� t · | f |

� � � �

� | f | cryptooperations inad-hoc phase

� � � �

Implementation (§5) We implement our token-aidedprotocol for semi-honest participants and evaluate itsperformance using two consumer-grade Android smart-phones and an off-the-shelf smartcard. Thereby, we pro-vide an estimate for the achievable runtime of generic se-

USENIX Association 23rd USENIX Security Symposium 895

cure computation in the mobile domain. Our implemen-tation enables a developer to specify the functionalityas a Boolean circuit, which can, for instance, be gener-ated from a high-level specification language. We showthat the performance of our token-aided pre-computationphase is comparable to interactively generating the cor-related randomness using oblivious transfer.

Applications (§6) We demonstrate the practical feasi-bility of the GMW protocol on mobile devices by per-forming secure two-party computation on two smart-phones using various privacy-preserving applicationssuch as availability scheduling (§6.1), location-awarescheduling (§6.2), and set-intersection (§6.3). Mostnotably, for private set-intersection, our token-aidedscheme outperforms related work that evaluates genericsecure computation schemes on mobile devices [HCE11]by up to two orders of magnitude and has a performancethat is comparable with secure computation schemes thatare executed in a desktop environment [HEK12].

2 Preliminaries

In the following, we define our notation (§2.1) and thead-hoc scenario (§2.2), and give an overview of oblivi-ous transfer (§2.3) and the GMW protocol (§2.4). We de-scribe Yao’s garbled circuits in the full version [DSZ14].

2.1 NotationWe denote the two parties that participate in the securecomputation as A and B. We use the standard notationfor bitwise operations, i.e., x⊕ y denotes bitwise XOR,x∧ y bitwise AND, and x||y the concatenation of two bitstrings x and y. We refer to the symmetric security pa-rameter as t and the function to be evaluated as f .

2.2 Ad-Hoc ScenarioIn an ad-hoc secure two-party computation scenario, twoparties that do not necessarily know each other in ad-vance want to spontaneously perform secure computa-tion of an arbitrary function f on their private inputs xand y. Traditionally, secure computation protocols con-sist of two interactive phases: the setup phase (indepen-dent of x and y) and the online phase. We extend thissetting by a local init phase as depicted in Fig. 1.

The init phase takes place at any time before theparties have identified each other and is used for pre-processing. In the setup phase, the parties have deter-mined their communication partner, establish a commu-nication channel, and know an upper bound on the func-tion size | f |. In the online phase, the parties provide theirprivate inputs x and y to the function f that they want to

A B

Init Phase Setup Phase Online Phase

A B...?Af(x,y)x y

f

Figure 1: The three secure computation phases.

evaluate and begin the secure computation. The ad-hoctime is the combined time for setup and online phase.

2.3 Oblivious Transfer

Oblivious transfer (OT) is a fundamental building blockfor secure computation. In an OT protocol [Rab81], thesender inputs two strings (s0,s1). The receiver inputs abit c ∈ {0,1} and obtains sc as output without reveal-ing to the sender which of the two messages he choseand without the receiver learning any information abouts1−c. OT protocols, such as [NP01], require public-keycryptography and make OT a relatively costly operation.OT extension [IKNP03] allows to increase the efficiencyof OT by extending a small number of t base OTs toa large number n � t of OTs whilst only using O(n)symmetric cryptographic operations. Optimizations tothe OT extension protocol of [IKNP03] were suggestedin [ALSZ13], which allow the parties to reduce theamount of data sent per OT. Moreover, [ALSZ13] de-scribes a more efficient variant of the OT extension pro-tocol for computing random OT, where the sender ob-tains two random values as output of the OT protocol.

2.4 The GMW Protocol

In the GMW protocol [GMW87], two (or more) partiescompute a function f , represented as Boolean circuit ontheir private inputs by secret sharing their inputs usingan XOR secret sharing scheme and evaluating f gate bygate. Each party can evaluate XOR gates locally by com-puting the XOR of the input shares. AND gates, on theother hand, require the parties to interact with each otherby either evaluating an OT or by using a multiplicationtriple [Bea91] as shown in the full version [DSZ14]. Fi-nally, all parties send the shares of the output wires to theparty that shall obtain the function output. The main costfactors in GMW are the total number of AND gates inthe circuit, called (multiplicative) size | f |, and the high-est number of AND gates between any input wire andany output wire, called (multiplicative) depth d( f ).

Because an interactive OT is required for each ANDgate, it was believed that GMW is very inefficient com-pared to Yao’s garbled circuits. However, in [CHK+12]it was shown that by using OT extension [IKNP03]and OT pre-computation [Bea95] many OTs can be

896 23rd USENIX Security Symposium USENIX Association

pre-computed efficiently in an interactive setup phase.Thereby, all use of symmetric cryptographic operationsis shifted to the setup phase, leaving only efficient one-time pad operations for the online phase. Additionally,the setup phase only requires an upper bound on | f |to be known before the secure computation. Follow-upwork of [SZ13] demonstrated that, by using OT to pre-compute multiplication triples in the setup phase, the on-line phase can be further sped up. Multiplication triplesare random-looking bits ai,bi, and ci, for i ∈ {A,B}, sat-isfying (cA⊕ cB)= (aA⊕aB)∧ (bA⊕bB), that are heldby the respective parties and used to mask private dataduring the secure computation. This masking is donevery efficiently, since no cryptographic operations are re-quired. In [ALSZ13] it was shown that multiplicationtriples can be generated interactively using two randomOTs. [Hua12] proposed to let a trusted server generatethe multiplication triples and send (ai,bi,ci) to party iover a secure channel via the Internet. In our work, wepropose to do this locally, without knowing the commu-nication partner in advance.

3 Our Setting

In our setting, depicted in Fig. 2, we focus on ef-ficient ad-hoc secure computation between two semi-honest (cf. §3.1) parties A and B who each hold a mo-bile device, which are approximately equally powerfulbut significantly weaker than typical desktop computersystems. The parties’ devices are connected via a wire-less network and are battery-powered.

T

A B

Figure 2: The parties involved in the secure computation.

A holds a general-purpose tamper-proof hardware to-ken T that has very few computational resources. T ispowered by A, and its functionalities are limited to thestandard functionalities described in §3.2. A and T areconnected via a physical low-bandwidth connection andcommunicate via a fixed interface. B and T communi-cate via A, i.e., every message that B and T exchange, isseen and relayed by A. Note that this directly requires allcommunication between B and T to be encrypted suchthat it cannot be read by A. We assume that T behavessemi-honestly, and is issued by a third party, external toand trusted by both A and B (cf. §3.2).

3.1 Adversary Model

We assume that both parties behave semi-honestly inthe online phase, i.e., they follow the secure computa-tion protocol, but may try to infer additional informationabout the other party’s inputs from the observed mes-sages. To the best of our knowledge, all previous work onsecure computation between two mobile phones is basedon the semi-honest model (cf. §7.1). The semi-honestmodel is suitable in scenarios where the parties want toprevent inadvertent information leakage and for deviceswhere the software is controlled by a trusted party (e.g.,business phones managed by an IT department) or wherecode attestation can be applied. Moreover, this modelgives an estimate on the achievable performance of se-cure computation. We outline how to extend our protocolto malicious security in the full version [DSZ14].

3.2 Trusted Hardware Token

We use the term trusted hardware token T to refer toa tamper-proof, programmable device, such as a Javasmartcard, that offers a restricted set of functionali-ties. Such functionalities include, for instance, hash-ing, symmetric and asymmetric encryption/decryption,secure storage and secure random number generation.A detailed summary of standard smartcard functionali-ties is given in [HL08]. The hardware token is passive,i.e., it cannot initiate a communication by itself and onlyresponds to queries from its host. It contains both per-sistent and transient memory. T is physically protectedagainst attacks and is securely erased if it is opened byforce. Each token holds an asymmetric key pair, similarto an endorsement key used in TPMs [TCG13], wherethe public key is certified by a known trusted third partyand allows unique identification of T .

Tiny Trusted Third Party T acts as a tiny trustedthird party that behaves semi-honestly. This assumptionis similar to the TPM model that is widely used in desk-top environments. T only provides correlated random-ness that is later used in the secure computation and doesnever receive any of A or B’s private inputs. We assumethat only certified code is allowed to be executed on T ,and that T can only actively deviate from the protocolif the hardware token’s manufacturer programmed it tobe malicious. We assume the code certification was car-ried out by a trusted third party, and argue that both themanufacturer and the certification authority would facesevere reputation loss if it was discovered that they builtbackdoors into their products. Moreover, we assume thatneither A nor B colludes with the hardware token man-ufacturer. This non-collusion assumption is a commonrequirement for outsourced secure computation schemes

USENIX Association 23rd USENIX Security Symposium 897

such as [Hua12,KMR12,CMTB13] and enables the con-struction of efficient protocols. Finally, note that, al-though T is in A’s possession, A cannot easily corrupt Tor obtain its internal information, since T is assumed tobe tamper-proof and does not reveal internal secrets, i.e.,the costs of an attack are higher than the benefits frombreaking T ’s security. This assumption also holds if Acolludes with or impersonates B.

Protection Against Successful Hardware AttacksA malicious adversary could try to break into the hard-ware token. If such an attack is successful, the follow-ing standard countermeasures can be used to prevent fur-ther damage: A binding between token and key pair canbe realized by using techniques such as physically un-cloneable functions (PUFs), however, we are not awareof solutions that are available in commercial products.To bind a token to a certain mobile device or person,T ’s certificate could be personalized with one or mul-tiple values that are unique per user and that can be ver-ified over an off-band channel, such as the user’s tele-phone number or the ID of the user’s passport. Anotherline of defense can be certificate revocation lists (CRLs)that allow the users to check if a token is known to becompromised or malicious.

4 Token-aided Mobile GMW

In the following section, we give details on our token-aided GMW-based protocol on mobile devices. Our goalis to minimize the ad-hoc time, i.e., the time from es-tablishing the communication channel between A and Buntil receiving the results of the secure computation. Weconsider the init phase to not be time critical, but we tryto keep its computational overhead small.

A A B

InitvPhaseMTvGen.v(§4.1)

SetupvPhaseSeedvTransferv(§4.2)

OnlinevPhaseCircuitvEvaluation

T T T

A B...

Figure 3: The three phases, workload distribution, andcommunication in our token-aided scheme.

An overview of our protocol is given in Fig. 3. Thegeneral idea is to let the hardware token generate mul-tiplication triples from two (or more) seeds in the initphase that are independent of the later communicationpartner (§4.1). In the setup phase, T then sends oneseed to A and the other seed over an encrypted channel

to B (§4.2). The token thereby replaces the OT protocolin the setup phase and allows pre-computing the multipli-cation triples independently of the communication part-ner. The online phase of the GMW protocol remains un-changed. In order to overcome the restriction that thefunction size needs to be known in advance, we describea method that pre-computes several multiplication triplesequences of different size and only adds a small com-munication overhead in the setup phase (§4.3). Finally,we analyze the security of our protocol (§4.4) and com-pare its performance to previous solutions (§4.5).

4.1 Multiplication Triple Pre-Computationin the Init Phase

In the original GMW protocol, A and B interactivelycompute their multiplication triples (an

A,bnA,c

nA) and

(anB,b

nB,c

nB) in the setup phase using 2n random OT ex-

tensions (cf. §2.4). Instead, we avoid this overhead inthe setup phase and let T pre-compute the multiplica-tion triples in the init phase as shown in Fig. 4: T firstgenerates random seeds and then expands these seeds in-ternally into the multiplication triples and sends cn

A to A.

d

sB

cnBbnBanB

kB

sA

bnAanA

kA

cnA

T

A

cnA

Seed Expansion

Seed Generation

Figure 4: Multiplication triple pre-generation in the initphase between A and T .

Seed Generation In the seed generation step, T gen-erates two seeds sA = GkA(d) and sB = GkB(d) us-ing a cryptographically strong Pseudo-Random Gener-ator (PRG) G, two master keys kA and kB, and a statevalue d, which is unique per multiplication triple se-quence and can be instantiated with a counter. The twomaster keys kA and kB are constant for all multiplica-tion triple sequences and have to be generated and storedonly once. Thereby, T has to store only the uniquestate value d in its internal memory for every multipli-cation triple sequence. Note that the only values that willleave the internal memory of T are the seeds sA and sB

898 23rd USENIX Security Symposium USENIX Association

that will be sent in the setup phase to A and B, respec-tively (cf. §4.2). In order to ensure that sB is not sent outtwice, we require sA to be queried before sB and deletethe state value d as soon as sB has been sent out. A secu-rity analysis of this scheme is given in §4.4.

Seed Expansion The seed expansion step com-putes a valid multiplication triple sequence from theseeds sA and sB by computing (an

A,bnA) = GsA(dA) and

(anB,b

nB,c

nB) = GsB(dB) and setting the remaining value

cnA = (an

A⊕anB)∧ (bn

A⊕bnB)⊕ cn

B, where dA and dB arepublicly known state values of A and B, respectively.Due to the limited memory of the hardware token, the se-quence cn

A is computed block-wise such that T requiresonly a fixed amount of memory, independently of n, andeach block is sent to A, who stores it locally. Note thatthe values (an

A,bnA,a

nB,b

nB,c

nB) do not need to be stored,

since they can be expanded from sA and sB, respectively.

4.2 Seed Transfer in the Setup PhaseIn the setup phase, the hardware token sends the seeds sAand sB to A and B, respectively, and the parties generatetheir multiplication triples as depicted in Fig. 5. A ob-tains his seed sA directly from T and can read the se-quence cn

A, which was obtained in the init phase, fromits internal flash storage. B’s seed sB, on the other hand,cannot be sent in plaintext from T to B as the communi-cation between the token and B is relayed over A, whichwould allow A to intercept sB. We therefore require thecommunication between B and T to be encrypted and Tto authenticate itself to B with a certificate.

d

sB

kB

sA

kA

T

A

sA

bnAanA

B

sB

bnBanB cnBcnA

Figure 5: Seed transfer and seed expansion in the setupphase. sB is sent from T to B over a secure channel.

An encrypted and one-way authenticated communica-tion channel can be established using a key agreementprotocol from a wide variety of choices (cf. [MvOV96]).We choose two protocols that allow us to handle dif-ferent attacker models: For security against a malicious(active) A we use TLS [IET08] (with RSA for public-key crypto, AES for symmetric encryption, and HMAC

as message authentication code) and for security againsta semi-honest (passive) A we use KAS1-basic [NIS09](with AES for symmetric encryption), cf. the full ver-sion [DSZ14] for details. Both schemes use T ’s public-key certificate that is signed by a trusted third party. Forevery new connection this certificate is verified by B andoptionally checked against a CRL and/or is checked to beconsistent with A’s identity over an out-of-band channelto protect against hardware attacks (cf. §3.2).

4.3 Multiplication Triple Composition

The multiplication triple generation described until nowrequires the function size n = | f | to be known before-hand. While this may be the case for some functions,e.g., for set intersection using bitwise AND (cf. §6.1),the size of other functions depends on the number ofinputs, e.g., the number of contacts in the addressbook (cf. §6.3). The naive solution to not know-ing n in advance would be to generate several mul-tiplication triple sequences of fixed size � in the initphase and send their �n/�� seeds in the setup phase,when n is known. However, on average this approachwastes �/2 multiplication triples and requires to send�n/�� multiplication triple seeds. Thus, a smaller �results in fewer wasted multiplication triples but morecommunication overhead, while a higher � results inmore wasted multiplication triples but less communica-tion. Since typical function sizes in secure computationrange from millions [HEKM11] to even a billion ANDgates [CMTB13], an appropriate � is difficult to choose.

Instead of generating fixed-length blocks of multipli-cation triple sequences, we propose to generate m mul-tiplication triple sequences s0, ...,sm−1 in the init phase,where si contains 2i (0 ≤ i < m) multiplication triples.In the setup phase, we then send a set of multiplicationtriple seeds {sk |nk = 1}, where nk is the k-th bit of n.This approach requires sending at most �log2 n� seeds.As communication between T and A is the bottleneck inour implementation, we set the smallest size of a multi-plication triple sequence such that it fits into one packet.

4.4 Security Analysis

In this section, we briefly analyze the security of our pro-tocol for each of the three secure computation phases.

Init Phase In the init phase no private inputs are in-volved and B is unknown. Therefore, A can only try tomanipulate the token, which is hard since the hardwaretoken is tamper-proof. Moreover, A receives only its cAshares that do not reveal anything about B’s shares or T ’sinternal state, due to the cryptographically strong PRG.

USENIX Association 23rd USENIX Security Symposium 899

Setup Phase The only attack a malicious A could playin the setup phase, is to impersonate B. This attackis prevented, since every seed sB can only be queriedonce (cf. §4.1). The communication between the hard-ware token and B is done through an encrypted channel,so that A cannot get access to those messages. For activesecurity, we use TLS and add a MAC to every packet toprevent modifications and avoid replay attacks. B can-not actively attack the token since all communication tothe hardware token is controlled by A. Obviously, anyparty can drop or ignore messages, but we exclude thissimple denial of service attack from our system modelsince we assume both parties to be willing to participatein the secure computation. The seeds that each party ob-tains from the hardware token do also not reveal any ad-ditional information since they are directly output froma cryptographically strong PRG to which the hardwaretoken’s internal state is used as seed.

Online Phase The security for the online phase di-rectly carries over from the GMW protocol, as we donot introduce any modifications to this phase.

4.5 Performance Comparison

We show that the asymptotic performance of our protocolimproves over existing solutions. A summary is shown inTab. 1 on page 2 and a more detailed comparison is givenin the full version [DSZ14]. An experimental evaluationof our protocol is provided in §5.4 and its performanceon applications is evaluated in §6.

Asymptotic Performance The init phase of our pro-tocol is, unlike [JKSS10], independent of a concrete in-stance of f and can thus be pre-computed without know-ing a communication partner. During the setup phase, thecommunication complexity of our protocol is only O(t)(or O(t · log2 | f |) if | f | is unknown), which improvesupon the communication of Yao’s protocol and the GMWprotocol [CHK+12, SZ13, ALSZ13] with O(t · | f |) com-munication. Both parties have to do O(| f |/b) sym-metric cryptographic operations to expand their seeds.2

The online phase is the first phase where f needs to beknown. Here, A and B send O(| f |) bits in d( f ) rounds,where d( f ) is the depth of f . The parties’ computationcomplexity is negligible, as no cryptographic operationsare evaluated. This is the biggest advantage over Yao’sgarbled circuits protocol [MNPS04, JKSS10, HCE11],where O(| f |) symmetric cryptographic operations haveto be evaluated during the online phase.

2In our implementation, we instantiate the PRG G with AES-128-CTR, which has block size b = 128.

Concrete Performance For 80 bit security, the bestknown instantiation of Yao’s garbled circuits protocol(resp. the GMW protocol) require per AND gate 240 bit(resp. 164 bit) communication and 4+ 1 (resp. 12+ 0)evaluations of symmetric cryptographic primitives in thesetup+online phase. In comparison, our solution re-quires only 4 bit communication and 0.04+ 0 fixed-keyAES operations per AND gate.

5 Implementation

This section details the implementation of our scheme.We introduce the smartcard that we use to instantiatethe hardware token (§5.1), give an overview of our An-droid implementation (§5.2), outline our benchmarkingenvironment (§5.3), and experimentally compare the OTextension-based multiplication triple generation to ourhardware token-based protocol (§5.4).

5.1 G&D Mobile Security CardIn our implementation we instantiate the trusted hard-ware token T with the Giesecke & Devrient (G&D) Mo-bile Security Card SE 1.0 (MSC). It is embedded into amicroSD card that additionally contains 2 GB of separateflash memory. The MSC is based on an NXP SmartMXP5CD080 micro-controller that runs at a maximum fre-quency of 10 MHz, has 6 kB of RAM, 80 kB of persistentEEPROM, and is based on Java Card version 2.2.2. Notethat an applet can only use 1,750 Bytes of the 6 kB RAMfor transient storage. The MSC has co-processors for3DES, AES and RSA that can be called from a JavaCard applet, as well as native routines for MD5, SHA-1and SHA256. The MSC runs the operating system G&DSm@rtCafe Expert 5.0 which manages the installed JavaCard applets, personalization data, and communicationkeys. The communication between the Android operat-ing system and the MSC is done by a separate service viathe SIMalliance Open Mobile API.

5.2 ArchitectureThe architecture of our implementation is depicted inFig. 6. To support flexibility and extensibility, our modu-lar architecture consists of the Application that specifiesthe functionality, the GMWService that performs securecomputation, and the MTService that performs the mul-tiplication triple generation and transfer. All communi-cation between A and T is done via the MSC Smart-card Service supplied by G&D. The Application can beimplemented by a designer and specifies the desired se-cure computation functionality as a Boolean circuit thatcan, for instance, be compiled from a high-level cir-cuit description language such as the Secure Function

900 23rd USENIX Security Symposium USENIX Association

Definition Language (SFDL) [MNPS04, MLB12] or thePortable Circuit Format (PCF) [KMSB13].

The GMWService implements the GMW protocol andperforms the secure computation, given a circuit descrip-tion and corresponding inputs. The MTService generatesthe multiplication triples using either OT extension (OT-Ext) based on the memory efficient implementationof [HS13] including the optimizations from [ALSZ13]or, if one of the parties holds a hardware token, ourtoken-aided protocol of §4. If a hardware token ispresent, the MTService manages the multiplication triplegeneration during the init phase by querying the tokenand storing the received cA sequences. For the MSC, themultiplication triple generation on T is performed viaa Java Card applet (MT JC Applet) that implements thefunctionality in §4.1 and is accessible through the JavaCard interface. Our implementation can be installed as aregular Android app and does not require root access tothe smartphone or a custom firmware.

A B

G&D MSCSCService OT-Ext

T

. . .

GMWService

MTService

f;x zA

jfj

Application

aA;bA;cA

GMWService

MTService

OT-Ext

zB

Application

f;y

jfj aB;bB;cB

. . .

MT JC Applet

. . .

Figure 6: Modular architecture design.

Secure computation is performed by having an Appli-cation running on each smartphone, which specifies thefunction f both parties want to compute securely. Fromthis function the Application generates a circuit descrip-tion, which it sends to the GMWService. The GMWSer-vice interprets the circuit and queries the MTServicefor the required number of multiplication triples | f |.The MTServices on both smartphones then communi-cate with each other and check whether one of the smart-phones holds a hardware token (A in Fig. 6). If so, bothMTServices perform the seed transfer protocol (cf. §4.2),

expand the obtained seeds (A loads the correspondingcA sequences obtained in the init phase), and merge theobtained multiplication triple sequences (cf. §4.3). Ifno hardware token is present, the MTServices gener-ate the multiplication triples by invoking OT extension.The MTService then provides the multiplication triples(ai,bi,ci) for i ∈ {A,B} to the GMWService. Finally,the Applications send their inputs x and y, respectively,to the GMWService, which performs the secure compu-tation and returns the output zi = f (x,y).

5.3 Benchmarking EnvironmentFor our mobile benchmarking environment we use twoSamsung Galaxy S3’s, which each have a 1.4 GHz ARM-Cortex-A9 Quad-Core CPU, 1 GB of RAM, 16 GB ofinternal flash memory, a mircoSD card slot, and run theAndroid operating system version 4.1.2 (Jelly Bean). Forthe communication between the smartphones, we useWi-Fi direct. For the evaluation, we put the smartphonesnext to each other on a table. The G&D mobile securitycard is connected to the mircoSD card slot of one of thephones. We use the short-term security setting recom-mended by NIST [NIS12], i.e., a symmetric key size of80 bits and a public key size of 1,024 bit with a 160 bitsubgroup. We instantiated the pseudo-random genera-tor G that is used for seed expansion (cf. §4.1) with AES-128 in CTR mode. The hardware token generates multi-plication triple sequences of size 2m for 11≤m≤ 24. Weused m = 11 as lower bound on the size, since 2,048 isthe biggest size we can transfer from T to A with a singlepacket, and m = 24 as upper bound, since it was appro-priate for our case studies in §6. Finally, we point out thatour implementation is single-threaded and utilizes onlyone of the four available cores of our smartphones. Weleave the extension to multiple threads as future work.

5.4 Performance EvaluationFirst, we want to quantify the runtime differences be-tween the mobile and the desktop environment. Wemeasure the execution time for AES-128 in ECB modefor an identical single-threaded Java implementation inboth domains. The smartphone version is running with5.5 MB/s while the desktop version achieves 61.1 MB/s.The optimized AES-256 implementation of Truecrypt3,written in C/C++ and assembly, achieves 143.1 MB/s onthe same desktop machine, running without paralleliza-tion. For comparison, the smartcard (cf. §5.1) is runningAES-128 at a maximum speed of 16.7 KB/s.

In the following we evaluate the performance of ourtoken-based scheme (cf. §4) on smartphones, using TLSor KAS1-basic as key agreement protocol, and compare

3http://www.truecrypt.org

USENIX Association 23rd USENIX Security Symposium 901

it to the OT extension based multiplication triple genera-tion. In our evaluations we only include the time for initand setup phase, since the online phase is identical forboth approaches. Results for the online phase are givenin §6. All values are averaged over 10 measurements.

Fig. 7 gives an overview over the timings for the gen-eration of 2m (11 ≤ m ≤ 24) multiplication triples usingeither OT extension in the setup phase or the hardwaretoken (§4.1) in the init phase. Additionally, the setupphase using TLS and KAS1-basic is depicted, which in-cludes the seed transfer and the seed expansion of B.We always assume the worst case number of seeds to betransferred, i.e., for 224 multiplication triples, we trans-fer 24−10 = 14 seeds (cf. §4.3). Both axes in Fig. 7 aregiven in a logarithmic scale.

0.1

1

10

100

1000

10000

211 212 213 214 215 216 217 218 219 220 221 222 223 224

Token MTGen (Init, §4.1)OT Extension (Setup, §2.3)

Token TLS (Setup, §4.2)Token KAS1 (Setup, §4.2)

Run

time

[s]

Number of Multiplication Triples

Figure 7: Performance evaluation of the multiplicationtriple generation and setup phase.

We observe that OT extension on mobile devices isable to generate 224 multiplication triples in 1,529 s,corresponding to 10,971 multiplication triples per sec-ond. We ran the same code on two desktop PCs witha 2.5 GHz Intel Core2Quad CPU and 4 GB RAM,connected via Gigabit LAN and were able to compute224 multiplication triples in 139 s, which indicates aperformance decrease of factor 11. While the perfor-mance decrease on mobile devices compared to desktopcomputers was significantly less than the factor of 1000observed in [HCE11], it is still insufficient for effi-ciently computing complex functions such as private set-intersection, which typically requires millions of OTs.

In comparison, the multiplication triple generation ofthe hardware token during the init phase is able to gener-ate 224 multiplication triples in 2,883 s, corresponding to5,819 multiplication triples per second. For the hardwaretoken-based protocol we observe that the times for send-ing the seeds using the TLS and KAS1 key agreementprotocols grow very slowly with the number of multipli-cation triples, since the amount of data to be encrypted

and sent grows only with log2 | f |. Additionally, the TLS-based key agreement protocol (4.6 s for 211 multiplica-tion triples) is around factor 3 slower than the KAS1-based key agreement (1.3 s for 211 multiplication triples).

The overall computation and communication work-load of OT extension is substantially larger than in ourtoken-based scheme, but its multiplication triple gener-ation rate is not much faster. This can be explained bythe faster processing power of the smartphones comparedto that of T and the higher bandwidth of Wi-Fi directcompared to the relatively slow communication chan-nel between A and T . However, OT extension suffersfrom high energy consumption, due to the CPU utiliza-tion incurred by the symmetric cryptographic operations,as well as the Wi-Fi direct communication [PFW11].

We use PowerTutor4 to measure the energy consump-tion of the smartphone’s CPU for generating 219 multi-plication triples and compare the interactive evaluation ofrandom OT extensions with our smartcard solution. Notethat Fig. 8 only displays the CPU’s energy consumptionwhereas the energy consumption of Wi-Fi and the smart-card is not included. However, we argue that the en-ergy consumption of the smartcard is not a critical factor,since these operations can be performed when the phoneis charging. The Wi-Fi connection, on the other hand, isrequired for OT during the setup phase, thus increasingthe already high battery drain even further. Moreover,the OT computations have to be done on both devicessimultaneously, draining both devices’ batteries. There-fore, our token-based solution is particularly well-suitedfor the mobile domain, where energy consumption andbattery lifetime are critical factors.

0 10 20 30 40 50 60 70 80 90

Ener

gy C

onsu

mpt

ion

[Ws]

OT Extension (§2.3)Smartcard (§4.1)

Time [s]

2.5

5.0

7.5

10.0

12.5

15.0

17.5

Figure 8: Accumulated smartphone CPU energy con-sumption during the generation of multiplication triples.

4http://powertutor.org

902 23rd USENIX Security Symposium USENIX Association

6 Applications

To evaluate the performance of our protocols, we usethe mobile phones and setting as specified in §5.3 andconsider the following privacy-preserving applications:availability scheduling (§6.1), location-aware schedul-ing (§6.2), and set intersection (§6.3). We implementedthe applications and depict the performance results for anaverage of 10 iterations. We use KAS1-basic [NIS09] askey authentication scheme. We pre-generated the circuitsusing the framework of [SZ13], wrote them into a file,and read them on the smartphone. The time for readingthe circuit file is included in the setup phase.

6.1 Availability Scheduling

Privacy-preserving availability scheduling is a com-mon example for secure computation on mobile de-vices [HCC+01, BJH+11] and enables A and B to find apossible time slot for a meeting without disclosing theirschedules to each other. To schedule a meeting, A and Bspecify a duration and time frame for the meeting. Eachparty i ∈ {A,B} then divides the time frame when themeeting can take place (e.g., a week) into n time slotstni = (ti,1, ..., ti,n) and denotes each time slot ti, j ∈ {0,1}

as either free (ti, j = 1) or occupied (ti, j = 0). The partiescompute their common availability tn

Avail by computingthe bitwise AND of their time slots, i.e., tn

Avail = tnA∧ tn

B.Overall, this circuit has n AND gates and depth 1. Notethat the bitwise AND circuit performs a general function-ality and can, for instance, be used for privacy-preservingset intersection where elements are taken from a smalldomain [HEK12] or location matching [CADT13]. Forour experiments, we set the time frames s.t. meetingscan be scheduled between 8 am and 10 pm for one daydivided into 15 minute slots (n = 56 slots), one week di-vided into 15 minute slots (n= 392 slots), and one monthdivided into 10 minute slots (n = 2,604 slots). We depictour results in the upper half of Tab. 2.

The multiplication triple generation in the init phasecan be performed in several hundred milliseconds, sinceit requires only one (for 56 and 392 time slots) or two(for 2,604 time slots) packet transfers between T and A.The setup phase, more detailed the seed transfer proto-col, is the main bottleneck in this application, as T hasto perform asymmetric and symmetric cryptographic op-erations. Finally, the online phase requires only millisec-onds but has a high variance, due to the communicationover Wi-Fi direct and the small number of communica-tion rounds that are performed in the online phase.

For comparison, we evaluated the same circuit usingthe mobile Yao implementation of [HCE11] on the samephones, which took factor 1.6 (for the day time frame) upto factor 12 (for the month time frame) longer, cf. Tab. 2.

Table 2: Performance for availability and location-awarescheduling. | f | is the size of the circuit and d( f ) itsdepth. All values measured on smartphones (cf. §5.3).

Time Frame Day Week MonthAvailability Scheduling §6.1| f | / d( f ) 56 / 1 392 / 1 2,604 / 1Init [s] 0.37 (±1.6%) 0.37 (±1.6%) 0.73 (±1.0%)Setup [s] 1.3 (±13%) 1.3 (±13%) 1.3 (±13%)Online [s] 0.002 (±150%) 0.003 (±167%) 0.007 (±129%)Ad-Hoc [s] 1.3 (±13%) 1.3 (±13%) 1.3 (±13%)

Mobile Yao [HCE11]Ad-Hoc [s] 2.14 (±7.1%) 3.82 (±4.7%) 15.9 (±2.7%)

Location-Aware Scheduling §6.2| f | / d( f ) 39,864 / 69 280,605 / 87 1,872,206 / 106Init [s] 6.9 (±0.3%) 48.5 (±0.2%) 319.6 (±0.5%)Setup [s] 1.4 (±7.1%) 1.8 (±7.0%) 4.8 (±4.8%)Online [s] 0.16 (±35%) 0.82 (±7.4%) 5.9 (±18%)Ad-Hoc [s] 1.5 (±8.4%) 2.6 (±6.5%) 10.7 (±11%)

6.2 Location-Aware Scheduling

In the following we show that our system can be adaptedto compute arbitrary and complex functions. We intro-duce the location-aware scheduling functionality whichextends the availability scheduling of §6.1, s.t. the dis-tance between the users is considered as well. Thelocation-aware scheduling functionality takes into ac-count the user’s location in a time slot, computes the dis-tance between the users, verifies if a meeting is feasible,and outputs the time slot in which the users have to travelthe least distance to meet each other. We argue that thisapproach is practical, since such position information areoften already included in the users’ schedules.

In the location-aware scheduling scheme, we assumethat the user i ∈ {A,B} also inputs the location of theprevious appointment Pi and the next appointment Ni andthe distances that he can reach from his previous appoint-ment pi and from his next appointment ni (cf. Fig. 9 foran example). Such pi and ni can be computed in plain-text using the distance between Pi and Ni, the free timeuntil the next appointment and the duration of the meet-ing. The minimal distance among all time slots where thereachable ranges for A and B overlap is selected as finalresult. If successful, the function outputs the identifiedtime slot and for each user whether he should leave fromthe location of the previous or next appointment. A de-tailed description of the functionality is given in the fullversion [DSZ14]. We evaluate the scheme on the samenumber of time slots used in §6.1 (day, week, month) anddepict the performance in the lower half of Tab. 2.

Compared to availability scheduling, the location-aware scheduling circuit is significantly bigger and re-quires more communication rounds. When performingthe scheduling for a month, the circuit consists of 1.8 mil-lion instead of 2,604 AND gates for availability schedul-

USENIX Association 23rd USENIX Security Symposium 903

PA

pA nA

NA

NB

nBpBPB

Figure 9: Location-aware scheduling for one time slot of A and B with previous locations PA and PB, reachabledistances from previous appointments pA and pB, next locations NA and NB and corresponding reachable distancesnA and nB. The meeting can be scheduled between NA and PB as the reachable ranges overlap.

ing. The time for the init phase increases linearly withthe number of AND gates and requires 319 s when per-forming scheduling for a month. The time for the setupphase is increased less, since the seed transfer grows onlylogarithmically in | f | and the seed expansion is done ef-ficiently. The online phase is also slowed down substan-tially (6 s for a month time frame), but is still practical.

6.3 Private Set Intersection

Private set intersection (PSI) is a widely studied problemin secure computation and can be used for example tofind common contacts in users’ address books [HCE11].It enables two parties, each holding a set SA and SB withelements represented as σ -bit strings to determine whichelements both have in common, i.e., SA ∩ SB, withoutdisclosing any other contents of their sets. While manyspecial-purpose protocols for PSI exist, e.g., [CT10,CT12, CADT13], generic protocols mostly build onthe work of [HEK12], where the Sort-Compare-Shuffle(SCS) circuit was outlined. The idea is to have both par-ties locally pre-sort their elements, privately merge them,check adjacent values for equality, and obliviously shuf-fle the resulting values to hide their order.

We implement the SCS-WN circuit of [HEK12] whichuses a Waksman permutation network to randomly shuf-fle the resulting elements. We perform the compar-ison for bit sizes σ ∈ {24,32,160} and compare thead-hoc runtime of our protocol to the implementationof [HCE11] for σ ∈ {24,32}. The results from [HEK12]are compared to ours for σ ∈ {32,160}. The resultsare given in Fig. 10 and in Tab. 3. Note that [HCE11]and [HEK12] implement Yao’s garbled circuits protocolusing pipelining, whereas we use the GMW protocol.

1

10

100

128 256 512 1024 2048 4096

Ad-

Hoc

Run

time

[s]

Number of Inputs

[HEK12] SCS-WN (Desktop)§4 Token Ad-Hoc (Phone)

Figure 10: Private set intersection runtime for σ = 32 bitelements using our token-based protocol on two smart-phones (§5.3) and [HEK12] on two desktop PCs.

For a fair comparison, we ran the code from [HCE11]on our Samsung Galaxy S3 smartphones and observed anapproximate speedup of factor 2 compared to the mea-surements from their paper, that were made on olderhardware (two Google Nexus One phones). Note thatour performance results, as well as the values for the im-plementation of [HCE11] are benchmarked on mobiledevices connected via Wi-Fi Direct, while [HEK12] isbenchmarked on two desktop PCs (two Core2Duo E84003GHz PCs connected via 100 Mbps LAN).

From Fig. 10 we observe that, due to the seed transferin our setup phase (cf. §4.2), the Yao’s garbled circuitsimplementation of [HEK12] is faster for up to 256 inputs.However, the seed transfer time amortizes for larger in-puts and our token-based scheme outperforms the imple-

904 23rd USENIX Security Symposium USENIX Association

Table 3: Ad-hoc runtime of private set intersection where each party inputs n values of σ bits, measured on identicalmobile phones (§5.3). [HEK12] results are on PCs and taken from the paper (— indicates that no numbers were given).

Number of Inputs n 32 64 128 256 512 1,024

σ = 24bit| f | 22,432 52,096 118,656 266,240 590,336 1,296,384Ours [s] 1.7 (±2.2%) 1.9 (±3.4%) 2.1 (±2.4%) 2.5 (±2.4%) 3.6 (±4.2%) 7.4 (±8.7%)[HCE11] [s] 30 68 161 410 1,052 3,010

σ = 32bit

| f | 30,368 70,528 160,640 360,448 799,232 1,755,136Ours [s] 1.7 (±2.7%) 1.9 (±3.5%) 2.3 (±7.7%) 3.0 (±18%) 4.4 (±9.8%) 8.5 (±20%)[HCE11] [s] 42 87 233 565 1,468 4,662[HEK12] [s] — — 1 2.2 4.95 10.5

σ = 160bit| f | 156,768 364,096 829,312 1,860,864 4,126,208 9,061,376Ours [s] 2.2 (±8.8%) 2.7 (±16%) 4.0 (±1.9%) 7.0 (±1.9%) 14.3 (±2.9%) 28.7 (±1.4%)[HEK12] [s] — — — — — 51.5

mentation of [HEK12], even though our implementationruns on substantially slower mobile phones while theirsis evaluated on two desktop PCs. From Tab. 3 we ob-serve that our scheme outperforms the Yao’s garbled cir-cuits implementation of [HCE11], evaluated on identicalmobile phones, by factor 18 for 32 inputs with σ = 24 bitand by up to factor 550 for 1,024 inputs with σ = 32 bit.

Finally, we compare the performance of our protocolto the PSI protocol of [CADT11,CADT13]. We use theirreported numbers for pre-computed PSI on 20 input val-ues and set the bit size σ = 160 in our protocol.5 Theprotocol of [CADT11, CADT13] needs 3.7 s, while ourad-hoc runtime is only 2.1 s (±4.8%). Note, however,that their approach has only a constant number of roundsand can be sped up using multiple cores.

7 Related Work

We classify related work into three categories: securefunction evaluation (§7.1), server-aided secure functionevaluation (§7.2), and token-based cryptography (§7.3).

7.1 Generic Secure Function EvaluationThe foundations for secure function evaluation (SFE)were laid by Yao [Yao86] and Goldreich et al. [GMW87]who demonstrated that every function that is efficientlyrepresentable as Boolean circuit can be computed se-curely in polynomial time with multiple parties.

SFE Compiler A first compiler for specific securetwo-party computation functionalities was presented in[MOR03]. The Fairplay framework [MNPS04] was thefirst efficient implementation of Yao’s garbled circuitsprotocol [Yao86] for generic secure two-party compu-tation and enabled a user to specify the function to becomputed in a high-level language. The FastGC frame-work [HEKM11] improved on the results of Fairplay by

5Note that [CADT11, CADT13] also support bigger bit sizes, sincethey operate on 1,024-bit ElGamal ciphertexts.

evaluating functions with millions of Boolean gates inmere minutes using optimizations such as the free XORtechnique [KS08] and pipelining. The FastGC frame-work has been used to implement various functions suchas privacy-preserving set intersection [HEK12], genomicsequencing, or AES [HEKM11], and was optimized withrespect to a low memory footprint in [HS13].

Next to Yao’s garbled circuits protocol, the GMW pro-tocol [GMW87] recently received increasing attention.The work of [CHK+12] efficiently implemented GMWin a setting with multiple parties. Subsequently, [SZ13]optimized GMW for the two-party setting and showedthat GMW has advantages over Yao’s garbled circuitsprotocol as it allows to pre-compute all symmetric cryp-tographic operations in a setup phase and that the work-load can be split evenly among both parties.

SFE on Mobile Devices A recent line of research aimsat making SFE available on mobile devices, such assmartphones. In [HCE11] the authors port the FastGCframework [HEKM11] to smartphones and observe asubstantial performance reduction when compared to thedesktop environment. They identify the slower process-ing speed and the high memory requirements as the mainbottlenecks. Similarly, [CMSA12] ported the Fairplayframework [MNPS04] to smartphones. A compiler withsmaller memory constraints than Fairplay was presentedin [MLB12]. We emphasize that previous works ongeneric SFE on mobile devices use Yao’s garbled circuitsprotocol, whereas our approach is based on GMW.

Several special-purpose protocols for mobile de-vices using homomorphic encryption were proposed in[BJH+11] (activity scheduling), [CDA11] (scheduling,interest sharing), and [CADT11,CADT13] (comparison,location-based tweets, common friends). In contrast togeneric solutions, such custom-tailored protocols can bemore efficient, but are restricted to specific functionali-ties. Their extension to new use cases is complex andusually requires new security proofs.

USENIX Association 23rd USENIX Security Symposium 905

7.2 Server-Aided SFE

One way to speed up generic secure computations on re-source constrained devices is to outsource expensive op-erations to one or more servers. In [HS12] a system forfair server-aided secure two-party computation using twoservers was introduced. SALUS [KMR12] is a systemfor fair SFE among multiple parties using a single server.A system that allows cloud-aided garbled circuits evalu-ation between one mobile device and a server was intro-duced in [CMTB13] and its efficiency was demonstratedon large-scale practical applications, such as a securepath finding algorithm. Both [CMTB13] and [KMR12]achieve security against malicious adversaries, but re-quire at least one party to be a machine with more com-puting power than a mobile phone as it evaluates multiplegarbled circuits. [Hua12] proposes that a trusted servergenerates multiplication triples that are sent to both par-ties over a secure channel, requiring O(| f |) bits commu-nication. Instead, we propose to replace the server witha trusted hardware token and show that the communica-tion to one party can be reduced to sub-linear complexity.Moreover, they achieve security against malicious adver-saries based on [NNOB12]; we sketch how to extend ourwork to malicious security in the full version [DSZ14].

We consider this line of research as orthogonal to ours,since it focuses on outsourcing secure computations to apowerful but untrusted cloud server. In contrast, we fo-cus on secure computation between two mobile deviceswhere computations are outsourced to a trusted, but re-source constrained smartcard locally held by one party.

7.3 Token-Based Cryptography

Another approach is to outsource computations to trustedhardware tokens, such as smartcards. These tokens aretypically resource-constrained, but have the advantage ofoffering a tamper-proof trusted execution environment.

Setup Assumptions for UC Hardware tokens can beused as setup assumption for Canetti’s universal compos-ability (UC) framework, as they allow to construct UCcommitments, with which in turn any secure computa-tion functionality can be realized, e.g., [Kat07, DNW09,DKMQ11]. These works are mainly feasibility resultsand have not been implemented yet.

SFE in Plaintext As discussed in [HL08], the trivialsolution to performing SFE using hardware tokens wouldbe to have each party send its inputs over a secure chan-nel to the token, which evaluates f and returns the output.A similar approach with multiple tokens, which addition-ally provides fault tolerance was given in [FFP+06].

When using the hardware token for plaintext evalua-tion, the performance of the time-critical online phase islimited by the performance of the token, which is typi-cally very low. Moreover, this requires the token to holdall input values in memory, which quickly exceeds itsvery limited resources.6 Alternatively, the token coulduse external secure memory to store inputs and inter-mediate values, e.g., [IS05, IS10], but this would requiresymmetric cryptographic operations in the online phase.Additionally, each new functionality would have to beimplemented on the token, whereas our scheme is imple-mented only once and supports arbitrary functionalities.

Specific Functionalities An efficient protocol for pri-vate set-intersection using smartcards was presentedin [HL08]. This protocol was extended to multiple un-trusted hardware tokens in [FPS+11]. An anonymouscredential protocol was presented in [BCGS09].

Outsourcing Oblivious Transfer There are severalworks that use hardware tokens to compute oblivioustransfer (OT): [GT08] implemented non-interactive OTusing an extension of a TPM, [Kol10] proposed OT se-cure in the malicious model using a stateless hardwaretoken, and [DSV10] provided non-interactive OT in themalicious model using two hardware tokens.

We outsource the setup phase of the GMW proto-col, which previously was done via OT, to the hard-ware token. Previous works on outsourcing n OTs re-quire the hardware token to evaluate O(n) symmetric (oreven asymmetric) cryptographic operations in the ad-hocphase. In comparison, our scheme requires T to evaluateO(n/t) symmetric cryptographic operations in the initphase and only O(log2 n) symmetric cryptographic oper-ations in the setup phase (cf. the full version [DSZ14]).

8 Conclusion and Future Work

In this work, we demonstrated that generic ad-hoc se-cure computation can be performed efficiently on mo-bile devices when aided by a trusted hardware token.We showed how to extend the GMW protocol by sucha token, similar to a TPM, to which most costly cryp-tographic operations can be outsourced. Our schemepre-computes most of the workload of GMW in aninitialization phase, which is performed independentlyof the later communication partner and without know-ing the function or its size in advance. This is par-ticularly desirable as the pre-computation can happenat any time, e.g., when the device is connected to a

6The smartcard we use in our experiments has 1,750 Bytes of RAM,which would be completely filled if each party provided 300 inputs of24 bits length in private set intersection (cf. §6.3).

906 23rd USENIX Security Symposium USENIX Association

power source, which happens regularly with modernsmartphones. The remaining interactive ad-hoc phaseis very efficient and can be executed in a few seconds,even for complex functionalities. We implemented sev-eral privacy-preserving applications that are typical formobile devices (availability scheduling, location-awarescheduling, and set-intersection) on off-the-shelf smart-phones using a general-purpose smartcard and showedthat their execution times are truly practical. We foundthat the performance of our scheme is two orders of mag-nitude faster than that of other generic secure two-partycomputation schemes on mobile devices and comparableto the performance of similar schemes in the semi-honestadversary model implemented on desktop PCs.

We see several interesting directions for future re-search. As our scheme is based on the GMW protocol,it can easily be extended to more than two parties, e.g.,for securely scheduling a meeting, cf. [CHK+12]. More-over, our scheme can be modified to also provide securityagainst malicious parties, cf. [Hua12] (we provide moredetails in the full version [DSZ14]). Another directionmight be equipping both mobile devices with a hardwaretoken to further improve efficiency and/or security.

Acknowledgements

We thank the anonymous reviewers of USENIX Security2014 for their helpful comments on our paper. We alsothank Giesecke & Devrient for providing us with mul-tiple smartcards and the authors of [HCE11] for shar-ing their code with us. This work was supported bythe German Federal Ministry of Education and Research(BMBF) within EC SPRIDE, by the Hessian LOEWEexcellence initiative within CASED, and by the Euro-pean Union Seventh Framework Program (FP7/2007-2013) under grant agreement n. 609611 (PRACTICE).

References

[ALSZ13] G. Asharov, Y. Lindell, T. Schneider,M. Zohner. More efficient oblivious trans-fer and extensions for faster secure com-putation. In Computer and Communica-tions Security (CCS’13), p. 535–548. ACM,2013.

[BCGS09] P. Bichsel, J. Camenisch, T. Groß,V. Shoup. Anonymous credentials ona standard Java card. In Computer andCommunications Security (CCS’09), p.600–610. ACM, 2009.

[Bea91] D. Beaver. Efficient multiparty protocolsusing circuit randomization. In Advances

in Cryptology – CRYPTO’91, volume 576of LNCS, p. 420–432. Springer, 1991.

[Bea95] D. Beaver. Precomputing oblivious transfer.In Advances in Cryptology – CRYPTO’95,volume 963 of LNCS, p. 97–109. Springer,1995.

[BJH+11] I. Bilogrevic, M. Jadliwala, J.-P. Hubaux,I. Aad, V. Niemi. Privacy-preserving activ-ity scheduling on mobile devices. In ACMData and Application Security and Privacy(CODASPY’11), p. 261–272. ACM, 2011.

[CADT11] H. Carter, C. Amrutkar, I Dacosta,P. Traynor. Efficient oblivious computationtechniques for privacy-preserving mobileapplications. Technical report, GeorgiaInstitute of Technology, 2011.

[CADT13] H. Carter, C. Amrutkar, I. Dacosta,P. Traynor. For your phone only: Customprotocols for efficient secure function eval-uation on mobile devices. Journal of Secu-rity and Communication Networks (SCN),2013.

[CDA11] E. D. Cristofaro, A. Durussel, I. Aad. Re-claiming privacy for smartphone applica-tions. In Pervasive Computing and Com-munications (PerCom’11), p. 84–92. IEEE,2011.

[CHK+12] S. G. Choi, K.-W. Hwang, J. Katz,T. Malkin, D. Rubenstein. Secure multi-party computation of Boolean circuits withapplications to privacy in on-line mar-ketplaces. In Cryptographers’ Track atthe RSA Conference (CT-RSA’12), volume7178 of LNCS, p. 416–432. Springer, 2012.

[CMSA12] G. Costantino, F. Martinelli, P. Santi,D. Amoruso. An implementation of securetwo-party computation for smartphoneswith application to privacy-preservinginterest-cast. In Privacy, Security and Trust(PST’12), p. 9–16. IEEE, 2012.

[CMTB13] H. Carter, B. Mood, P. Traynor, K. Butler.Secure outsourced garbled circuit evalua-tion for mobile phones. In USENIX Secu-rity’13, p. 289–304. USENIX, 2013.

[CT10] E. De Cristofaro, G. Tsudik. Practical pri-vate set intersection protocols with linearcomplexity. In Financial Cryptography andData Security (FC’10), volume 6052 ofLNCS, p. 143–159. Springer, 2010.

USENIX Association 23rd USENIX Security Symposium 907

[CT12] E. De Cristofaro, G. Tsudik. Experimentingwith fast private set intersection. In Trustand Trustworthy Computing (TRUST’12),volume 7344 of LNCS, p. 55–73. Springer,2012.

[DKMQ11] N. Dottling, D. Kraschewski, J. Muller-Quade. Unconditional and composable se-curity using a single stateful tamper-proofhardware token. In Theory of Cryptogra-phy Conference (TCC’11), volume 6597 ofLNCS, p. 164–181. Springer, 2011.

[DNW09] I. Damgard, J. B. Nielsen, D. Wichs.Universally composable multiparty com-putation with partially isolated parties.In Theory of Cryptography Conference(TCC’09), volume 5444 of LNCS, p. 315–331. Springer, 2009.

[DSV10] M. Dubovitskaya, A. Scafuro, I. Visconti.On efficient non-interactive oblivious trans-fer with tamper-proof hardware. Cryptol-ogy ePrint Archive, Report 2010/509, 2010.http://eprint.iacr.org/2010/509.

[DSZ14] D. Demmler, T. Schneider, M. Zohner. Ad-hoc secure two-party computation on mo-bile devices using hardware tokens. Cryp-tology ePrint Archive, Report 2014/467,2014. http://eprint.iacr.org/2014/467.

[FFP+06] M. Fort, F. C. Freiling, L. D. Penso, Z. Be-nenson, D. Kesdogan. TrustedPals: Se-cure multiparty computation implementedwith smart cards. In European Sympo-sium on Research in Computer Security(ESORICS’06), volume 4189 of LNCS, p.34–48. Springer, 2006.

[FPS+11] M. Fischlin, B. Pinkas, A.-R. Sadeghi,T. Schneider, I. Visconti. Secure set inter-section with untrusted hardware tokens. InCryptographers’ Track at the RSA Confer-ence (CT-RSA’11), volume 6558 of LNCS,p. 1–16. Springer, 2011.

[GMW87] O. Goldreich, S. Micali, A. Wigderson.How to play any mental game or a com-pleteness theorem for protocols with hon-est majority. In Symposium on Theory ofComputing (STOC’87), p. 218–229. ACM,1987.

[GT08] V. Gunupudi, S. R. Tate. Generalized non-interactive oblivious transfer using count-limited objects with applications to secure

mobile agents. In Financial Cryptographyand Data Security (FC’08), volume 5143 ofLNCS, p. 98–112. Springer, 2008.

[HCC+01] T. Herlea, J. Claessens, D. De Cock,B. Preneel, J. Vandewalle. Secure meet-ing scheduling with agenTa. In Communi-cations and Multimedia Security (CMS’01),volume 192 of IFIP Conference Proceed-ings, p. 327–338. Kluwer, 2001.

[HCE11] Y. Huang, P. Chapman, D. Evans. Privacy-preserving applications on smartphones.In Hot topics in security (HotSec’11).USENIX, 2011.

[HEK12] Y. Huang, D. Evans, J. Katz. Private set in-tersection: Are garbled circuits better thancustom protocols? In Network and Dis-tributed Security Symposium (NDSS’12).The Internet Society, 2012.

[HEKM11] Y. Huang, D. Evans, J. Katz, L. Malka.Faster secure two-party computation usinggarbled circuits. In USENIX Security’11, p.539–554. USENIX, 2011.

[HL08] C. Hazay, Y. Lindell. Constructions of trulypractical secure protocols using standardsmartcards. In Computer and Communica-tions Security (CCS’08), p. 491–500. ACM,2008.

[HS12] A. Herzberg, H. Shulman. Obliviousand fair server-aided two-party computa-tion. In Availability, Reliability and Secu-rity (ARES’12), p. 75–84. IEEE, 2012.

[HS13] W. Henecka, T. Schneider. Faster se-cure two-party computation with less mem-ory. In Symposium on Information, Com-puter and Communications Security (ASI-ACCS’13), p. 437–446. ACM, 2013.

[Hua12] Y. Huang. Practical Secure Two-PartyComputation. PhD dissertation, Universityof Virginia, 2012.

[IET08] IETF. The Transport Layer Security (TLS)Protocol Version 1.2. Technical report,Internet Engineering Task Force (IETF),2008.

[IKNP03] Y. Ishai, J. Kilian, K. Nissim, E. Pe-trank. Extending oblivious transfers ef-ficiently. In Advances in Cryptology –CRYPTO’03, volume 2729 of LNCS, p.145–161. Springer, 2003.

908 23rd USENIX Security Symposium USENIX Association

[IS05] A. Iliev, S. Smith. More efficient securefunction evaluation using tiny trusted thirdparties. Technical report, Dartmouth Col-lege, Computer Science, 2005.

[IS10] A. Iliev, S. W. Smith. Small, stupid,and scalable: secure computing withFaerieplay. In Workshop on ScalableTrusted Computing (STC’10), p. 41–52.ACM, 2010.

[JKSS10] K. Jarvinen, V. Kolesnikov, A.-R. Sadeghi,T. Schneider. Embedded SFE: Offloadingserver and network using hardware tokens.In Financial Cryptography and Data Se-curity (FC’10), volume 6052 of LNCS, p.207–221. Springer, 2010.

[Kat07] J. Katz. Universally composable multi-party computation using tamper-proof hard-ware. In Advances in Cryptology – EURO-CRYPT’07, volume 4515 of LNCS, p. 115–128. Springer, 2007.

[KMR12] S. Kamara, P. Mohassel, B. Riva. Salus:a system for server-aided secure functionevaluation. In Computer and Communica-tions Security (CCS’12), p. 797–808. ACM,2012.

[KMSB13] B. Kreuter, B. Mood, A. Shelat, K. Butler.PCF: a portable circuit format for scalabletwo-party secure computation. In USENIXSecurity’13, p. 321–336. USENIX, 2013.

[Kol10] V. Kolesnikov. Truly efficient string obliv-ious transfer using resettable tamper-prooftokens. In Theory of Cryptography Confer-ence (TCC’10), volume 5978 of LNCS, p.327–342. Springer, 2010.

[KS08] V. Kolesnikov, T. Schneider. Improvedgarbled circuit: Free XOR gates and ap-plications. In International Colloquiumon Automata, Languages and Program-ming (ICALP’08), volume 5126 of LNCS,p. 486–498. Springer, 2008.

[MLB12] B. Mood, L. Letaw, K. Butler. Memory-efficient garbled circuit generation for mo-bile devices. In Financial Cryptographyand Data Security (FC’12), volume 7397 ofLNCS, p. 254–268. Springer, 2012.

[MNPS04] D. Malkhi, N. Nisan, B. Pinkas, Y. Sella.Fairplay – a secure two-party computationsystem. In USENIX Security’04, p. 287–302. USENIX, 2004.

[MOR03] P. MacKenzie, A. Oprea, M. K. Reiter.Automatic generation of two-party com-putations. In Computer and Communica-tions Security (CCS’03), p. 210–219. ACM,2003.

[MvOV96] A. Menezes, P. C. van Oorschot, S. A. Van-stone. Handbook of Applied Cryptography.CRC Press, 1996.

[NIS09] NIST. NIST Special Publication 800-56b,Recommendation for Pair-Wise Key Estab-lishment Schemes Using Integer Factoriza-tion). Technical report, National Institute ofStandards and Technology (NIST), 2009.

[NIS12] NIST. NIST Special Publication 800-57, Recommendation for Key ManagementPart 1: General (Rev. 3). Technical report,National Institute of Standards and Tech-nology (NIST), 2012.

[NNOB12] J. B. Nielsen, P. S. Nordholt, C. Or-landi, S. S. Burra. A new approachto practical active-secure two-party com-putation. In Advances in Cryptology –CRYPTO’12, volume 7417 of LNCS, p.681–700. Springer, 2012.

[NP01] M. Naor, B. Pinkas. Efficient oblivioustransfer protocols. In ACM-SIAM Sympo-sium On Discrete Algorithms (SODA’01),p. 448–457. Society for Industrial and Ap-plied Mathematics, 2001.

[PFW11] G. P. Perrucci, F. H. P. Fitzek, J. Wid-mer. Survey on energy consumption enti-ties on the smartphone platform. In Vehicu-lar Technology Conference (VTC’11), p. 1–6. IEEE, 2011.

[Rab81] M. O. Rabin. How to exchange secrets withoblivious transfer, TR-81 edition, 1981.Aiken Computation Lab, Harvard Univer-sity.

[SZ13] T. Schneider, M. Zohner. GMW vs.Yao? Efficient secure two-party compu-tation with low depth circuits. In Fi-nancial Cryptography and Data Security(FC’13), volume 7859 of LNCS, p. 275–292. Springer, 2013.

[TCG13] TCG. TCG TPM specifications 2.0, 2013.Trusted Computing Group.

[Yao86] A. C. Yao. How to generate and exchangesecrets. In Foundations of Computer Sci-ence (FOCS’86), p. 162–167. IEEE, 1986.