Reconciling Mobile App Privacy and Usability on Smartphones: Could User Privacy Profiles Help?

Bin Liu, Jialiu Lin, Norman Sadeh
School of Computer Science, Carnegie Mellon University

Upload: bin-liu

Post on 15-Jul-2015



Explosion of Smartphone Privacy Settings

•  Users are expected to configure/review an unrealistically large number of privacy / permission settings! [Lin 2012]


Approaches for Privacy Settings

iOS Privacy Settings

•  Give users fine-grained controls, but the number of decisions is overwhelming.

Approaches for Privacy Settings

Android Permissions

•  Grant permissions upfront during installation.
•  Will not actively show them again.
•  Grant/Deny on a per-app basis: insufficient control.

Neither of them is working well

•  iOS Privacy Settings:
–  Users are overwhelmed with options!

•  Android Permissions:
–  Ineffective [Felt 2012], bad timing [Kelley et al. 2012].
–  Lack of sufficiently fine-grained control.
–  App Ops was introduced in Android 4.3:
•  Post-installation fine-grained control

App Ops

But then it has the same problem as iOS Privacy Settings: way more settings than users can handle!

 

How can we simplify this process?

•  Users care about app privacy [Lin 2012, Kelley 2012]
–  but they are overwhelmed with options.

•  Can default settings take care of everything?
–  Ideally, people would feel and behave in similar ways.
–  However, people’s app privacy preferences are diverse [Agarwal 2013, Lin 2012].

Research Question

Can we build a manageable framework that captures users’ diverse preferences and reduces them to a small number of profiles?

I want to choose the profile that:

•  Protects my location information…
•  Keeps my phone call history away from social apps…
•  Whatever, just give those apps the permissions.

Dataset Description

•  We were given access to a unique corpus of data: users’ actual permission settings from LBE Privacy Guard.
–  This app runs on rooted Android phones.
–  Available on Google Play and several other app stores.

•  It relies on API interception technology to give users control over 12 permissions that an app can request
–  e.g., location, phone ID, call monitoring, SMS, etc.

A sample snapshot of LBE Privacy Guard on a MIUI 2 phone

Users can choose “Allow”, “Deny”, or “Ask” (to be dynamically prompted), or leave a setting as “Default” (which is managed by LBE).

Dataset Description

•  Permission settings of 4.8 million LBE users
–  Collected over a 10-day period (May 1 to May 10, 2013).

•  Users’ settings are mostly stable after 10 days; the majority of users are done changing them.

–  Format: [user, app, permission, decision]
•  Decision: Allow, Deny, Ask, Default.
•  We can tell whether each decision was made or reviewed by the user or came from the default settings. (The analysis excludes decisions in which users were not involved.)

•  Preprocessing
–  Users: >=20 apps, >=1 non-default and >=1 non-allow settings.
–  Apps: >=10 users, >=1 permission request, available on Google Play during the same time period.
–  Permission request from an app: >5 users’ decisions.

•  After the screening process, the corpus analyzed in this study includes:
–  239,402 representative users
–  12,119 representative apps
–  28,630,179 decision records

–  On average, each user has 22.66 apps; each app requests 3.03 of the 12 observed permissions.

Diversity of users’ app permission settings

App-permission pairs with >80% agreement among users: only 63.9%.

One-size-fits-all does not apply!

Distribution of users’ decisions (“Allow”, “Deny” and “Ask”) for each app-permission pair
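The agreement statistic can be illustrated with a minimal sketch. It assumes records in the [user, app, permission, decision] format of the dataset, and it assumes that "agreement" for a pair means the share of users who chose that pair's most common decision; the slides do not spell out the definition. The function name `agreement_share` and the toy records are hypothetical.

```python
from collections import Counter, defaultdict

def agreement_share(records, threshold=0.8):
    """Fraction of app-permission pairs whose most common decision is
    shared by more than `threshold` of the users who decided on the pair."""
    by_pair = defaultdict(Counter)
    for user, app, permission, decision in records:
        by_pair[(app, permission)][decision] += 1
    if not by_pair:
        return 0.0
    agreed = 0
    for counts in by_pair.values():
        top = counts.most_common(1)[0][1]      # size of the majority
        if top / sum(counts.values()) > threshold:
            agreed += 1
    return agreed / len(by_pair)

# Toy data (hypothetical): one unanimous pair, one evenly split pair.
records = [
    ("u1", "maps", "Positioning", "Allow"),
    ("u2", "maps", "Positioning", "Allow"),
    ("u3", "maps", "Positioning", "Allow"),
    ("u1", "game", "Contact", "Deny"),
    ("u2", "game", "Contact", "Allow"),
]
share = agreement_share(records)
```

On the toy data only one of the two pairs clears the 80% threshold, so `share` is 0.5.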

Could we predict their settings?

•  Specifically, can we learn a function F:

–  Assumption: in this study, we restrict the set of decisions to “Allow” or “Deny” (the majority decisions).

–  We train classifiers on a training dataset and evaluate them by using the decisions for 20% of the apps a user has installed to predict the decisions for the other 80%.

•  This is equivalent to assuming that a user has already installed 4 or 5 apps and made the corresponding app-permission decisions.
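The 20%/80% evaluation setup can be sketched per user as follows. This is only an illustration: the slides do not say how the 20% of apps were chosen, so the random split, the function name `split_user_apps`, and the `seed` parameter are all assumptions.

```python
import random

def split_user_apps(apps, known_fraction=0.2, seed=0):
    """Split one user's apps into a 'known' part (decisions visible to the
    classifier) and a 'held-out' part (decisions to be predicted)."""
    apps = sorted(apps)                  # fixed order before shuffling
    rng = random.Random(seed)
    rng.shuffle(apps)
    n_known = max(1, round(len(apps) * known_fraction))
    return apps[:n_known], apps[n_known:]

# Hypothetical user with 20 installed apps.
apps = [f"app{i}" for i in range(20)]
known, held_out = split_user_apps(apps)
```

With about 22 apps per user on average, a 20% share corresponds to the 4 or 5 already-installed apps mentioned on the slide.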

High Dimensionality & Sparsity Challenge

•  Users’ data is sparse.
–  On average, each user installs only 22.66 of the 12,119 apps.

•  Two approaches were considered
–  Approach 1: Aggregation
•  Instead of dealing with the decisions app by app, study users’ preferences at an aggregate level.
–  Approach 2: SVD
•  Similar to techniques used in recommender systems.

Details of the prediction settings

•  Apply an SVM classifier with a linear kernel (LibLinear)
–  Efficient for large-scale input (14.5 million rows)
–  Convenient for incorporating additional features

•  Aggregation:
–  Collect each user’s general preferences on each of the 12 permissions across all installed apps.

•  SVD
–  Reduce the dimensionality of the matrix (#users × #app-permission pairs) to 100. (The app-permission pairs are generated from the permission requests of the 1,000 most popular apps.)
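The aggregation feature can be sketched as one number per permission for each user. The encoding below (Allow = +1, Deny = -1, averaged over the user's apps, 0.0 when the user made no decision on a permission) is an assumption; the slides do not state the exact formula. `aggregate_preferences` and the toy records are hypothetical.

```python
PERMISSIONS = [
    "Send SMS", "Phone Call", "SMS DB", "Contact", "Call Log", "Positioning",
    "Phone ID", "3G Network", "Wi-Fi Network", "ROOT", "Phone State",
    "Call Monitoring",
]
SCORE = {"Allow": 1.0, "Deny": -1.0}     # assumed encoding

def aggregate_preferences(user_records):
    """Return a 12-element vector: the mean score per permission over all of
    one user's apps, or 0.0 where the user made no Allow/Deny decision."""
    totals = {p: 0.0 for p in PERMISSIONS}
    counts = {p: 0 for p in PERMISSIONS}
    for app, permission, decision in user_records:
        if decision in SCORE and permission in totals:
            totals[permission] += SCORE[decision]
            counts[permission] += 1
    return [totals[p] / counts[p] if counts[p] else 0.0 for p in PERMISSIONS]

# Hypothetical decisions of a single user: (app, permission, decision).
user_records = [
    ("maps", "Positioning", "Allow"),
    ("game", "Positioning", "Deny"),
    ("chat", "Positioning", "Deny"),
    ("chat", "Contact", "Allow"),
]
vector = aggregate_preferences(user_records)
```

A vector like this is dense and has a fixed length of 12 regardless of which apps a user installed, which is what makes it a workable answer to the sparsity problem.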

Comparing Different Classifiers

FS-8 (“Aggregation on permission”) boosted the performance.

Interactive Process

•  Fully predicting users’ settings is hard.
•  A more realistic way to predict users’ settings: predict a setting when we are confident, otherwise ask the user.

–  Estimate a confidence for each decision query from the training of the classifier.

–  Select a small subset of questions about which the classifier is relatively uncertain and ask the users. (Not random samples.)
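The "predict when confident, otherwise ask" step can be sketched as below. How the confidence is estimated is not detailed in the slides, so the sketch simply takes a confidence value per query as input; `choose_questions`, the `budget` parameter, and the toy predictions are hypothetical.

```python
def choose_questions(predictions, budget):
    """predictions: list of (query, predicted_decision, confidence in [0, 1]).
    The `budget` least-confident queries are sent to the user; the rest keep
    the classifier's prediction. Returns (to_ask, to_predict)."""
    ranked = sorted(predictions, key=lambda p: p[2])   # least confident first
    to_ask = [query for query, _, _ in ranked[:budget]]
    to_predict = {query: decision for query, decision, _ in ranked[budget:]}
    return to_ask, to_predict

# Hypothetical classifier output for four app-permission queries.
predictions = [
    (("maps", "Positioning"), "Allow", 0.95),
    (("game", "Contact"), "Deny", 0.55),
    (("chat", "Call Log"), "Deny", 0.80),
    (("game", "Phone ID"), "Allow", 0.60),
]
to_ask, to_predict = choose_questions(predictions, budget=2)
```

Here the two game-related queries have the lowest confidence and are put to the user, while the other two are settled automatically.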

Prediction accuracy improves as users enter more decisions in the interactive process (from 87.8% to 92% with additional input of 10% of users’ settings).

•  Tested in a purely analytical setting:
–  We take users’ final settings in the time period as ground truth.
–  Further user experiments will be conducted.

Simplifying Privacy Decisions Using Privacy Profiles

•  Intuition: Though users' preferences are diverse, there are strong correlations that enable us to identify a small set of privacy profiles

•  Question: Is it possible to develop easy-to-understand privacy profiles that capture users’ different preferences?

–  Each profile effectively corresponds to a group of like-minded users.

–  We can match individual users with profiles by showing descriptions or asking a few questions.


Generating Privacy Profiles

•  Clustering like-minded users
•  We represent each user as a vector of their aggregated preferences on each of the 12 permissions.
–  According to our previous results, features that aggregate on permissions boosted the performance of the decision prediction.

•  Then we apply the k-means algorithm to these characteristic vectors, with Euclidean distance, to identify the clusters.
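The clustering step can be illustrated with a minimal k-means (Lloyd's algorithm) sketch using Euclidean distance. The slides do not specify the initialization or the stopping rule, so the random initial centroids, the `iterations` cap, and the `seed` are assumptions, and the 3-dimensional toy vectors stand in for the real 12-dimensional ones.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    """Coordinate-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def k_means(vectors, k, iterations=100, seed=0):
    """Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break                        # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:                  # keep old centroid if a cluster empties
                centroids[c] = mean_vector(members)
    return centroids, labels

# Toy aggregated-preference vectors (hypothetical): two protective users
# near -1 and two permissive users near +1.
vectors = [
    [-1.0, -0.8, -0.9],
    [-0.9, -1.0, -0.7],
    [0.9, 1.0, 0.8],
    [1.0, 0.7, 0.9],
]
centroids, labels = k_means(vectors, k=2)
```

On the toy data the two protective users end up in one cluster and the two permissive users in the other; each centroid then plays the role of a profile's aggregated preference.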

How many privacy profiles do we need?

•  The effectiveness depends on the clustering method and on users’ actual experiences.

•  Metrics to consider for a good value of K
–  Prediction accuracy
•  If we replace users’ identities with profile membership and re-run the classification task, how accurately can we predict?

–  Interpretability & understandability
•  Compact descriptions could be presented to users, who would then identify which profile is the best match.

–  Stability of privacy profiles
•  If only part of a user’s decisions are observable, will the user still be matched with the same privacy profile?

Aggregated Preferences on Permissions for each profile (K=5)

[Figure: aggregated preference of each of the five profiles on each permission, on a scale from protective to permissive.]

Comparing Aggregated Preferences for Different K

[Figure: four panels (K = 2, K = 3, K = 4, K = 5) showing the aggregated preference of each cluster (C1 to CK) on the 12 permissions: Send SMS, Phone Call, SMS DB, Contact, Call Log, Positioning, Phone ID, 3G Network, Wi-Fi Network, ROOT, Phone State, Call Monitoring. Color scale from -1.0 to 1.0. Annotations: “Fine Differences” and “Coarse Differences”.]

Comparing Variations of Users’ Settings

[Figure: four panels (K = 1, K = 3, K = 4, K = 5) showing the variation of users’ settings on each of the 12 permissions within each cluster, on a scale from 0.0 to 1.0. Panel averages: K = 1, avg = 0.511; K = 3, avg = 0.251; K = 4, avg = 0.231; K = 5, avg = 0.216.]

The variance of users’ settings on each permission is significantly reduced by using profiles.

Within each profile, users’ overall preferences are clearer and their decisions are more similar to each other.
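The effect of profiles on variance can be sketched as follows. How the "avg" values on the slide were computed is not stated; this sketch assumes an unweighted mean, over clusters and permissions, of the population variance of users' aggregated preferences. The function names and toy data are hypothetical.

```python
from statistics import pvariance

def per_permission_variance(vectors):
    """Population variance of each coordinate (one coordinate per permission)."""
    return [pvariance(column) for column in zip(*vectors)]

def average_within_cluster_variance(vectors, labels):
    """Unweighted mean, over clusters and permissions, of the
    within-cluster variance (an assumed definition of 'avg')."""
    values = []
    for cluster in sorted(set(labels)):
        members = [v for v, l in zip(vectors, labels) if l == cluster]
        values.extend(per_permission_variance(members))
    return sum(values) / len(values)

# Toy data (hypothetical): two users at -1 and two at +1 on two permissions.
vectors = [[-1.0, -1.0], [-1.0, -1.0], [1.0, 1.0], [1.0, 1.0]]
single_profile = average_within_cluster_variance(vectors, [0, 0, 0, 0])
two_profiles = average_within_cluster_variance(vectors, [0, 0, 1, 1])
```

With a single profile (K = 1) the toy average is 1.0; splitting the users into two profiles brings it to 0.0, the same direction of change the slide reports for real data.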

Assigning users into profiles

•  Description-based approach: discriminative features of each profile
–  What decisions do users in this profile usually make that are relatively unique?

These features can also provide a basis for asking users a few questions to determine in which cluster their preferences fall.

[Figure: aggregated preferences of six clusters (C1 to C6) on the 12 permissions: Send SMS, Phone Call, SMS DB, Contact, Call Log, Positioning, Phone ID, 3G Network, Wi-Fi Network, ROOT, Phone State, Call Monitoring. Color scale from -1.0 to 1.0.]
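Matching a user to a profile from a few answers can be sketched as a nearest-centroid lookup restricted to the permissions that were asked about. This is one plausible reading, not the procedure from the slides: the function `match_profile`, the answer encoding (scores in [-1, 1]), and the three toy profiles are all assumptions.

```python
def match_profile(answers, centroids):
    """answers: {permission_index: score in [-1, 1]} for the questions asked.
    centroids: one aggregated-preference vector per profile.
    Returns the index of the profile closest on the answered permissions."""
    def distance(centroid):
        return sum((centroid[i] - score) ** 2 for i, score in answers.items())
    return min(range(len(centroids)), key=lambda c: distance(centroids[c]))

# Hypothetical profiles over three permissions.
centroids = [
    [-0.9, -0.8, -0.7],    # protective on everything
    [0.9, 0.8, 0.9],       # permissive on everything
    [0.8, -0.9, 0.7],      # protective only on the second permission
]
# The user denies the second permission and allows the third.
profile = match_profile({1: -1.0, 2: 1.0}, centroids)
```

Two answers are enough here to pick the third profile (index 2), which matches the idea that a few well-chosen questions on discriminative permissions can place a user in a cluster.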

Concluding Remarks

•  Mobile apps can access a wide range of sensitive data and functionality

•  Mobile app users care about privacy but in different ways: one-size-fits-all settings would not work

•  However, our research shows that a small set of privacy profiles and a limited number of interactions with users can go a long way in accurately capturing people’s privacy preferences

–  Privacy profiles & simple dialogues can go a long way in reconciling mobile app privacy and usability

Concluding Remarks

•  We are refining our prediction with input from users and tuning features and models.

•  Showing more deeply analyzed information, such as the purpose for which an app requests a permission, can also help users make decisions (paper submitted for publication).

•  Human subject experiments will be conducted to evaluate how users respond to these interfaces.

I got it. I should choose this profile.

[Figure: aggregated preferences of six clusters (C1 to C6) on the 12 permissions, color scale from -1.0 to 1.0.]

Thanks! Q&A
