DIY Hosting for Online Privacy Shoumik Palkar and Matei Zaharia Stanford University Appeared at HotNets 2017


Page 1: DIY Hosting for Online Privacy - Stanford Universitynetseminar.stanford.edu/seminars/01_18_18.pdf · Why is DIY More Secure*? 1. Narrow boundary between data and service vs. centralized

DIY Hosting for Online Privacy

Shoumik Palkar and Matei Zaharia

Stanford University

Appeared at HotNets 2017


Before: A Federated Internet

The Internet and its protocols were designed to be federated.
Organizations would host their own email, chat, and file transfer servers...
...and manage their own data!


Today: The Era of Centralized Services

Centralized services store data for organizations.
Organizations trade control of data for high availability at low cost.

Highly available centralized service (e.g., Gmail, Slack, Office 365)


Why Do We Use Centralized Services?

They provide high availability at low cost

+ Failover configuration
+ Geo-replication
+ Auto-scaling
+ etc.

Strawman: Hosting your own tiny EC2 VM costs $4.50/month.
High availability costs even more.


What does this mean?

A New Hope: Serverless Computing

Serverless computing: The availability of a top-tier cloud provider, but zero cost when idle

[Figure: monthly cost ($, 0 to 6) versus monthly requests (0 to 3,000,000) for Lambda and EC2. Annotation near the low-request end: "Most users are here."]

Functions that run only when request is made, billed at 100 ms granularity
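The pay-per-request billing model can be sketched as a small calculation. This is a rough estimate under stated assumptions, not the pricing model used in the talk: the two per-unit prices below are assumed placeholder values (roughly the public AWS list prices around that time), any free tier is ignored, and the function names are hypothetical. Only the $4.50/month VM cost and the 100 ms billing granularity come from the slides.

```python
import math

# Assumed prices (NOT from the slides); treat them as placeholders.
PRICE_PER_MILLION_REQUESTS = 0.20   # USD per 1M invocations, assumed
PRICE_PER_GB_SECOND = 0.00001667    # USD per GB-second of compute, assumed
EC2_MONTHLY_COST = 4.50             # USD, the strawman VM cost from the slides


def lambda_monthly_cost(requests, duration_ms, memory_mb):
    """Estimate a month's bill for a function, ignoring any free tier."""
    # Each invocation is rounded up to the next 100 ms billing unit.
    billed_ms = math.ceil(duration_ms / 100) * 100
    gb_seconds = requests * (billed_ms / 1000) * (memory_mb / 1024)
    request_cost = (requests / 1_000_000) * PRICE_PER_MILLION_REQUESTS
    return request_cost + gb_seconds * PRICE_PER_GB_SECOND


def break_even_requests(duration_ms, memory_mb):
    """Monthly request count at which the function costs about as much as the VM."""
    per_request = lambda_monthly_cost(1, duration_ms, memory_mb)
    return math.floor(EC2_MONTHLY_COST / per_request)


if __name__ == "__main__":
    for monthly_requests in (0, 60_000, 1_500_000, 3_000_000):
        cost = lambda_monthly_cost(monthly_requests, duration_ms=500, memory_mb=128)
        print(f"{monthly_requests:>9} requests/month -> ${cost:.2f}")
    print("break-even:", break_even_requests(500, 128), "requests/month")
```

The key property is that cost scales with requests and is exactly zero when idle, whereas the VM costs the same at any load.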


Deploy It Yourself: Taking Back the Internet

Users run personal web applications using serverless computing platforms.

High availability, low cost, and privacy for the first time.


Deploy It Yourself (DIY) Architecture

[Diagram: several clients; a serverless platform with a load balancer and functions f() (e.g., Email); a key service holding a key; a storage service holding encrypted user data.]


1. Register Serverless Function


2. Configure a cloud storage provider


3. Register Key with a Key Service
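The three setup steps can be illustrated with an in-memory sketch. Everything here is hypothetical: `KeyService`, `StorageService`, and `email_function` are stand-ins invented for illustration, not real cloud APIs, and the cipher is a toy (SHA-256 in counter mode) used only to keep the example self-contained. A real deployment would presumably use an authenticated cipher such as AES-GCM and the provider's actual key management service.

```python
import hashlib
import os


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher for illustration only. Do not use for real data."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class KeyService:
    """Hypothetical stand-in for a trusted key service."""

    def __init__(self):
        self._keys = {}

    def register_key(self, key_id: str) -> None:
        self._keys[key_id] = os.urandom(32)

    def get_key(self, key_id: str) -> bytes:
        return self._keys[key_id]


class StorageService:
    """Hypothetical stand-in for untrusted storage; it only ever holds ciphertext."""

    def __init__(self):
        self.blobs = {}

    def put(self, name: str, blob: bytes) -> None:
        self.blobs[name] = blob

    def get(self, name: str) -> bytes:
        return self.blobs[name]


def email_function(request: dict, keys: KeyService, storage: StorageService) -> dict:
    """Hypothetical function f(): runs once per request and keeps no state."""
    key = keys.get_key("user-key")
    if request["action"] == "store":
        nonce = os.urandom(16)
        ciphertext = keystream_xor(key, nonce, request["body"].encode())
        storage.put(request["id"], nonce + ciphertext)
        return {"status": "stored"}
    blob = storage.get(request["id"])
    nonce, ciphertext = blob[:16], blob[16:]
    return {"status": "ok", "body": keystream_xor(key, nonce, ciphertext).decode()}


if __name__ == "__main__":
    storage = StorageService()          # step 2: configure storage
    keys = KeyService()
    keys.register_key("user-key")       # step 3: register a key
    # Step 1 (registering the function) is modeled by simply calling it.
    email_function({"action": "store", "id": "msg-1", "body": "hello"}, keys, storage)
    print(email_function({"action": "fetch", "id": "msg-1"}, keys, storage))
```

In this sketch the plaintext exists only inside the function while it handles a request; the storage stand-in never sees it.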



Why is DIY More Secure*?

1. Narrow boundary between data and service
   vs. centralized service: many internal systems can access user data.

2. Stored data is encrypted to prevent leaks
   vs. centralized service: employees access data to monetize it.

3. Cloud providers minimize data access internally
   vs. centralized service: EULAs state data can be used for ad targeting, etc.

4. Ability to migrate data off insecure clouds and regions
   vs. centralized service: generally, no control over where data lives.

*Assumes the function code, isolation mechanisms, and key service are trusted.


Threat Model

Trusted
• Serverless computing platform isolation: function containers must hide execution and function state.*
• Key management service: protecting access to users' keys.**
• Function code: must not leak data or have critical bugs.

Untrusted
• Internal network
• Storage service and other cloud services
• Internet traffic between user and cloud provider

*Could one day be attested and secured using hardware enclaves?
**Management services are already secured via enclaves today and have strict EULAs.


DIY Architecture

[Diagram: clients; a serverless platform with a load balancer and functions f() (e.g., Email); a key service holding a key; a storage service holding encrypted user data. Labels: "Trusted Components" and "Simple enough to be secured via hardware enclaves".]


What DIY Protects Against

• Snooping employees ☺
• Data mining and sale ☺
• Buggy or insecure software 😐
• Government surveillance ☹


Rest of this Talk

1. Back-of-the-Envelope Costs
2. Chat Prototype and Challenges
3. A Marketplace for DIY


Back-of-the-Envelope Costs

Application     Daily Requests   Compute/Request   Memory   Persistent Storage   Monthly Cost
Group Chat      2000             500 ms            128 MB   2 GB                 $0.14
Email           500              500 ms            128 MB   5 GB                 $0.21
File Transfer   100              2000 ms           1 GB     2 GB                 $0.14
IoT Control     100              500 ms            128 MB   1 GB                 $0.12
Video Chat*     1                15 min call       1.7 GB   1 GB                 $0.84

Comparison: un-replicated EC2 t2.nano server (500 MB, CPU burst only) = $4.50/month
*On a billed-per-second VM.


Chat Prototype and Challenges

[Diagram: a client talks over HTTPS to an HTTPS endpoint; functions f() sit behind the endpoint alongside encrypted storage and SQS.]

Challenge 1: Asynchronous communication (reading messages without keeping Lambda running)

SQS used to allow client polling without running Lambda function continuously.

Challenge 2: Latency with Pay-Per-Request Storage

Append small objects to S3.
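The polling design can be sketched with in-memory stand-ins. This is a minimal illustration, not the prototype's code: `ChatBackend`, its methods, the per-user inbox layout, and the one-object-per-message key scheme are all assumptions made for the example. The dict plays the role of pay-per-request storage such as S3, and the deques play the role of SQS queues.

```python
from collections import defaultdict, deque


class ChatBackend:
    """Hypothetical in-memory model of the chat prototype's data flow."""

    def __init__(self):
        self.object_store = {}               # stand-in for pay-per-request storage
        self.inboxes = defaultdict(deque)    # stand-in for one queue per user
        self.invocations = 0                 # how often the function body ran

    def handle_send(self, sender, recipients, text):
        """Function body: runs once per sent message, then exits."""
        self.invocations += 1
        message = {"from": sender, "text": text}
        # Write one small object per message (assumed layout).
        key = f"messages/{len(self.object_store):08d}"
        self.object_store[key] = message
        # Fan the message out to each recipient's queue.
        for user in recipients:
            self.inboxes[user].append(message)
        return key

    def poll(self, user):
        """Client-side poll: drains the user's queue without running the function."""
        drained = list(self.inboxes[user])
        self.inboxes[user].clear()
        return drained


if __name__ == "__main__":
    backend = ChatBackend()
    backend.handle_send("alice", ["bob", "carol"], "hi")
    print(backend.poll("bob"))       # bob receives the message
    print(backend.poll("bob"))       # nothing new on the second poll
    print(backend.invocations)       # the function ran only for the send
```

The point of the design is that reads never invoke the function, so idle clients polling for messages add no compute time.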


200 ms response time (most time spent reading from the SQS queue and posting to S3).

25,000 messages/month at no cost, including SQS and Lambda compute, plus an additional $0.09/month for storage.


Bringing DIY Applications to Everyone

Cloud provider manages:
• Installation
• Permissions/Signing
• Updates
• etc.

Available on the DIY App Store

For Users
Privacy with automatic low cost and availability

For Developers
Faster innovation: no need to manage a full multitenant scalable service


Conclusion

DIY could revolutionize how we run web applications by offering privacy, high availability, and low cost for the first time.

https://www.shoumik.xyz

@sppalkia sppalkia [email protected]


Related Work

• E2E encrypted apps (e.g., Signal, WhatsApp)
  • Don't support server-side computation

• P2P social networks (e.g., Diaspora)
  • Could be hosted on top of serverless platforms?

• No-trust cryptographic protocols (e.g., Dissent, Pung)
  • Stronger security guarantees, but harder to deploy