1
SureMail: Notification Overlay for Email Reliability
Sharad Agarwal & Venkat Padmanabhan, Microsoft Research
Dilip Antony Joseph, UC Berkeley
HotNets 2005
2
Silent Email Loss
Silent email loss: email “vanishes” without sender/recipient knowledge
  missed opportunities, misunderstanding, or worse
Nontrivial problem
  anecdotal evidence
  measurement studies
  −0.69% loss rate [LM 04]
  −0.1-5% loss rate [AB 05]
  commercial offerings to address the problem
  −e.g., Pivotal Veracity, Zenprise
3
HotNets air ticket confirmation
“We have sent it through again. If you do not receive it with in an hour or two, please let us know.”
Funding proposal
“No I never got and I never acked it… My last mail from you was [on] 3/10/2004.”
IMC 2005 decision notification
“I recd reviews for one paper (#X) but not that of #Y.”
IMAP server upgrade problems
“Some unanticipated migration problems occurred that may have caused some lost or delayed email.”
4
Silent Email Loss
Silent email loss: email “vanishes” without sender/recipient knowledge
  missed opportunities, misunderstanding, or worse
Nontrivial problem
  anecdotal evidence
  measurement studies
  −0.69% loss rate [Lang & Moors 2004]
  −0.1-5% loss rate [Afergan & Beverly 2005]
  commercial offerings to address the problem
  −e.g., Pivotal Veracity, Zenprise
5
Silent Email Loss
Why email loss?
  spam filtering: big problem, aggressive filtering
  −MS: 90% of emails discarded before hitting user mailboxes
  −AOL: 100 emails per month to maintain IP white-listing
  server failures and upgrades
  −SMTP is not end-to-end reliable
(Non-)Delivery status notifications
  compound the spam problem
  raise privacy concerns
So email loss is often silent
6
Fixing the Problem
Improve the email delivery infrastructure
  more reliable servers
  −e.g., cluster-based (Porcupine [Saito ’00])
  server-less systems
  −e.g., DHT-based (POST [Mislove ’03])
  total switchover might be risky
“Smarter” spam filtering
  moving target
  mistakes inevitable
  non-content-based filtering still needed to cope with spam load
7
SureMail
SureMail addresses the problem from the outside
  add separate notification overlay
  email delivery infrastructure left undisturbed
  users can benefit without operator cooperation
Design goals:
  minimize demands on infrastructure and users
  preserve asynchronous operation and privacy (no worse than it is today)
  maintain defenses against spam and viruses
  minimize overhead
8
Basic Operation
[Diagram: Sender S, Recipient R, Notification server, and R's Missing Items Folder. Labels: notification [S,H(M)], GetNotifications, Request lost message.]
9
Notification Overlay
Decentralized
  limited collusion among the constituent nodes
Efficient notification server lookup
  e.g., R → H(R) in a DHT setup
Agnostic to actual implementation
  end-host-based (e.g., always-on user desktops)
  infrastructure-based (e.g., “NX servers”)
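
The lookup R → H(R) can be illustrated with a small sketch. The slides do not name a hash function or a DHT, so SHA-256, the address normalization, and the Chord-style "first node at or after the key" rule are all assumptions made for illustration; `notification_key` and `responsible_node` are hypothetical names.

```python
import hashlib

def notification_key(recipient_address: str) -> int:
    """Map a recipient address R to a DHT key H(R).

    SHA-256 and lowercasing the address are assumptions, not part of the slides.
    """
    normalized = recipient_address.strip().lower()
    digest = hashlib.sha256(normalized.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

def responsible_node(key: int, node_ids: list[int]) -> int:
    """Return the first node id at or after the key on the ring, wrapping around.

    This successor rule is one common DHT convention, assumed here.
    """
    ring = sorted(node_ids)
    for node_id in ring:
        if node_id >= key:
            return node_id
    return ring[0]  # wrap around to the smallest id

if __name__ == "__main__":
    nodes = [notification_key(f"node-{i}.example.net") for i in range(8)]
    key = notification_key("r@example.org")
    print(hex(responsible_node(key, nodes)))
```

Any sender who knows R's address can compute the same key, which is what lets S and R meet at the same notification node without prior coordination.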
10
Challenges
Privacy
  information about users’ email habits could be leaked
Notification spam
  spammers can spoof notifications and burden users
  annoyance attacks discredit notifications in general
Even the notification infrastructure isn’t trusted
No universal PKI for email users
11
SureMail Goals
Protect the recipient’s identity
  attacker shouldn’t be able to retrieve R’s notifications or learn the volume of notifications intended for R
Protect the sender’s identity
  attacker shouldn’t be able to learn S’s identity or monitor the volume of notifications posted by S
Block notification spam
  attacker shouldn’t be able to spoof notifications
12
Assumptions
No email eavesdroppers
  privacy is moot otherwise
Limited collusion among notification nodes
  needed only to avoid leaking notification volume info
13
Key Mechanisms
#1: Email-based handshake
#2: Decoupled registration and notification
#3: Email-based shared secret
#4: Reply-based shared secret
14
#1: Email-based handshake
Goal: prevent hijacking of R’s identity
Only R can receive emails sent to R
One-time operation for initial registration
  Send email to R to establish registration secret shared with the notification overlay
  R can then use registration secret to authenticate itself to the notification overlay
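
A minimal sketch of the handshake follows. The slides only say that an email establishes a registration secret and that R later uses it to authenticate; the random 32-byte secret, the HMAC challenge-response, and the names `RegistrationNode`, `register`, `verify`, and `prove` are assumptions introduced for illustration. Email delivery is simulated by a callback.

```python
import hashlib
import hmac
import secrets

class RegistrationNode:
    """Hypothetical registration node; email sending is injected as a callback."""

    def __init__(self, send_email):
        self._send_email = send_email
        self._secrets = {}  # recipient address -> registration secret

    def register(self, recipient_address: str) -> None:
        # Only the true owner of the address can read this email,
        # so only R learns the secret.
        secret = secrets.token_bytes(32)
        self._secrets[recipient_address] = secret
        self._send_email(recipient_address, secret)

    def verify(self, recipient_address: str, challenge: bytes, proof: bytes) -> bool:
        secret = self._secrets.get(recipient_address)
        if secret is None:
            return False
        expected = hmac.new(secret, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(proof, expected)

def prove(secret: bytes, challenge: bytes) -> bytes:
    """R's side: answer a challenge using the secret received by email."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()

if __name__ == "__main__":
    mailbox = {}
    node = RegistrationNode(lambda addr, s: mailbox.__setitem__(addr, s))
    node.register("r@example.org")
    challenge = secrets.token_bytes(16)
    print(node.verify("r@example.org", challenge,
                      prove(mailbox["r@example.org"], challenge)))
```

An attacker who cannot read R's mailbox never sees the secret, so it cannot produce a valid proof for any challenge.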
15
#2: Decoupled registration & notification
Goal: prevent snooping on recipient identity
Limited collusion among notification nodes
Registration at Dreg=H(H(R))
Notification posted at Dnot=H(R)
R contacts Dnot to retrieve notifications for H(R)
Dnot can find Dreg without knowing R
Neither Dnot nor Dreg can associate notifications with R, unless they collude
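
The two node keys can be computed as below. This is a sketch that assumes SHA-256 for H and a UTF-8 encoding of the address; the function names are hypothetical. It shows why Dnot, which only ever sees H(R), can still locate Dreg: hashing H(R) once more gives H(H(R)).

```python
import hashlib

def H(data: bytes) -> bytes:
    """Hash function H; SHA-256 is an assumption."""
    return hashlib.sha256(data).digest()

def notification_node_key(recipient: str) -> bytes:
    """Dnot = H(R): where notifications for R are posted and retrieved."""
    return H(recipient.encode("utf-8"))

def registration_node_key(recipient: str) -> bytes:
    """Dreg = H(H(R)): where R registers."""
    return H(notification_node_key(recipient))

def registration_key_from_notification_key(h_r: bytes) -> bytes:
    """What Dnot computes: it holds H(R), not R, yet can still find Dreg."""
    return H(h_r)

if __name__ == "__main__":
    r = "r@example.org"
    d_not = notification_node_key(r)
    print(registration_key_from_notification_key(d_not) == registration_node_key(r))
```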
16
#3: Email-based shared secret
Goal: prevent snooping on sender identity
Email Mold from S to R is known only to S and R
H(Mold) could serve as implicit identifier of S to R
But it doesn’t quite serve as authenticator for S:
  Dnot knows H(Mold), so it could spoof notifications from S
  even other attackers could do so by first sending Mspam purporting to be from S
17
#4: Reply-based shared secret
Goal: block spoofing of notifications
Users rarely have conversations with spammers
R remembers (hashes of) recent emails from S that it has replied to
If S receives a reply to Mold it had sent R, Mold can serve as a shared secret between S and R
  S could use H1(Mold) as an implicit identifier…
  … and H2(Mold) as an authenticator
Hard for a spammer (even Dnot) to spoof
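
The sketch below shows one way R might remember replied-to messages and derive the two values. The slides do not say how H1 and H2 differ; prefixing SHA-256 with distinct labels is an assumption, as are the names `h1`, `h2`, and `ReplyMemory`.

```python
import hashlib

def h1(message: bytes) -> bytes:
    """Implicit identifier H1(Mold); the label prefix is an assumption."""
    return hashlib.sha256(b"suremail-id:" + message).digest()

def h2(message: bytes) -> bytes:
    """Authenticator H2(Mold); derived separately so H1 does not reveal it."""
    return hashlib.sha256(b"suremail-auth:" + message).digest()

class ReplyMemory:
    """Hypothetical store kept by R for messages it has replied to."""

    def __init__(self):
        self._entries = {}  # H1(Mold) -> (sender, H2(Mold))

    def remember_reply(self, sender: str, replied_message: bytes) -> None:
        # Called only when R replies, so messages from senders R never
        # answered (for example spammers) leave no entry.
        self._entries[h1(replied_message)] = (sender, h2(replied_message))

    def lookup(self, identifier: bytes):
        """Return (sender, H2(Mold)) for a known identifier, else None."""
        return self._entries.get(identifier)

if __name__ == "__main__":
    memory = ReplyMemory()
    memory.remember_reply("s@example.org", b"Subject: draft\n\nSee attached.")
    print(memory.lookup(h1(b"Subject: draft\n\nSee attached.")) is not None)
```

Because Dnot sees only H1(Mold), and H2(Mold) cannot be computed from H1(Mold) without the message itself, Dnot cannot produce the authenticator.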
18
Putting it all together
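[Diagram: Sender S, Recipient R, registration node Dreg=H(H(R)), notification node Dnot=H(R), and R's Missing Items Folder. Arrows: Register, Verify, GetNotifications, Request lost message. The posted notification is labeled with H1(M), H1(Mold), and H2(Mold).]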
19
Other issues
Reply-detection:
  “in-reply-to” header may not always help
  indirect checks based on text similarity
Reducing overhead:
  post notifications only for “important” emails
  hold off on posting notification in the hope of receiving an implicit ACK (reply) or NACK (bounce-back)
First-time “legitimate” senders:
  they are indistinguishable from spammers
Mobility:
  reply-based shared secret enables secure migration without state transfer
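
The "hold off on posting" idea can be sketched as a small pending queue on the sender side. The hold period, the class name `DeferredPoster`, and its methods are assumptions for illustration; the slides do not specify how long to wait or how ACKs are detected.

```python
class DeferredPoster:
    """Hypothetical sender-side queue that delays posting notifications."""

    def __init__(self, hold_seconds: float):
        self.hold_seconds = hold_seconds
        self._pending = {}  # message hash -> time after which to post

    def sent(self, message_hash: bytes, now: float) -> None:
        """Record an outgoing email; its notification is not posted yet."""
        self._pending[message_hash] = now + self.hold_seconds

    def acked_or_bounced(self, message_hash: bytes) -> None:
        """A reply (implicit ACK) or bounce (NACK) makes the notification unnecessary."""
        self._pending.pop(message_hash, None)

    def due(self, now: float) -> list:
        """Return and remove hashes whose hold period has expired."""
        ready = [h for h, deadline in self._pending.items() if deadline <= now]
        for h in ready:
            del self._pending[h]
        return ready

if __name__ == "__main__":
    poster = DeferredPoster(hold_seconds=3600)
    poster.sent(b"hash-a", now=0)
    poster.sent(b"hash-b", now=0)
    poster.acked_or_bounced(b"hash-a")  # R replied, nothing to post
    print(poster.due(now=3600))
```

The trade-off is latency: a longer hold period saves more notifications but delays loss detection by the same amount.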
20
Status
Ongoing measurement experiment
Design being refined
Implementation in the works
21
Discussion
#1: Should the notification system be folded into the email infrastructure?
Separation is advantageous:
  provides failure independence
  keeps the notification layer simple
  −small, fixed format notifications don’t require the same kind of processing as virus-laden email
  provides engineering convenience
22
Discussion
#2: Is there a social benefit to silent email loss because of the plausible deniability it provides?
Any such benefit is far outweighed by the costs
  Should cars be slightly unreliable because of the excuse it would give people when they miss an engagement?
It is the asynchronous nature of email that is key
23
24
Anecdotes
Funding proposal
“No I never got and I never acked it… My last mail from you was [on] 3/10/2004.”
Response to self-managing networks summit invitation
“Yesterday's email did not bounce back, wonder where it is!”
IMC 2005 decision notification
“I recd reviews for one paper (#X) but not that of #Y.”
IMAP server upgrade problems
“Some unanticipated migration problems occurred that may have caused some lost or delayed email.”
25
Basic Operation
Senders post notifications to overlay, in addition to sending emails as usual
Intended recipients periodically download notifications intended for them
A notification without a matching email suggests possible email loss
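
The matching step on the recipient side amounts to a set difference between notified hashes and hashes of emails actually received. The sketch below assumes SHA-256 over the raw message bytes; `message_hash` and `find_missing` are hypothetical names.

```python
import hashlib

def message_hash(message: bytes) -> bytes:
    """H(M); SHA-256 over the raw message is an assumption."""
    return hashlib.sha256(message).digest()

def find_missing(notified_hashes, inbox_messages):
    """Return the notified hashes that have no matching email in the inbox."""
    received = {message_hash(m) for m in inbox_messages}
    return [h for h in notified_hashes if h not in received]

if __name__ == "__main__":
    inbox = [b"mail one", b"mail two"]
    notified = [message_hash(b"mail one"), message_hash(b"mail three")]
    missing = find_missing(notified, inbox)
    print(len(missing))  # one notification has no matching email
```

In practice the hash would need to cover only the parts of a message that survive transit unchanged, since relays may add or rewrite headers; the slides do not address this detail.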
26
Putting it all together
Registration:
  R contacts Dreg=H(H(R)) to register
  Dreg sends R an email to set up registration secret
Posting notifications:
  upon sending email M to R, S posts notification N to Dnot=H(R)
  N = [Encrypt(H2(Mold), H1(M)), H1(Mold)]
27
Putting it all together
Retrieving notifications:
  R asks Dnot for the notifications corresponding to H(R) and presents evidence of registration secret
  Dnot contacts Dreg to verify evidence, before returning the notifications to R
  R uses H1(Mold) to identify Mold and compute the encryption key H2(Mold)
  R discards bogus notifications and checks for missing emails corresponding to the remaining notifications
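
The posting and retrieval steps above can be sketched end to end. The slides specify only N = [Encrypt(H2(Mold), H1(M)), H1(Mold)] and do not name a cipher, so everything cryptographic here is an assumption: `seal` is a toy stand-in (XOR with a SHA-256-derived pad plus an HMAC tag) used only so the example runs with the standard library, and it should not be treated as a secure design. The integrity tag is our addition; it is what lets R recognize a bogus notification rather than decrypting it to a meaningless hash.

```python
import hashlib
import hmac
import os

def h1(message: bytes) -> bytes:
    return hashlib.sha256(b"id:" + message).digest()   # identifier hash (assumed)

def h2(message: bytes) -> bytes:
    return hashlib.sha256(b"key:" + message).digest()  # key hash (assumed)

def _pad(key: bytes, nonce: bytes) -> bytes:
    return hashlib.sha256(key + b"stream" + nonce).digest()

def seal(key: bytes, plaintext: bytes) -> bytes:
    """Toy Encrypt(key, plaintext) for 32-byte inputs. NOT a vetted cipher."""
    if len(plaintext) != 32:
        raise ValueError("this sketch only seals 32-byte hashes")
    nonce = os.urandom(16)
    body = bytes(p ^ s for p, s in zip(plaintext, _pad(key, nonce)))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def open_sealed(key: bytes, blob: bytes):
    """Return the plaintext, or None if the blob was not made with this key."""
    if len(blob) != 80:
        return None
    nonce, body, tag = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(c ^ s for c, s in zip(body, _pad(key, nonce)))

def post_notification(new_message: bytes, old_message: bytes):
    """S's side: N = [Encrypt(H2(Mold), H1(M)), H1(Mold)]."""
    return (seal(h2(old_message), h1(new_message)), h1(old_message))

def check_notifications(notifications, replied_to, inbox_hashes):
    """R's side. replied_to maps H1(Mold) -> (sender, H2(Mold))."""
    missing = []
    for sealed, identifier in notifications:
        entry = replied_to.get(identifier)
        if entry is None:
            continue  # unknown Mold: ignore the notification
        sender, key = entry
        new_hash = open_sealed(key, sealed)
        if new_hash is None:
            continue  # bogus: not produced with H2(Mold)
        if new_hash not in inbox_hashes:
            missing.append((sender, new_hash))
    return missing
```

For example, if R has replied to `old` and S later sends `new`, then `check_notifications([post_notification(new, old)], {h1(old): ("s@example.org", h2(old))}, set())` reports one missing message from `s@example.org`, while a blob forged without H2(Mold) is dropped.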
28
#1: Protecting the recipient’s identity
Goal:
  only R should be able to retrieve notifications intended for it
  attackers shouldn’t be able to learn even the volume of notifications intended for R
Key idea:
  Email-based handshake:
  −prevents “hijacking” of R’s identity
  Decoupling registration from notification:
  −prevents bad DHT node from associating notifications with R
29
#1: Protecting the recipient’s identity
Registration:
  R contacts Dreg = H(H(R)) to register
  Dreg sends R an email to set up a shared secret
Posting notifications:
  Upon sending email M to R, S posts notification N = [H(M),S] to Dnot = H(R)
Retrieving notifications:
  R presents an authenticator to Dnot and asks for the notifications corresponding to H(R)
  Dnot contacts Dreg to verify the authenticator, before returning the notifications
  R checks if emails are missing and presents the corresponding S to the user
30
SureMail Goals
#1: protecting the recipient’s identity
#2: protecting the sender’s identity
#3: blocking notification spam
31
#2: protecting the sender’s identity
Goal:
  attackers shouldn’t be able to learn S’s identity or monitor the volume of notifications posted by S
  clearly N = [H(M),S] won’t do
Key idea: email-based shared secret
  assuming no eavesdroppers, an email Mold from S to R is known only to S and R
  so H(Mold) could serve as an authenticator and identifier of S to R
32
#2: protecting the sender’s identity
Posting notifications:
  S’s identity is made implicit in the notification N = [H(M), H(Mold)]
Retrieving notifications:
  R stores hashes of emails received (recently) from various senders
  it searches for H(Mold) to identify S
  if H(Mold) can’t be found, the notification is ignored
33
SureMail Goals
#1: protecting the recipient’s identity
#2: protecting the sender’s identity
#3: blocking notification spam
34
#3: blocking notification spam
Goal: prevent spammers from posting bogus notifications and burdening users with false reports of email loss
A malicious DHT node could itself be a spammer
35
#3: blocking notification spam
Current scheme is vulnerable to attack
Easy for malicious DHT node Dnot to generate spam:
  Dnot has ready access to H(Mold)
  it could spam R with bogus notifications purporting to be from S
Another spammer (say X) could also do so:
  X sends R a message Mspam purporting to be from S
  X can then use H(Mspam) as the implicit identifier in notifications
  R may assume it is really missing an email from S
Spammers gain indirectly by annoying R and S, thereby discrediting notifications in general
36
#3: blocking notification spam
Key idea: reply-based shared secret
  users rarely engage in conversations with spammers
  so if S receives a reply to a message Mold that it had sent R, S could use H(Mold) as an implicit identifier
  hard for a spammer to spoof the identifier
  special construction to prevent spoofing by Dnot
Notification format:
  N = [Encrypt(H”(Mold), H(M)), H(Mold)]
  R uses H(Mold) to identify Mold and compute the encryption key H”(Mold)
37
PKI-based Design
R↔DA: RegisterRecipient(R,A)
S↔DN: PostNotification(H(R),N)
  N = [H(M), TTL, E(Rpb,S), Sg(Spv,H(M),TTL)]
R↔DN: CheckNotification(H(R),A)
DN: find DA = H(H(R))
DN↔DA: AuthenticateRequest(H(R),A)
DN↔R: ReturnNotifications()
R: identify notifications corresponding to missing email and notify user if S is trusted
38
Notation
Sender S
Recipient R
Message M
Notification N
Crypto operations:
  H: hash(…)
  Sg: sign(key,…)
  E: encrypt(key,…)
39
Overhead
  duplication of effort with respect to email delivery
#3: preserve privacy of email content
  −attacker shouldn’t be able to learn email content
40
Silent Email Loss
Silent email loss is a non-negligible problem
  silent loss = no notification to sender or recipient
  imposes significant cost, degrades user experience
Several causes
  spam filtering
  −MS IT: 90% of emails discarded off the bat
  failures and upgrades
Measurement studies
  0.5-1.0% silent loss ([Lang ’04], [Afergan ’05])
  ongoing measurement study at MSR
  −quantify email delays & loss across ~25 domains
41
SureMail Overview
SureMail adds separate notification system
  orthogonal to email delivery infrastructure
  email is still subject to checking by spam filters, virus scanners
  doesn’t create backdoor for malware
  bounds worst case performance
Asynchronous operation
  senders post notifications = hash(message)
  recipients check for them
  preserves the privacy of email (unlike read receipts)
42
SureMail Design
Reply-based shared secret
  A sends an email M to B (A→B)
  B replies to A’s email (B→A)
  A uses hash(M) to “prove” to B that it is legitimate
  shared secret is continually refreshed
Reply-based shared secret helps:
  avoid burdening users with notification spam
  maintain privacy of notifications