analysis - university of pennsylvaniaryrogers/leverage_talk.pdf · at the end of may, the imagenet...
TRANSCRIPT
[Diagram: analysis $t$ is run on data $X$, giving $t(X)$; the next analysis is chosen as $t' \leftarrow t(X)$ and is shown as $t'(X')$.]
A lot of existing theory assumes tests are selected independently of the data.
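To make the gap concrete, here is a minimal simulation sketch, not taken from the talk. It assumes pure-noise data, so no feature is truly related to the label. The analyst picks the feature that looks best on the dataset (the adaptive step) and then scores it twice: once on the same data and once on a fresh sample. All function names and parameter values are hypothetical.

```python
import random


def correlation(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def sample(rng, n, d):
    """Draw d independent noise features (n values each) and an independent noise label."""
    features = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(d)]
    label = [rng.gauss(0, 1) for _ in range(n)]
    return features, label


def adaptive_vs_fresh(n=50, d=200, seed=0):
    """Return (|corr| on reused data, |corr| on fresh data) for the adaptively chosen feature."""
    rng = random.Random(seed)
    feats, label = sample(rng, n, d)
    # Adaptive step: the choice of analysis depends on the data itself.
    best = max(range(d), key=lambda j: abs(correlation(feats[j], label)))
    reused = abs(correlation(feats[best], label))
    # Ideal world: the chosen analysis is evaluated on an independent sample.
    fresh_feats, fresh_label = sample(rng, n, d)
    fresh = abs(correlation(fresh_feats[best], fresh_label))
    return reused, fresh
```

Averaged over several seeds, the correlation measured on the reused data comes out noticeably larger than the one measured on fresh data, even though every feature is independent of the label.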
Ideal World
How can we provide statistically valid answers to adaptively chosen analyses?
Real World
Introduction Model Results Key Ideas Proof Sketch
Adaptivity causes real problems
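[Diagram: $X \sim P^n$; analysis $t$ returns $t(X)$; the next analysis is chosen as $t' \leftarrow t(X)$ and returns $t'(X)$.]
[Diagram: $X \sim P^n$; analysis $t$ returns $t(X)$; the next analysis is chosen as $t' \leftarrow a$ and returns $t'(X)$.]
How can we provide statistically valid answers to adaptively chosen analyses?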
Answer: Limit the information learned about the dataset with each analysis [Dwork, Feldman, Hardt, Pitassi, Reingold, Roth’15].
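Differential privacy [Dwork, McSherry, Nissim, Smith’06]: an algorithm $A : D^n \to Y$ is $(\varepsilon, \delta)$-differentially private if for all neighboring $x, x' \in D^n$ and all $S \subseteq Y$,
$$P[A(x) \in S] \le e^{\varepsilon} \, P[A(x') \in S] + \delta.$$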
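As one concrete instance of the $(\varepsilon, \delta)$ guarantee, here is a minimal sketch of the Laplace mechanism applied to a mean query, with $\delta = 0$. It is an illustration under stated assumptions, not the construction used in the talk. It assumes values lie in $[0, 1]$ and that neighboring datasets differ in one entry, so the mean changes by at most $1/n$. The names `private_mean`, `laplace_noise` and `laplace_density` are hypothetical.

```python
import math
import random


def laplace_noise(rng, scale):
    """Laplace(0, scale) sample, drawn as the scaled difference of two Exp(1) variables."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_mean(data, epsilon, rng):
    """Release the mean of values in [0, 1] plus Laplace noise.

    Replacing one of the n entries moves the mean by at most 1/n, so noise of
    scale 1/(n * epsilon) gives (epsilon, 0)-differential privacy.
    """
    n = len(data)
    clipped = [min(1.0, max(0.0, v)) for v in data]  # enforce the [0, 1] assumption
    return sum(clipped) / n + laplace_noise(rng, 1.0 / (n * epsilon))


def laplace_density(y, center, scale):
    """Density of the released value y when the true mean is `center`."""
    return math.exp(-abs(y - center) / scale) / (2.0 * scale)
```

For neighboring datasets, the two output densities differ by a factor of at most $e^{\varepsilon}$ at every point $y$. That pointwise bound implies the inequality for every set $S$.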
[R, Roth, Smith, Thakkar’16].
[Gaboardi, Lim, R, Vadhan’16], [Kifer, R’16].
[R, Roth, Ullman, Vadhan’16].