paroma varma, bryan he, payalbajaj, nishith khandwala, …bryanhe/posters/inferring_nips... ·...
TRANSCRIPT
Inferring Generative Model Structure with Static Analysis Paroma Varma, Bryan He, Payal Bajaj, Nishith Khandwala, Imon Banerjee, Daniel Rubin, Christopher Ré
MotivationsSummary
• Generative Models to Label Training DataUse generative models to combine sources of weak supervision to assign noisy labels
• Complex Dependencies among SourcesSources of labels are rarely independent
• Inferring Model StructureUse static analysis to infer dependencies and encode in generative model structure
Heuristic Structure• Domain Specific Primitives (DSPs)
Interpretable characteristics of raw data
• Heuristic Functions (HFs)Programmatic rules that output noisy labels
MotivationsStatic Analysis• Shared Input Sharing primitives as inputs
leads to explicit dependencies
• Compositions: Primitives composed of others can lead to implicit dependencies
Statistical Modeling• HF Dependency Represents the
dependencies found using static analysis
• DSP Similarity Represents the learned correlations among the DSPs
MotivationsTheoretical Results
• Learning DependenciesLearning k-degree dependencies among 𝑛heuristics requires 𝑂 𝑛#$%log𝑛 samples
• Inferring DependenciesAnalyzing the code can infer the dependencies among heuristics without data
Given dependencies, learning heuristic accuracies requires 𝑂 𝑛 log𝑛 samples
MotivationsExperimental Results
p1𝜙%+,
𝜙-+,
𝜙.+,
Y 𝜙/01
λ1
λ2
λ3
p2
p3
𝜙-233
𝜙%233
𝜙.233
P.w
P.h
P.yλ1
λ2
λ3
DSP HF
Shared Input, Explicit Dep.
Composition, Implicit Dep.
𝜙4+,
𝜙5233
𝜙/01
YTrue Label
Accuracy
HFDependency
DSP Similarity
def λ_1(P.y):if P.y[‘person’]>= P.y[‘bike’]:
return Trueelse:
return False
def λ_2(P.h):if P.h[‘person’] >= P.h[‘bike’]:
return Trueelse:
return False
def λ_3(P.area):if P.area[‘person’]
<= 2*P.area[‘bike’]:return True
else:return False
• Inferring dependencies outperforms learning dependencies
• Outperforms fully supervised model with additional noisy training labels
* reports F1 scores, rest in accuracy (%)