Generative Adversarial Networks for Simulation
@wondermicky | [email protected] | mickypaganini.github.io
Michela Paganini PhD Candidate, Yale University
Affiliate, Lawrence Berkeley National Lab
arXiv:1701.05927 (Comput Softw Big Sci (2017) 1: 4)
arXiv:1705.02355 (accepted in PRL)
arXiv:1711.08813 (in ACAT 2017 Proceedings)
with Luke de Oliveira (Vai, LBNL) and Ben Nachman (LBNL)
Generative Modeling
Build a generative model with probability distribution
Generative Modeling
dataset noise
generator
How do I train the generator to output samples that look like they could have been drawn from the original data distribution?
Generative Adversarial Networks
Turn generative modeling into a two-player game
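The two-player game is commonly written as the minimax objective from the original GAN formulation. The symbols below ($V$, $p_{\text{data}}$, $p_z$) are standard notation assumed here, not taken from these slides: the discriminator D tries to maximize the value function, the generator G tries to minimize it.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
```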
Finding an Equilibrium
• If we allow D and G to be drawn from the space of all continuous functions, then:
• There exists a unique Nash equilibrium (no “player” is incentivized to deviate)
• G exactly recovers the data distribution
• D(I) = 1/2 for all inputs I
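The D = 1/2 statement can be checked numerically. For a fixed generator, the standard result is that the best discriminator is D*(x) = p_data(x) / (p_data(x) + p_gen(x)). The sketch below evaluates it for toy 1D Gaussians; the densities and grid are hypothetical choices for illustration only.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def optimal_discriminator(x, p_data, p_gen):
    """D*(x) = p_data(x) / (p_data(x) + p_gen(x)) for a fixed generator."""
    real, fake = p_data(x), p_gen(x)
    return real / (real + fake)

# Toy (hypothetical) densities: data is N(0, 1); the generator is either
# shifted to N(1, 1) or matches the data exactly.
p_data = lambda x: gaussian_pdf(x, 0.0, 1.0)
p_bad_gen = lambda x: gaussian_pdf(x, 1.0, 1.0)
p_perfect_gen = p_data

grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
d_bad = [optimal_discriminator(x, p_data, p_bad_gen) for x in grid]
d_perfect = [optimal_discriminator(x, p_data, p_perfect_gen) for x in grid]
```

With the mismatched generator, D* leans toward "real" on the left of the grid and toward "fake" on the right; once the generator matches the data, D* is 1/2 everywhere and the discriminator can no longer tell the two apart.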
Solutions to Instability
• Ad hoc / experimental solutions
• DCGAN, BatchNorm, label flipping, soft labels, Minibatch discrimination, etc.
• Theory-driven / analytical solutions
• WGAN, WGAN-GP, BEGAN, etc.
For more info: see my talk at IML meeting and Ian Goodfellow’s seminar at CERN
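Two of the ad hoc tricks listed above, soft labels and label flipping, only change the targets the discriminator is trained on. A minimal sketch, assuming one-sided smoothing of the "real" label and a small flip probability; the function name and default values are illustrative, not taken from the talk.

```python
import random

def make_discriminator_targets(n_real, n_fake, smooth=0.1, flip_prob=0.05, seed=0):
    """Build noisy targets for one discriminator batch.

    Real samples get a target of 1 - smooth (soft label), fake samples get 0,
    and each target is flipped to the other class with probability flip_prob.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    real_target, fake_target = 1.0 - smooth, 0.0
    targets = [real_target] * n_real + [fake_target] * n_fake
    noisy = []
    for t in targets:
        if rng.random() < flip_prob:
            # Swap the label: real becomes fake and vice versa.
            t = fake_target if t == real_target else real_target
        noisy.append(t)
    return noisy

targets = make_discriminator_targets(n_real=4, n_fake=4)
```

Both tricks keep the discriminator from becoming overconfident, which is one of the usual sources of vanishing generator gradients.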
Generative Modeling for Science
Why study generative models? “After all, when applied to images, such models seem to merely provide more images, and the world has no shortage of images.”
in science, we do!
ATLAS grid consumption
LSST-DESC cosmological simulation run on 16384 of Titan's GPU-enhanced nodes
LAGAN arXiv:1701.05927
LAGAN
Learn a generative model to reproduce Pythia8 QCD vs boosted W from W' → WZ jet images
Unique characteristics of jet images:
• Large dynamic range (pixel intensity = energy)
• Sparse (~10%) occupancy of images
• Changing location or intensity of 1 pixel activation → different jet properties
• Important features have high Lipschitz constant
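The first two characteristics listed for jet images, sparse occupancy and large dynamic range, are easy to quantify per image. A minimal sketch on a made-up 5x5 image; the helper names and pixel values are hypothetical and use arbitrary energy units.

```python
def occupancy(image):
    """Fraction of pixels with non-zero intensity."""
    pixels = [p for row in image for p in row]
    return sum(1 for p in pixels if p > 0) / len(pixels)

def dynamic_range(image):
    """Ratio of the largest to the smallest non-zero pixel intensity."""
    active = [p for row in image for p in row if p > 0]
    return max(active) / min(active)

# Hypothetical 5x5 "jet image": pixel intensity stands in for energy.
toy_jet = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 120.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.02, 0.0],
]

toy_occupancy = occupancy(toy_jet)    # 3 active pixels out of 25
toy_range = dynamic_range(toy_jet)    # brightest pixel over faintest active pixel
```

Here only 3 of 25 pixels are active, yet the active pixels span several orders of magnitude, which is the combination that makes these images harder to model than natural images.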
LAGAN Overview
• Designed a GAN architecture suited to location-specificity of jet images
• For an overview, see my talk at the IML Machine Learning Workshop
• Main takeaway: ML / GANs & Physics are not at odds!
GAN-generated signal - background | Real signal - background
n-subjettiness
jet mass
CaloGAN
Open Dataset of Calorimeter Images
• 3-layer, heterogeneous segmentation and resolution (designed to approximate ATLAS LAr calorimeter)
• 3 types of particle: e+, π+, γ
• Variable position and angle of incidence (5 cm in x and y; 0°, 5° and 20° in theta and phi)
• Open, available, re-usable, citable
GEANT4 Simulation
[Figure: Geant4, Pb absorber, lAr gap, 10 GeV e−, side view. Both panels plot η direction [mm] against depth from calorimeter center [mm], each from −200 to 200. One color scale shows local energy deposit [MeV] (0 to 30), the other cell energy [MeV] (0 to 1000).]
Panel captions: "We simulate exact (x, y, z)" and "We can read out this" (side view)
Open Dataset of Calorimeter Images
• Challenges:
- sparsity
- dynamic range
- location specificity
• Advantages:
- compositionality
- quantifiable properties → available 1D marginals of data distribution
Layer image sizes: 3x96, 12x12, 12x6 (closer to raw detector output)
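One way to picture a single shower in this dataset is as three 2D images with the sizes quoted above, from which simple 1D quantities can be computed. The sketch below uses nested lists and hypothetical deposits in MeV; the function names are illustrative and do not reflect the dataset's actual field names.

```python
LAYER_SHAPES = [(3, 96), (12, 12), (12, 6)]  # per-layer image sizes from the slide

def empty_shower():
    """One shower: a list of three 2D images (nested lists of energies)."""
    return [[[0.0] * cols for _ in range(rows)] for rows, cols in LAYER_SHAPES]

def layer_energies(shower):
    """Total energy deposited in each layer."""
    return [sum(sum(row) for row in layer) for layer in shower]

def energy_fractions(shower):
    """Fraction of the total energy in each layer (a simple 1D shower-shape variable)."""
    per_layer = layer_energies(shower)
    total = sum(per_layer)
    if total <= 0:
        return [0.0] * len(per_layer)
    return [e / total for e in per_layer]

# Hypothetical deposits in MeV, chosen only to exercise the functions.
shower = empty_shower()
shower[0][1][48] = 2000.0
shower[1][6][6] = 7000.0
shower[2][6][3] = 1000.0

fractions = energy_fractions(shower)
```

Histogramming such per-shower quantities over a whole sample gives 1D marginals that can be compared directly between GEANT4 and generated showers.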
CaloGAN Generator
• Three independent streams, one per calorimeter layer
• Learnable attention mechanism decides how much energy from one layer to carry to the next
• Similar to next-frame-prediction in GAN applications to videos
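The attention idea in the second bullet can be sketched as a per-pixel gate that blends what is carried over from the previous layer with what the current stream proposes. This is only an illustration under stated assumptions: a sigmoid gate, inputs already resized to the same shape, and hypothetical names throughout. It is not the CaloGAN implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_merge(carried, proposed, gate_logits):
    """Blend two same-shaped images pixel by pixel.

    carried:     previous layer's image, assumed already resized to this layer's shape
    proposed:    this layer's own candidate image
    gate_logits: learnable per-pixel logits; sigmoid gives the weight on `carried`
    """
    merged = []
    for c_row, p_row, g_row in zip(carried, proposed, gate_logits):
        merged.append([
            sigmoid(g) * c + (1.0 - sigmoid(g)) * p
            for c, p, g in zip(c_row, p_row, g_row)
        ])
    return merged

carried = [[10.0, 0.0], [0.0, 0.0]]
proposed = [[0.0, 4.0], [0.0, 0.0]]
gate_logits = [[0.0, 0.0], [0.0, 0.0]]  # zero logits weight both inputs equally
merged = attention_merge(carried, proposed, gate_logits)
```

In a trained network the logits would be produced by learned layers, so the model itself decides, pixel by pixel, how much of the upstream shower persists into the next layer.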
CaloGAN Discriminator
G minimizes $-L_G + L_\xi + \lambda_E L_E$ and D minimizes $-L_D + L_\xi + \lambda_E L_E$:
CaloGAN Results
Qualitative Results (1)
CaloGAN
GEANT
Average positron shower in each calorimeter layer
Qualitative Results (2)
GEANT 1st layer deposition
GEANT 2nd layer deposition
GEANT 3rd layer deposition
CaloGAN 1st layer deposition
CaloGAN 2nd layer deposition
CaloGAN 3rd layer deposition
Individual positron showers and generated nearest neighbors
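A nearest-neighbor comparison like this one can be sketched as a brute-force search over generated showers. The Euclidean distance on flattened pixels is an assumption made here for illustration; the slides do not state which metric was used, and the tiny showers below are made up.

```python
import math

def flatten(shower):
    """Concatenate all pixels of all layers into one flat list."""
    return [p for layer in shower for row in layer for p in row]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_generated(real_shower, generated_showers):
    """Return (index, distance) of the generated shower closest to real_shower."""
    target = flatten(real_shower)
    distances = [euclidean(target, flatten(g)) for g in generated_showers]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]

# Tiny hypothetical showers: one layer of 2x2 pixels each.
real = [[[1.0, 0.0], [0.0, 3.0]]]
generated = [
    [[[0.0, 0.0], [0.0, 0.0]]],
    [[[1.0, 0.0], [0.0, 2.0]]],
    [[[5.0, 5.0], [5.0, 5.0]]],
]
index, distance = nearest_generated(real, generated)
```

Looking at nearest neighbors in both directions (real to generated and generated to real) is a common sanity check that the generator is neither memorizing training showers nor missing parts of the distribution.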
Shower Shapes - Log Axes
and many more!
Kernel PCA
kernel=cosine, n_comp=3
kernel=poly, n_comp=3
n_comp=2 | n_comp=2
GEANT and GAN datasets look very similar, at least along the first three principal components
Ways of representing distribution agreement along principal components
Potential Speed-Up
Up to a 100,000x speed-up!
CaloGAN Tuning
(1) Attention
(2) Convolutional vs Locally Connected layers in D and G
(3) Different sparsity implementations
(4) Propagate vs concat E
(5) Need for batch norm?
(6) Hyper-params (learning rate, etc.)
(7) Regression / conditioning (θ, φ, x0, y0)
Can we learn (θ, φ, x0, y0) from Calo Images?
Regression on θ, φ, x0, y0 from training dataset
Useful for enforcing conditions in CaloGAN!
Conditioning the CaloGAN Generator
Traversing the manifold along the Energy direction:
Traversing the manifold along the x0 direction:
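Traversing the manifold along one direction amounts to holding the latent noise and all other conditions fixed while sweeping a single conditioning variable. The sketch below shows that loop with a toy stand-in for a trained generator; `toy_generator`, the condition names, and all values are hypothetical.

```python
def traverse(generator, z, fixed_conditions, name, values):
    """Call generator once per value of one condition, holding z and the others fixed."""
    outputs = []
    for v in values:
        conditions = dict(fixed_conditions)  # copy so the caller's dict is untouched
        conditions[name] = v
        outputs.append(generator(z, conditions))
    return outputs

def toy_generator(z, conditions):
    """Hypothetical stand-in for a trained generator: a 1x5 'image' with a single
    active pixel whose position follows x0 and whose intensity follows energy."""
    image = [0.0] * 5
    column = min(4, max(0, 2 + int(round(conditions["x0"]))))
    image[column] = conditions["energy"] * (1.0 + 0.01 * z)
    return image

z = 0.0  # one fixed latent sample
base = {"energy": 10.0, "x0": 0.0}
energy_sweep = traverse(toy_generator, z, base, "energy", [1.0, 50.0, 100.0])
x0_sweep = traverse(toy_generator, z, base, "x0", [-2.0, 0.0, 2.0])
```

Because the noise is fixed, any change across a sweep comes from the conditioning variable alone, which is what makes such plots a useful check that conditioning was learned.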
Reproducible Research
• We have open-sourced our code, dataset, and analysis procedure for both works.
• https://github.com/hep-lbdl/adversarial-jets
• https://github.com/hep-lbdl/CaloGAN
Calo Images Dataset for Classification
Classification with Calo Images
• e+ vs π+; e+ vs γ
• Compare:
• Fully-connected network on shower shapes
• Fully-connected network on unravelled pixels
• 3-stream locally-connected network
• 3-stream convolutional network
• 3-stream densely-connected convolutional network (DenseNet)
Open Dataset of Calo Images
• Can be used for:
• Classification
• Regression
• Generation
• other uses we haven’t thought of yet!
• HDF5 format — simple, well-documented structure