Generative Adversarial Networks for Simulation

@wondermicky | [email protected] | mickypaganini.github.io

Michela Paganini PhD Candidate, Yale University

Affiliate, Lawrence Berkeley National Lab

arXiv:1701.05927 (Comput Softw Big Sci (2017) 1: 4)

arXiv:1705.02355 (accepted in PRL)

arXiv:1711.08813 (in ACAT 2017 Proceedings)

with Luke de Oliveira (Vai, LBNL) and Ben Nachman (LBNL)

Generative Modeling

Build a generative model with probability distribution

[Diagram: a dataset on one side; on the other, noise fed into a generator]

How do I train the generator to output samples that look like they could have been drawn from the original data distribution?

Generative Adversarial Networks

Turn generative modeling into a two player game

Finding an Equilibrium

• If we allow D and G to be drawn from the space of all continuous functions, then:

• There exists a unique Nash equilibrium (no “player” has an incentive to deviate)

• G exactly recovers the data distribution

• D(I) = 1/2 for all inputs I
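The equilibrium statement above can be checked numerically on a toy discrete distribution. The sketch below assumes the standard GAN value function V(D, G) = E_data[log D] + E_gen[log(1 - D)] and the pointwise-optimal discriminator D*(x) = p_data(x) / (p_data(x) + p_gen(x)); the four-outcome distributions are made up for illustration.

```python
import numpy as np

def optimal_discriminator(p_data, p_gen):
    """Pointwise optimum D*(x) = p_data(x) / (p_data(x) + p_gen(x))."""
    return p_data / (p_data + p_gen)

def value_function(d, p_data, p_gen):
    """GAN value V(D, G) = E_data[log D] + E_gen[log(1 - D)] on a discrete space."""
    return float(np.sum(p_data * np.log(d)) + np.sum(p_gen * np.log(1.0 - d)))

# Toy data distribution over four outcomes (illustrative numbers only).
p_data = np.array([0.1, 0.2, 0.3, 0.4])

# Case 1: the generator matches the data distribution exactly.
d_star = optimal_discriminator(p_data, p_data)
v_equilibrium = value_function(d_star, p_data, p_data)

# Case 2: a mismatched generator leaves the discriminator something to exploit.
p_bad = np.array([0.4, 0.3, 0.2, 0.1])
d_bad = optimal_discriminator(p_data, p_bad)
v_bad = value_function(d_bad, p_data, p_bad)

print(d_star)                 # every entry is 0.5
print(v_equilibrium)          # equals -log(4)
print(v_bad > v_equilibrium)  # True
```

When the generator matches the data, the best the discriminator can do is answer 1/2 everywhere; any mismatch raises the value the discriminator can reach.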

Solutions to Instability

• Ad hoc / experimental solutions

• DCGAN, BatchNorm, label flipping, soft labels, Minibatch discrimination, etc.

• Theory-driven / analytical solutions

• WGAN, WGAN-GP, BEGAN, etc.
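Two of the ad hoc tricks listed above, soft labels and label flipping, act only on the discriminator's training targets. The following is a minimal sketch of the general idea, not the settings used in this work; the function name and the default values of `smooth` and `flip_prob` are assumptions chosen for illustration.

```python
import numpy as np

def make_discriminator_targets(n_real, n_fake, rng, smooth=0.1, flip_prob=0.05):
    """Build noisy targets for one discriminator batch.

    smooth    : real targets become 1 - smooth instead of 1 (soft labels)
    flip_prob : probability that a target is swapped between real and fake
    Both default values are illustrative assumptions.
    """
    targets = np.concatenate([np.full(n_real, 1.0 - smooth), np.zeros(n_fake)])
    flip = rng.random(targets.size) < flip_prob
    # A flipped real target becomes 0, a flipped fake target becomes 1 - smooth.
    targets[flip] = (1.0 - smooth) - targets[flip]
    return targets

rng = np.random.default_rng(0)
targets = make_discriminator_targets(n_real=8, n_fake=8, rng=rng)
print(targets)
```

Both tricks keep the discriminator from becoming overconfident, which tends to leave the generator with more useful gradients.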

For more info: see my talk at IML meeting and Ian Goodfellow’s seminar at CERN

https://indico.cern.ch/event/655447/contributions/2742180/attachments/1552018/2438676/advanced_gans_iml.pdf

https://indico.cern.ch/event/673989/

Generative Modeling for Science

Why study generative models? “After all, when applied to images, such models seem to merely provide more images, and the world has no shortage of images.”

in science, we do!

ATLAS grid consumption

LSST-DESC cosmological simulation run on 16384 of Titan's GPU-enhanced nodes

LAGAN (arXiv:1701.05927)

LAGAN

Learn a generative model to reproduce Pythia8 QCD vs boosted W from W’ → WZ jet images

Unique characteristics of jet images:

• Large dynamic range (pixel intensity = energy)

• Sparse (~10%) occupancy of images

• Changing location or intensity of 1 pixel activation ⇒ different jet properties

• Important features have high Lipschitz constant

https://link.springer.com/content/pdf/10.1007%2Fs41781-017-0004-6.pdf
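To see why moving a single pixel activation changes the jet properties, one can compute the jet mass directly from an image. The sketch below treats each pixel as a massless particle whose transverse momentum is the pixel intensity; the 25x25 grid, the pixel width of 0.1 and the intensities are hypothetical values chosen for the demonstration.

```python
import numpy as np

def jet_mass_from_image(image, eta_centers, phi_centers):
    """Invariant mass of a jet image, treating each pixel as a massless particle.

    image[i, j] is the transverse momentum in the pixel at
    (eta_centers[i], phi_centers[j]).
    """
    eta, phi = np.meshgrid(eta_centers, phi_centers, indexing="ij")
    px = np.sum(image * np.cos(phi))
    py = np.sum(image * np.sin(phi))
    pz = np.sum(image * np.sinh(eta))
    energy = np.sum(image * np.cosh(eta))
    m2 = energy**2 - px**2 - py**2 - pz**2
    return float(np.sqrt(max(m2, 0.0)))  # clip tiny negative rounding errors

# Hypothetical grid: 25 pixel centers from -1.2 to 1.2 in both eta and phi.
centers = np.linspace(-1.2, 1.2, 25)

image = np.zeros((25, 25))
image[12, 12] = 100.0   # hard core at the image center
image[12, 15] = 20.0    # softer second activation

moved = np.zeros((25, 25))
moved[12, 12] = 100.0
moved[12, 18] = 20.0    # same intensity, shifted by three pixels in phi

m_image = jet_mass_from_image(image, centers, centers)
m_moved = jet_mass_from_image(moved, centers, centers)
print(m_image, m_moved)
```

Shifting one 20-unit pixel by three positions roughly doubles the mass in this toy case, which is the location sensitivity the bullet list refers to.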

LAGAN Overview

• Designed a GAN architecture suited to location-specificity of jet images

• For an overview, see my talk at the IML Machine Learning Workshop

• Main takeaway: ML / GANs & Physics are not at odds!

[Figure panels: GAN-generated signal - background; real signal - background. Distributions shown: n-subjettiness, jet mass.]

https://indico.cern.ch/event/595059/contributions/2497383/attachments/1431666/2199445/gan_presentation_IML.pdf

CaloGAN

Open Dataset of Calorimeter Images

• 3 layers with heterogeneous segmentation and resolution (designed to approximate ATLAS LAr calorimeter)

• 3 types of particle: e+, π+, γ

• Variable position and angle of incidence (5cm in x and y; 0˚, 5˚ and 20˚ in theta and phi)

• Open, available, re-usable, citable

https://zenodo.org/record/846388#.WZskq5OGORs

GEANT4 Simulation

[Figure, side view: Geant4, Pb absorber, LAr gap, 10 GeV e-. Two panels, each plotting the η direction [mm] against depth from calorimeter center [mm], both axes from -200 to 200.]

First panel, color scale "Local Energy Deposit [MeV]": we simulate exact (x, y, z).

Second panel, color scale "Cell Energy [MeV]": we can read out this.

Open Dataset of Calorimeter Images

• Challenges:

- sparsity

- dynamic range

- location specificity

• Advantages:

- compositionality

- quantifiable properties → available 1D marginals of data distribution

Layer image sizes: 3x96, 12x12, 12x6

Closer to raw detector output
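As a rough illustration of the data structure, a shower can be held as three 2D arrays with the layer shapes 3x96, 12x12 and 12x6 quoted on the slide. The sketch below fills them with random stand-in values (not dataset content) and measures the two challenges named above, sparsity and dynamic range; the occupancy and energy scale are arbitrary assumptions.

```python
import numpy as np

# Image shapes of the three calorimeter layers, as listed on the slide.
LAYER_SHAPES = [(3, 96), (12, 12), (12, 6)]

def toy_shower(rng, occupancy=0.2):
    """Random stand-in for one shower: a list of three 2D energy arrays."""
    layers = []
    for shape in LAYER_SHAPES:
        energy = rng.exponential(scale=50.0, size=shape)   # arbitrary energy scale
        mask = rng.random(shape) < occupancy               # most cells stay empty
        layers.append(energy * mask)
    return layers

def sparsity(layer):
    """Fraction of cells with zero energy."""
    return float(np.mean(layer == 0))

def dynamic_range(layer):
    """Ratio of the largest to the smallest nonzero cell energy."""
    nonzero = layer[layer > 0]
    return float(nonzero.max() / nonzero.min()) if nonzero.size else 0.0

rng = np.random.default_rng(1)
shower = toy_shower(rng)
for i, layer in enumerate(shower):
    print(i, layer.shape, round(sparsity(layer), 2), round(dynamic_range(layer), 1))
```

The same two functions could be applied to real showers once they are loaded as arrays of these shapes.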

CaloGAN Generator

• Three independent streams, one per calorimeter layer

• Learnable attention mechanism decides how much energy from one layer to carry to the next

• Similar to next-frame-prediction in GAN applications to videos
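One way to picture the attention step is as a per-pixel gate between the previous layer's image, resized to the current layer's shape, and the current stream's own output. The sketch below is an assumption-laden illustration and not the CaloGAN implementation: the nearest-neighbour `resize_to`, the sigmoid gate and the random `logits` (which a real network would learn) are all placeholders.

```python
import numpy as np

def resize_to(layer, shape):
    """Nearest-neighbour resize of a 2D array to `shape` (a crude stand-in)."""
    rows = np.arange(shape[0]) * layer.shape[0] // shape[0]
    cols = np.arange(shape[1]) * layer.shape[1] // shape[1]
    return layer[np.ix_(rows, cols)]

def attention_merge(previous_layer, current_stream, logits):
    """Blend the resized previous layer with the current stream, pixel by pixel.

    logits has the shape of current_stream; in a real network it would be
    learned. sigmoid(logits) is the weight given to the carried-over image.
    """
    carried = resize_to(previous_layer, current_stream.shape)
    weight = 1.0 / (1.0 + np.exp(-logits))
    return weight * carried + (1.0 - weight) * current_stream

rng = np.random.default_rng(2)
layer0 = rng.random((3, 96))        # output of the first stream
stream1 = rng.random((12, 12))      # raw output of the second stream
logits = rng.normal(size=(12, 12))  # placeholder for learned attention scores

layer1 = attention_merge(layer0, stream1, logits)
print(layer1.shape)  # (12, 12)
```

Large positive logits carry the previous layer through almost unchanged, while large negative logits let the current stream decide on its own. Note that this simple resize does not conserve total energy.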

CaloGAN Generator

CaloGAN Discriminator

G minimizes $-L_G + L_\xi + \lambda_E L_E$ and D minimizes $-L_D + L_\xi + \lambda_E L_E$.

CaloGAN Results

Qualitative Results (1)

CaloGAN

GEANT

Average positron shower in each calorimeter layer

Qualitative Results (2)

GEANT 1st layer deposition

GEANT 2nd layer deposition

GEANT 3rd layer deposition

CaloGAN 1st layer deposition

CaloGAN 2nd layer deposition

CaloGAN 3rd layer deposition

Individual positron showers and generated nearest neighbors
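A nearest-neighbour comparison of this kind can be sketched as follows. The choice of Euclidean distance in pixel space is an assumption (the slide does not name the metric), and the arrays are random stand-ins rather than GEANT or CaloGAN showers.

```python
import numpy as np

def nearest_neighbor_indices(queries, candidates):
    """For each query image, index of the closest candidate in Euclidean distance."""
    q = queries.reshape(len(queries), -1)
    c = candidates.reshape(len(candidates), -1)
    # Pairwise squared distances via broadcasting: shape (n_queries, n_candidates).
    d2 = np.sum((q[:, None, :] - c[None, :, :]) ** 2, axis=-1)
    return np.argmin(d2, axis=1)

rng = np.random.default_rng(3)
geant_like = rng.random((5, 12, 12))       # stand-in for reference showers
generated = rng.random((200, 12, 12))      # stand-in for generated showers
generated[17] = geant_like[2] + 1e-3       # plant a near-duplicate for the demo

matches = nearest_neighbor_indices(geant_like, generated)
print(matches[2])  # 17, the planted near-duplicate
```

Looking at the matched pairs side by side helps check that the generator is not simply memorizing training showers.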

Shower Shapes - Log Axes

and many more!
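Shower shapes are 1D summaries computed from the three layer images. The sketch below shows a few plausible ones (per-layer energy, total energy, layer energy fractions and an energy-weighted depth); the exact set and definitions used in the talk are not given on the slide, so these are assumptions.

```python
import numpy as np

def shower_shapes(layers):
    """A few 1D summaries of a shower given as a list of 2D energy arrays."""
    layer_energy = np.array([layer.sum() for layer in layers])
    total = layer_energy.sum()
    fractions = layer_energy / total
    # Energy-weighted mean layer index, a coarse measure of shower depth.
    depth = float(np.sum(np.arange(len(layers)) * layer_energy) / total)
    return {"layer_energy": layer_energy, "total_energy": float(total),
            "fractions": fractions, "depth": depth}

# Uniform toy layers so the sums are easy to verify by hand.
layers = [np.full((3, 96), 1.0), np.full((12, 12), 2.0), np.full((12, 6), 0.5)]
shapes = shower_shapes(layers)
print(shapes["layer_energy"])
print(shapes["total_energy"])   # 612.0
print(round(shapes["depth"], 3))
```

Histogramming such quantities for GEANT and generated showers gives the 1D marginal comparisons that the plots summarize.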

Kernel PCA

[Figure panels: kernel=cosine, n_comp=3; kernel=poly, n_comp=3; two further panels with n_comp=2]

GEANT and GAN datasets look very similar, at least along the first three principal components

Ways of representing distribution agreement along principal components
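For readers unfamiliar with kernel PCA, here is a minimal NumPy-only sketch using a cosine kernel and three components. Fitting one projection jointly on both samples is an assumption about the procedure, and the inputs are random stand-ins for flattened showers.

```python
import numpy as np

def cosine_kernel_pca(x, n_components=3):
    """Kernel PCA with a cosine kernel, written out with NumPy only.

    x: array of shape (n_samples, n_features) with no all-zero rows.
    Returns the projections of the samples, shape (n_samples, n_components).
    """
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    k = unit @ unit.T                                   # cosine similarity matrix
    n = k.shape[0]
    center = np.eye(n) - np.full((n, n), 1.0 / n)
    k_centered = center @ k @ center                    # center in feature space
    eigvals, eigvecs = np.linalg.eigh(k_centered)       # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    # Projection of sample i on component j is sqrt(lambda_j) * v_j[i].
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))

rng = np.random.default_rng(4)
reference = rng.random((100, 504))          # stand-in for flattened GEANT showers
generated = rng.random((100, 504))          # stand-in for flattened GAN showers

projected = cosine_kernel_pca(np.vstack([reference, generated]), n_components=3)
ref_proj, gen_proj = projected[:100], projected[100:]
print(ref_proj.mean(axis=0), gen_proj.mean(axis=0))
```

If scikit-learn is available, `sklearn.decomposition.KernelPCA` with `kernel="cosine"` or `kernel="poly"` provides the same kind of projection. Comparing `ref_proj` and `gen_proj` component by component is one way to represent the agreement between the two datasets.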

Potential Speed-Up

Up to a 100,000x speed-up!

CaloGAN Tuning

(1) Attention

(2) Convolutional vs Locally Connected layers in D and G

(3) Different sparsity implementations

(4) Propagate vs concat E

(5) Need for batch norm?

(6) Hyper-params (learning rate, etc.)

(7) Regression / conditioning (θ, φ, x0, y0)

Can we learn (θ, φ, x0, y0) from Calo Images?

Regression on θ, φ, x0, y0 from training dataset

Useful for enforcing conditions in CaloGAN!

Conditioning the CaloGAN Generator

Traversing the manifold along the Energy direction:

Traversing the manifold along the x0 direction:
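Traversing the manifold along one direction means holding the latent noise fixed and sweeping a single conditioning variable. The sketch below uses `toy_generator`, a hypothetical stand-in for a trained conditional generator, so only the sweep logic is meaningful; the latent size and the sweep ranges are assumptions.

```python
import numpy as np

def toy_generator(z, energy, x0):
    """Hypothetical stand-in for a trained conditional generator.

    Returns a 12x12 image: a Gaussian blob whose total intensity equals `energy`,
    whose horizontal position follows `x0`, and whose width depends on the latent z.
    """
    cols, rows = np.meshgrid(np.arange(12), np.arange(12))
    width = 1.0 + np.abs(z[0])
    blob = np.exp(-((cols - 5.5 - x0) ** 2 + (rows - 5.5) ** 2) / (2 * width**2))
    return energy * blob / blob.sum()

rng = np.random.default_rng(5)
z_fixed = rng.normal(size=8)               # one latent point, held fixed

# Sweep the energy condition while keeping z and x0 fixed.
energies = np.linspace(1.0, 100.0, 5)
energy_sweep = [toy_generator(z_fixed, e, x0=0.0) for e in energies]

# Sweep the x0 condition while keeping z and energy fixed.
offsets = np.linspace(-3.0, 3.0, 5)
x0_sweep = [toy_generator(z_fixed, 50.0, x0=dx) for dx in offsets]

print([float(img.sum()) for img in energy_sweep])
print([int(np.argmax(img.sum(axis=0))) for img in x0_sweep])
```

With a real generator, the two lists of images correspond to the two rows of pictures shown for the Energy and x0 directions.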

Reproducible Research

• We have open-sourced our code, dataset, and analysis procedure for both works.

• https://github.com/hep-lbdl/adversarial-jets

• https://github.com/hep-lbdl/CaloGAN

Calo Images Dataset for Classification

Classification with Calo Images

• e+ vs π+; e+ vs γ

• Compare:

• Fully-connected network on shower shapes

• Fully-connected network on unravelled pixels

• 3-stream locally-connected network

• 3-stream convolutional network

• 3-stream densely-connected convolutional network (DenseNet)
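For the fully-connected baseline on unravelled pixels, each shower's three layer images are flattened and concatenated into a single feature vector. A minimal sketch, assuming the layer shapes 3x96, 12x12 and 12x6 and using random stand-in data:

```python
import numpy as np

LAYER_SHAPES = [(3, 96), (12, 12), (12, 6)]

def unravel_showers(layer_batches):
    """Flatten each layer and concatenate, giving one feature vector per shower.

    layer_batches: list of arrays with shapes (n, 3, 96), (n, 12, 12), (n, 12, 6).
    """
    n = layer_batches[0].shape[0]
    return np.concatenate([batch.reshape(n, -1) for batch in layer_batches], axis=1)

rng = np.random.default_rng(6)
batch = [rng.random((4,) + shape) for shape in LAYER_SHAPES]
features = unravel_showers(batch)
print(features.shape)  # (4, 504): 288 + 144 + 72 pixels per shower
```

The 3-stream networks, by contrast, keep the three images separate and so preserve the spatial layout that this flattening discards.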

Classification with Calo Images

Open Dataset of Calo Images

• Can be used for:

• Classification

• Regression

• Generation

• other uses we haven’t thought of yet!

• HDF5 format — simple, well-documented structure

https://zenodo.org/record/846388#.WZskq5OGORs

THANK YOU!

You can find me at: [email protected]

Questions?
