
Supplemental Material: Overdispersed Variational Autoencoders

Harshil Shah and David Barber
Department of Computer Science

University College London

Neural network architectures

Below, we describe the architecture of all of the neural networks used for the parameters of the generative and variational distributions. Note that all hidden units used the tanh nonlinearity.

                      Hidden layers    Output nonlinearity
MNIST
  π_θ(h)              2 × 200 units    sigmoid
  η_φ^MNIST(x)        2 × 200 units    linear
  Ω_φ^MNIST(x)        2 × 200 units    exp
Frey Faces
  µ_θ(h)              2 × 100 units    sigmoid
  Σ_θ(h)              2 × 100 units    exp
  η_φ^FF(x)           2 × 100 units    linear
  Ω_φ^FF(x)           2 × 100 units    exp
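The networks in the table above can be sketched as plain two-hidden-layer tanh feedforward maps. Below is a minimal NumPy sketch of the MNIST networks; the layer widths and output nonlinearities come from the table, but the 784-pixel input, 50-dimensional latent space, random untrained weights, and the `make_net` helper are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_net(in_dim, out_dim, hidden, out_fn):
    """Two tanh hidden layers, then a linear map and an output nonlinearity."""
    Ws = [rng.standard_normal((in_dim, hidden)) * 0.01,
          rng.standard_normal((hidden, hidden)) * 0.01,
          rng.standard_normal((hidden, out_dim)) * 0.01]
    def forward(v):
        v = np.tanh(v @ Ws[0])   # first tanh hidden layer
        v = np.tanh(v @ Ws[1])   # second tanh hidden layer
        return out_fn(v @ Ws[2])
    return forward

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
identity = lambda a: a

# Assumed sizes (not from the table): 784-dim observations, 50-dim latents.
x_dim, h_dim = 784, 50

pi_theta  = make_net(h_dim, x_dim, 200, sigmoid)   # π_θ(h): sigmoid output
eta_phi   = make_net(x_dim, h_dim, 200, identity)  # η_φ(x): linear output
omega_phi = make_net(x_dim, h_dim, 200, np.exp)    # Ω_φ(x): exp (positive) output

x = rng.standard_normal((1, x_dim))
h = rng.standard_normal((1, h_dim))
print(pi_theta(h).shape, eta_phi(x).shape, omega_phi(x).shape)
```

The exp output keeps Ω_φ(x) strictly positive, and the sigmoid keeps π_θ(h) in (0, 1), matching each parameter's valid range.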

Variance of gradient estimates

Below, we show the sample variances of the gradient estimates during the first 1,000 iterations of training, for both datasets.

MNIST

VAE vs. OVAE

Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

IWAE vs. OIWAE

For the MNIST dataset, the OVAE gradient estimates have noticeably lower variance than the VAE updates. Comparing the OIWAE against the IWAE, the difference is less clear, except in the first 300 training iterations.

Frey Faces

VAE vs. OVAE

IWAE vs. OIWAE

As with MNIST, for the Frey Faces dataset, the OVAE gradient estimates have noticeably lower variance than the VAE updates. However, this is not the case for the OIWAE compared against the IWAE: the two have very similar variances.
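To make concrete what "sample variance of a gradient estimate" means here, consider a toy reparameterized estimator. This is an illustrative example only, not the paper's model or estimator: the objective, distribution, and sample counts are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy objective: E_{z ~ N(mu, 1)}[z^2], whose true gradient w.r.t. mu is 2*mu.
# Reparameterizing z = mu + eps with eps ~ N(0, 1), a single-sample gradient
# estimate of the objective is 2 * (mu + eps).
def single_sample_grad(mu):
    eps = rng.standard_normal()
    return 2.0 * (mu + eps)

mu = 0.5
estimates = np.array([single_sample_grad(mu) for _ in range(20000)])

# The estimator is unbiased (mean close to 2*mu = 1.0) but noisy
# (variance close to Var[2*eps] = 4.0). The plots above track exactly this
# kind of sample variance, per parameter, over training iterations.
print(estimates.mean(), estimates.var())
```

A lower-variance estimator concentrates these single-sample gradients around the true gradient, which is why reduced variance can translate into more stable optimization.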

Generated output

Below, we show sample outputs generated from the prior and posterior of the learned models, for both datasets.
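Generating from the prior versus the posterior differs only in where the latent sample comes from before decoding. A minimal sketch for a generic latent-variable model follows; the Gaussian latent, the toy decoder, and all dimensions and parameter values here are assumptions for illustration, not the paper's overdispersed distributions.

```python
import numpy as np

rng = np.random.default_rng(0)

h_dim, x_dim = 2, 8
W = rng.standard_normal((h_dim, x_dim)) * 0.5  # stand-in decoder weights

def decode(h):
    # Map a latent sample to Bernoulli means via a sigmoid (as for binary pixels).
    return 1.0 / (1.0 + np.exp(-(h @ W)))

# Prior sample: draw h from the prior (here N(0, I)) and decode.
x_from_prior = decode(rng.standard_normal(h_dim))

# Posterior sample: draw h from an encoder-style distribution conditioned on an
# observation (hypothetical mean/std values here), then decode.
mu, sigma = np.array([0.3, -0.2]), np.array([0.5, 0.5])
x_from_posterior = decode(mu + sigma * rng.standard_normal(h_dim))

print(x_from_prior.shape, x_from_posterior.shape)
```

Prior samples show what the model has learned about the data distribution as a whole, while posterior samples show reconstructions conditioned on a particular observation.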

MNIST

OVAE: Prior

OVAE: Posterior

OIWAE: Prior

OIWAE: Posterior

Frey Faces

OVAE: Prior

OVAE: Posterior

OIWAE: Prior

OIWAE: Posterior
