The continuous Bernoulli: fixing a pervasive error in variational autoencoders

Gabriel Loaiza-Ganem
Department of Statistics
Columbia University
[email protected]

John P. Cunningham
Department of Statistics
Columbia University
[email protected]
Abstract

Variational autoencoders (VAE) have quickly become a central tool in machine learning, applicable to a broad range of data types and latent variable models. By far the most common first step, taken by seminal papers and by core software libraries alike, is to model MNIST data using a deep network parameterizing a Bernoulli likelihood. This practice contains what appears to be, and what is often set aside as, a minor inconvenience: the pixel data is [0,1] valued, not {0,1} as supported by the Bernoulli likelihood. Here we show that, far from being a triviality or nuisance that is convenient to ignore, this error has profound importance to VAE, both qualitative and quantitative. We introduce and fully characterize a new [0,1]-supported, single-parameter distribution: the continuous Bernoulli, which patches this pervasive bug in VAE. Adopting this distribution is not mere nitpicking; it produces meaningful performance improvements across a range of metrics and datasets, including sharper image samples, and suggests a broader class of performant VAE.

1 Introduction

Variational autoencoders (VAE) have become a central tool for probabilistic modeling of complex, high-dimensional data, and have been applied across image generation [10], text generation [14], neuroscience [8], chemistry [9], and more. One critical choice in the design of any VAE is the choice of likelihood (decoder) distribution, which stipulates the stochastic relationship between latents and observables.
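For concreteness, here is a brief preview of the density that the paper goes on to characterize in full (the notation, with $\lambda \in (0,1)$ the single shape parameter and $C(\lambda)$ the normalizing constant, anticipates the formal treatment below): the continuous Bernoulli keeps the functional form of the Bernoulli pmf but renormalizes it over the unit interval,
\[
p(x \mid \lambda) = C(\lambda)\,\lambda^{x}(1-\lambda)^{1-x}, \qquad x \in [0,1],
\]
where
\[
C(\lambda) =
\begin{cases}
\dfrac{2\tanh^{-1}(1-2\lambda)}{1-2\lambda} & \text{if } \lambda \neq \tfrac{1}{2},\\[6pt]
2 & \text{if } \lambda = \tfrac{1}{2},
\end{cases}
\]
so that, unlike the Bernoulli pmf evaluated at non-binary $x$, the density above integrates to one over $[0,1]$.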