Transformer-based Conditional Variational Autoencoder for Controllable Story Generation

Le Fang†, Tao Zeng‡, Chaochun Liu‡, Liefeng Bo‡, Wen Dong†, Changyou Chen†
†University at Buffalo, ‡JD Finance America Corporation, AI Lab
{lefang, wendong, [email protected], {tao.zeng, chaochun.liu, [email protected]
Abstract

We investigate large-scale latent variable models (LVMs) for neural story generation, an under-explored application for open-domain long text, with objectives in two threads: generation effectiveness and controllability. LVMs, especially the variational autoencoder (VAE), have achieved both effective and controllable generation by exploiting flexible distributional latent representations. Recently, Transformers and their variants have achieved remarkable effectiveness without explicit latent representation learning, and thus lack satisfying controllability in generation. In this paper, we advocate reviving latent variable modeling, essentially the power of representation learning, in the era of Transformers to enhance controllability without hurting state-of-the-art generation effectiveness. Specifically, we integrate latent representation vectors with a Transformer-based conditional variational autoencoder.

… powerful representation learning can deal with both the effectiveness and the controllability of generation.

In recent years, Transformers (Vaswani et al. 2017) and their variants have become the mainstream workhorses and have boosted generation effectiveness over previous methods by large margins. Models based on similar self-attention architectures (Devlin et al. 2018; Radford et al. 2018, 2019) can leverage both big models and big training data. A dominant paradigm has emerged: “pre-training + fine-tuning” across a number of natural language processing tasks. Even without explicitly learning latent representations, Transformer-based models can effectively learn from training data and generate high-quality text. It is thrilling to witness computational models generate consistent long text of thousands of words with ease.
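As background for the discussion above, the flexible distributional latent representations that a VAE exploits are learned by maximizing the evidence lower bound (ELBO). The expression below is the textbook VAE objective, shown for illustration rather than as this paper's exact training loss, where $x$ denotes an observed story, $z$ the latent vector, $p_\theta(x \mid z)$ the decoder, $q_\phi(z \mid x)$ the approximate posterior, and $p(z)$ the prior:

$$\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big) \le \log p_\theta(x).$$

The reconstruction term drives generation effectiveness, while the KL term keeps the approximate posterior close to the prior, which is what makes it possible to sample or manipulate $z$ at generation time; this is the source of the controllability contrasted above with plain Transformer language models.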