A New Concept of Convex based Multiple Neural Networks Structure

Yu Wang, Yue Deng, Yilin Shen, Hongxia Jin
Samsung Research America, Mountain View, CA
[email protected]
[email protected] [email protected] ABSTRACT During the last decade, there are lots of progresses on build- In this paper, a new concept of convex based multiple neural net- ing a variety of different neural network structures, like recurrent works structure is proposed. This new approach uses the collective neural networ (RNN), long short-term memory (LSTM) network, information from multiple neural networks to train the model. convolutional neural network (CNN), time-delayed neural network From both theoretical and experimental analysis, it is going to and etc. [12, 13, 31]. Training these neural networks, however, are demonstrate that the new approach gives a faster training speed of still heavily to be relied upon back-propagation [40] by computing convergence with a similar or even better test accuracy, compared the loss functions’ gradients, then further updating the weights of to a conventional neural network structure. Two experiments are networks. Despite the effectiveness of using back-propagation, it conducted to demonstrate the performance of our new structure: still has several well known draw-backs in many real applications. the first one is a semantic frame parsing task for spoken language For example, one of the main issues is the well known gradient understanding (SLU) on ATIS dataset, and the other is a hand writ- vanishing problem [3], which limits the number of layers in a deep ten digits recognition task on MNIST dataset.