A Memory Network Approach for Story-based Temporal Summarization of 360° Videos

Sangho Lee, Jinyoung Sung, Youngjae Yu, Gunhee Kim
Seoul National University
[email protected], [email protected], [email protected], [email protected]
http://vision.snu.ac.kr/projects/pfmn

Abstract

We address the problem of story-based temporal summarization of long 360° videos. We propose a novel memory network model named Past-Future Memory Network (PFMN), in which we first compute the scores of 81 normal field of view (NFOV) region proposals cropped from the input 360° video, and then recover a latent, collective summary using the network with two external memories that store the embeddings of previously selected subshots and future candidate subshots. Our major contributions are two-fold. First, our work is the first to address story-based temporal summarization of 360° videos. Second, our model is the first attempt to leverage memory networks for video summarization tasks. For evaluation, we perform three sets of experiments. First, we investigate the view selection capability of our model on the Pano2Vid dataset [42]. Second, we evaluate the temporal summarization with a newly collected 360° video dataset.

Figure 1. The intuition of the proposed PFMN model for temporal summarization of 360° videos. With a view selection module for finding key objects and a novel memory network leveraging two past and future memories, our PFMN temporally summarizes a long 360° video.
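To make the summarization loop described above concrete, the following is a minimal sketch, assuming a greedy dot-product scoring rule in place of PFMN's learned memory attention; all names (select_summary, subshot_embs, summary_len) are hypothetical and not from the paper, and the snippet is an illustration of the past/future-memory idea rather than the authors' implementation.

import numpy as np

def select_summary(subshot_embs: np.ndarray, summary_len: int) -> list:
    """Greedily pick `summary_len` subshots from (N, D) subshot embeddings."""
    selected = []                              # indices chosen so far ("past")
    candidates = list(range(subshot_embs.shape[0]))  # remaining indices ("future")

    for _ in range(summary_len):
        # Past memory: embeddings of already-selected subshots (zero vector at start).
        past_mem = (subshot_embs[selected].mean(axis=0)
                    if selected else np.zeros(subshot_embs.shape[1]))
        # Future memory: embeddings of the remaining candidate subshots.
        future_mem = subshot_embs[candidates].mean(axis=0)

        best_idx, best_score = None, -np.inf
        for c in candidates:
            emb = subshot_embs[c]
            # Score a candidate by correlating it with both memories;
            # PFMN instead learns this correlation with memory attention.
            score = emb @ past_mem + emb @ future_mem
            if score > best_score:
                best_idx, best_score = c, score

        selected.append(best_idx)
        candidates.remove(best_idx)

    return selected

# Example: 20 subshots with 128-dim embeddings, pick a 5-subshot summary.
embs = np.random.randn(20, 128).astype(np.float32)
print(select_summary(embs, summary_len=5))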