The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21) DramaQA: Character-Centered Video Story Understanding with Hierarchical QA Seongho Choi,1 Kyoung-Woon On,1 Yu-Jung Heo,1 Ahjeong Seo,1 Youwon Jang,1 Minsu Lee,1 Byoung-Tak Zhang1,2 1 Seoul National University 2 AI Institute (AIIS) fshchoi,kwon,yjheo,ajseo,ywjang,mslee,
[email protected] Abstract Since drama closesly describes our everyday life, the con- tents of drama also help to learn realistic models and pat- Despite recent progress on computer vision and natural lan- terns of humans’ behaviors and conversations. However, the guage processing, developing a machine that can understand causal and temporal relationships between events in drama video story is still hard to achieve due to the intrinsic diffi- culty of video story. Moreover, researches on how to evaluate are usually complex and often implicit (Riedl 2016). More- the degree of video understanding based on human cognitive over, the multimodal characteristics of the video make the process have not progressed as yet. In this paper, we propose a problem trickier. Therefore, video story understanding has novel video question answering (Video QA) task, DramaQA, been considered as a challenging machine learning task. for a comprehensive understanding of the video story. The One way to enable a machine to understand a video DramaQA focuses on two perspectives: 1) Hierarchical QAs story is to train the machine to answer questions about as an evaluation metric based on the cognitive developmental the video story (Schank and Abelson 2013; Mueller 2004). stages of human intelligence. 2) Character-centered video an- notations to model local coherence of the story.