Misinformation Detection on YouTube Using Video Captions

Raj Jagtap, School of Mathematics and Computer Science, Indian Institute of Technology Goa, India ([email protected])
Abhinav Kumar, School of Mathematics and Computer Science, Indian Institute of Technology Goa, India ([email protected])
Rahul Goel, Institute of Computer Science, University of Tartu, Tartu, Estonia ([email protected])
Shakshi Sharma, Institute of Computer Science, University of Tartu, Tartu, Estonia ([email protected])
Rajesh Sharma, Institute of Computer Science, University of Tartu, Tartu, Estonia ([email protected])
Clint P. George, School of Mathematics and Computer Science, Indian Institute of Technology Goa, India
[email protected]

Abstract—Millions of people use platforms such as YouTube, Facebook, Twitter, and other mass media. Due to the accessibility of these platforms, they are often used to establish a narrative, conduct propaganda, and disseminate misinformation. This work proposes an approach that uses state-of-the-art NLP techniques to extract features from video captions (subtitles). To evaluate our approach, we utilize a publicly accessible and labeled dataset for classifying videos as misinformation or not. The motivation behind exploring video captions stems from our analysis of video metadata.

According to recent research, YouTube, the largest video-sharing platform with a user base of more than 2 billion users, is commonly used to disseminate misinformation and hate videos [6]. According to a survey [7], 74% of adults in the USA use YouTube, and approximately 500 hours of video are uploaded to the platform every minute, which makes YouTube hard to monitor. This makes YouTube an excellent forum for injecting misinformation videos, which could be difficult