
Identifying Shooting Tweets with Deep Learning and Keywords Filtering: Comparative Study

A thesis submitted to the Graduate School of the University of Cincinnati in partial fulfillment of the requirements for the degree of Master of Science in the School of Information Technology of the College of Education, Criminal Justice, and Human Services

by Ammar A. A. Mohamed

March 2021

Committee Chair: Dr. Jess Kropczynski
Committee Member: Dr. Shane Halse

ABSTRACT

During large-scale crises, 911 call centers often become inundated by high call volume, making it difficult for citizens to request help. When this is the case, people may turn to social media for support. This also happens when someone wishes to discuss an incident, such as hearing gunshots, but feels unsure whether calling 911 is the most appropriate action. 911 centers do not typically monitor social media platforms for these types of requests because of the challenges in sorting and filtering relevant information. To support the fast identification of important information to be shared with first responders, this research analyzes social media posts to determine their relevancy to shooting incidents and emergencies. It compares the accuracy and relevancy of two methods of filtering social media data. The first filters tweets using keywords related to shooting and manually labels them based on their relevancy to shooting events. The second trains a transfer learning model to determine the relevancy of collected tweets. The comparison shows that the machine learning technique is more accurate in identifying relevant tweets than the keyword filtering technique.

Copyright Notice

© Copyright by Ammar Mohamed, 2021. All Rights Reserved.

ACKNOWLEDGMENT

I would like to give special thanks to Prof. Jess for giving me the opportunity to work on her project and for her guidance, support, and feedback. I would also like to thank Prof.
Shane for his technical guidance and ideas throughout my work on this research. I would also like to give special thanks to my family and friends for supporting me from day one.

TABLE OF CONTENTS

Section 1: Introduction
    1.1 Social Media Crisis
    1.2 Guns And Shooting
Section 2: Literature Review
    2.1 Social Media For Real-Time Information Sharing
    2.2 Social Media Filtering Techniques
    2.3 Bursty Keywords
    2.4 Social Triangulation
    2.5 Social Media Filtering Using Event Detection
    2.6 Social Media Shooting
    2.7 Social Media Analysis Using NLP
    2.8 Text Classification Using Machine Learning
    2.8 Text Classification Using Deep Learning
    2.9 Research Aims
Section 3: Methodology
    3.1 Data Collection
    3.2 Data Storage
    3.3 Data Analysis
Section 4: Results
    4.1 Comparative Analysis Of Filtering Techniques
    4.2 Discussion
    4.3 Limitations And Future Work
Section 5: Conclusion
Section 6: References
Section 7: Appendix
    7.1 Appendix A: List of Keywords
    7.2 Appendix B: Scripts

LIST OF FIGURES

Figure 1-1 6Ws Coding Schema (Kropczynski et al. 2018)
Figure 1-2 Number of Civilian Firearms per 100 people
Figure 1-3 GN Statistics in 2019
Figure 1-4 GN Statistics in 2020
Figure 2-1 Text Classification Tasks (n.d.)
Figure 2-2 Spam Detection (Axel Bellec, 2018)
Figure 2-3 Text Classification Process (M. Ikonomakis et al., 2005)
Figure 2-4 VD-CNN results on the eight datasets (Conneau, Alexis, et al., 2017)
Figure 2-5 Different Learning Process between (a) traditional machine learning and (b) transfer learning (Weiss et al., 2016)
Figure 2-6 Test error rates (%) on text classification datasets (Howard et al., 2018)

LIST OF TABLES

Table 1: Comparison between the two approaches in the Nova Scotia Dataset
Table 2: Comparison between the two approaches in Downtown Cincinnati Dataset

SECTION 1: INTRODUCTION

1.1 Social Media Crisis

Large amounts of data are posted to social media every second. This is untapped potential for recognizing patterns that we did not know existed, with real-time applications to crisis response. For example, video surveillance is a ubiquitous task done every day. Companies may employ individuals who can easily monitor a small number of cameras on a corporate building.
On occasion, these cameras have offered useful information about developing crises. Today, hundreds of cameras are deployed, from drones to cell phones, creating more information than a human can feasibly monitor. With the help of computers using Machine Learning (ML) and Deep Learning (DL) methods, we are now able to identify and predict human behaviors through cameras (Ellis et al. 2009). This work aims to extend these ML techniques to better filter social media data, using data collected around the time of actual shooting events and comparing methods for accuracy.

A great deal of data is generated every day on social media. This information is regularly used for marketing purposes, but it has the potential to serve other purposes, such as crisis management. This study contributes to a larger body of work that focuses on collecting actionable data from social media, namely Twitter, to help 911 telecommunicators (floor supervisors, call takers, and dispatchers) to: identify Twitter users requesting assistance during a crisis; identify information that may be related to incidents reported to 911; and pass the information to first responders (police, fire, and emergency medical services). This work makes that contribution by comparing automated methods that can support the work of identifying information related to a particular kind of incident.

Previous work refining automated methods for social media filtering has included identifiers for picking relevant tweets, such as hashtags and crisis-related keywords (Herndon and Caragea, 2016; Li et al. 2018). Other methods were developed to filter out bots and non-relevant information (Wang et al. 2014; Varol et al. 2017). A massive amount of data about crises is generated by news media and outside observers (Olteanu et al. 2015; Starbird et al. 2010).