Metadata-Based Detection of Child Sexual Abuse Material
Mayana Pereira, Rahul Dodhia, and Richard Brown

arXiv:2010.02387v1 [cs.LG] 5 Oct 2020

Abstract—In the last decade, the scale of creation and distribution of child sexual abuse media (CSAM) has increased exponentially. Technologies that help law enforcement agencies worldwide identify such crimes rapidly can potentially mitigate child victimization and aid the apprehension of offenders. Machine learning offers the potential to help law enforcement rapidly identify such material, and even to block such content from being distributed digitally. However, collecting and storing CSAM files to train machine learning models is subject to many ethical and legal constraints, creating a barrier to the development of accurate computer vision-based models. With such restrictions in place, the development of accurate machine learning classifiers for CSAM identification based on file metadata becomes crucial. In this work, we propose a system for CSAM identification on file storage systems based solely on metadata: file paths. Our aim is to provide a tool that is material-type agnostic (image, video, PDF) and can potentially scan thousands of file storage systems in a short time. Our approach uses convolutional neural networks, and achieves an accuracy of 97% and a recall of 94%. Additionally, we address the potential problem of offenders trying to evade detection by evaluating the robustness of our model against adversarial modifications of file paths.

Index Terms—child sexual abuse media; child exploitation; CSAM; file path; file storage systems; cloud storage; machine learning; neural networks.

I. INTRODUCTION

International law enforcement agencies handle millions of child sexual abuse cases annually. Hotlines worldwide received and reviewed over 37 million child sexual abuse files in 2017 alone [5]. Despite the 2008 PROTECT Our Children Act [20], the volume of CSAM on digital platforms has grown dramatically in the last decade [30]. Online sharing platforms have facilitated [24] the explosive growth of CSAM creation and distribution [6]. Every platform for content searching and sharing, including social media, likely has CSAM on it [15]. This issue affects society in larger numbers than expected: recent studies report alarming statistics, with 25% of girls and 17% of boys in the U.S. experiencing some form of sexual abuse before the age of 18 [27].

As the scale of the problem grew, technology became essential to CSAM identification. Technology giants such as Microsoft, Apple, Facebook, and Google have made detection and removal of CSAM one of their top concerns, and work with non-profits such as Project VIC International¹, Thorn², and the Internet Watch Foundation³ to create tools to combat CSAM proliferation.

With the COVID-19 pandemic, experts have observed that the distribution of CSAM in social media and video conference apps has significantly increased [25]. Distributors trade links to CSAM hosted in plain sight on content sharing platforms such as Facebook, YouTube, Twitter, and Instagram, using coded language to evade current detection tools [6], [25].

A. Preventing Distribution of Undiscovered Material

The identification of CSAM is an extremely challenging problem. It starts with the fact that CSAM can manifest in different types of material: images, videos, streaming, video conferences, online gaming, among others. CSAM still undiscovered and unlabeled on the internet is potentially several orders of magnitude greater in volume than previously identified CSAM. New material is created daily, and digital platforms have a large role to play in its detection and removal.

Multiple approaches to the CSAM identification problem using computer vision-based methods have appeared in the literature in recent years [28], [16], [32]. Although these approaches appear promising, they lack evaluation on large, real-world, unbalanced data sets and on data sets with ethnic diversity, as well as evaluation of time efficiency and scalability. To keep up with all new forms of CSAM content distribution [6], all of these factors must be carefully evaluated and taken into consideration.

B. Our Contributions

Given the complexity of the problem of detecting CSAM, it is essential to analyze different aspects of CSAM distribution. This paper proposes statistical models that compute the likelihood that a file path is associated with a CSAM file. An advantage of this approach is that it does not require possession of the raw CSA image or video for training machine learning models. Specifically, we accomplish the following tasks:
• Train and compare several machine-learning based models that analyze metadata from file storage systems and determine a probability that a given file contains child sexual abuse content.
• Analyze the robustness of our models against commonly used techniques for evading detection by text-based machine learning classifiers. We study the resilience of our models against black-box attackers (attackers that have no knowledge of the models and their parameters) and against attackers that know which words have a high probability of appearing in CSAM file paths.
• Train our models on a real-world data set containing over one million file paths, the largest file path data set ever used for CSAM detection to date. Our classifiers achieve recall rates over 94% and accuracy over 97%.
Our results demonstrate, for the first time, the practicality of using file paths for CSAM detection in storage systems.

M. Pereira is with Microsoft Corporation, AI for Good Research Lab, Redmond, WA, 98052, and also with the Universidade de Brasília, Campus Darcy Ribeiro, Brasília, Brazil (e-mail: [email protected]).
R. Dodhia is with Microsoft Corporation, AI for Good Research Lab, Redmond, WA, 98052.
R. Brown is with Project VIC International, Neptune City, NJ, 07753.
¹https://www.projectvic.org
²https://www.thorn.org
³https://www.iwf.org.uk

II. RELATED WORK

A. PhotoDNA Hash

Identification of CSAM via statistical algorithms is a fairly recent approach. At the turn of the century, laws targeting online exploitation were introduced and refined (COPA in the US, the Crime and Disorder Act in the UK) [7], but the first widely used methodology was released in 2008. Known widely as PhotoDNA [17], the technology was developed by Dr. Hany Farid and Microsoft.

In PhotoDNA, an image is converted to a long string of characters by a fuzzy hash algorithm, and this hash can then be compared against other hashes to find identical or similar images without anyone having to view the images, which would compromise the victim's identity. This system is still one of the most widely used methods for detecting images worldwide: databases of hashed CSAM images are used to identify identical or similar images on many digital platforms, including search engines, social networks, and file storage platforms. The advantage of the PhotoDNA hash is its low false-positive detection rate, but it has some clear drawbacks. It only works on known CSAM that has gone through its hashing algorithm, and identification of new CSAM remains primarily a manual labeling endeavor. Labelers have to be trained, and wellness programs have to be developed to counter the psychological impact of viewing these images.

C. Machine Learning Based on Complementary Signals and Metadata

While significant efforts have focused on the images themselves, some researchers have looked for complementary signals and metadata that could help in CSAM identification: for example, queries that bring up CSAM in search engines, or conversations that imply grooming or trade in CSAM [26]. Other efforts have used textual signals to identify where CSAM might be located, such as keywords related to website content [29], NLP analysis [23], [21], [2], and conversations [4]. Our work falls into this category.

D. File Path Classification

To the best of our knowledge, only a single prior paper describes an attempt to identify CSAM based solely on file paths [2]. While pioneering, that work was based on a small data set containing only 9,000 CSAM file paths, and its best model achieves recall rates lower than 80% on CSAM file paths. A high recall rate is critical for this application. Additionally, the work in [2] does not address the question of classifier robustness against noise introduced at test time (adversarial attacks).

E. Robustness Against Adversarial Attacks

Whenever a machine learning classifier is used for detecting malicious behavior or content, it is important to consider the possibility that inputs to the model have been modified in an adversarial way. Adversarial inputs to a classifier are inputs that have been intentionally modified to make detection of malicious activity harder. For example, an adversary can flip characters of the words that most likely carry signal for the classification task, e.g., writing Lo7ita instead of Lolita.

Our proposed techniques are based on classification of file paths, so it is important to study the effect of adversarial inputs on classifier performance. Such questions are not new in natural language processing, and the effect of random modifications to a classifier's inputs has been studied previously in the context of text classification [1], machine translation [3], and word embeddings [11].
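Character substitutions of this kind (e.g., Lo7ita for Lolita) can be simulated with a simple perturbation function when generating adversarial test inputs. The substitution map and function names below are illustrative assumptions, not the attack procedure evaluated in this paper:

```python
import random

# Illustrative look-alike substitutions (an assumption, not this paper's attack set)
SUBSTITUTIONS = {"o": "0", "i": "1", "l": "1", "e": "3", "a": "4", "t": "7"}

def perturb_token(token, rate, rng):
    """Swap characters for look-alike digits, each with probability `rate`."""
    out = []
    for ch in token:
        key = ch.lower()
        if key in SUBSTITUTIONS and rng.random() < rate:
            out.append(SUBSTITUTIONS[key])
        else:
            out.append(ch)
    return "".join(out)

def perturb_path(path, rate=0.3, seed=0):
    """Perturb every component of a file path, e.g., for robustness testing."""
    rng = random.Random(seed)
    return "/".join(perturb_token(part, rate, rng) for part in path.split("/"))

# With rate=1.0 every mapped character is replaced:
# perturb_path("media/lolita", rate=1.0) -> "m3d14/101174"
```

Sweeping `rate` from 0 to 1 gives a simple way to measure how classifier recall degrades as more characters are perturbed.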
In this paper, we study the robustness of our model in two different situations: one where the adversary has no knowledge of the model's parameters [8], or even of the score output generated at test time [12], which we call a black-box adversary; and one where the adversary has some knowledge of the models and uses it to target the regions of the file path that most likely contain signals for classification.

B. Machine Learning for Image Identification

Since PhotoDNA's first development, computer vision models have undergone a revolution, resulting in novel machine learning based models for pornography and CSAM detection
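PhotoDNA itself is proprietary, but the general hash-and-compare workflow behind it (Section II-A) can be sketched with a toy average hash and a Hamming-distance comparison. Everything below is an illustrative stand-in, not PhotoDNA's actual algorithm:

```python
import numpy as np

def average_hash(gray: np.ndarray) -> np.ndarray:
    """Toy perceptual hash: threshold an 8x8 downsample at its mean."""
    h, w = gray.shape
    # Downsample by block-averaging into an 8x8 grid, then binarize.
    small = gray[: h - h % 8, : w - w % 8].reshape(8, h // 8, 8, w // 8).mean(axis=(1, 3))
    return (small > small.mean()).astype(np.uint8).ravel()  # 64-bit signature

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two hashes (0 = likely identical)."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(1)
img = rng.random((64, 64))  # stand-in for a grayscale image
noisy = np.clip(img + rng.normal(scale=0.01, size=img.shape), 0.0, 1.0)
# Small perturbations move the hash only slightly, unlike a cryptographic hash,
# so near-duplicates can be matched by thresholding the Hamming distance.
d = hamming(average_hash(img), average_hash(noisy))
```

This fuzziness is what lets a hash database match re-encoded or lightly edited copies, while still never requiring a human to view the matched content.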
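The file-path classifier proposed in this paper is a convolutional network over path text. As a rough illustration of how a character-level convolution turns a path into a score, here is a minimal NumPy forward pass with untrained, randomly initialized weights; the alphabet, dimensions, and single-layer architecture are assumptions for illustration and do not reproduce the paper's model:

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789/._- "  # assumed vocabulary
MAX_LEN = 64  # assumed fixed input length

def encode_path(path: str) -> np.ndarray:
    """One-hot encode a lowercased file path, padded/truncated to MAX_LEN."""
    x = np.zeros((MAX_LEN, len(ALPHABET)))
    for i, ch in enumerate(path.lower()[:MAX_LEN]):
        j = ALPHABET.find(ch)
        if j >= 0:
            x[i, j] = 1.0
    return x

def conv1d_maxpool(x: np.ndarray, filters: np.ndarray) -> np.ndarray:
    """Valid 1-D convolution over the character axis, then global max-pool."""
    width = filters.shape[1]
    n_pos = x.shape[0] - width + 1
    # windows: (positions, width, alphabet); filters: (n_filters, width, alphabet)
    windows = np.stack([x[p:p + width] for p in range(n_pos)])
    feature_map = np.einsum("pwa,fwa->pf", windows, filters)
    return feature_map.max(axis=0)  # one activation per filter

def predict(path: str, filters: np.ndarray, w: np.ndarray, b: float) -> float:
    """Probability that a path is flagged (sigmoid over pooled features)."""
    z = conv1d_maxpool(encode_path(path), filters) @ w + b
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
filters = rng.normal(size=(8, 3, len(ALPHABET)))  # 8 untrained width-3 filters
score = predict("users/alice/photos/img001.jpg", filters, rng.normal(size=8), 0.0)
```

In a trained model the filters learn character n-gram patterns (including obfuscated spellings), which is what makes this architecture a natural fit for the short, noisy text found in file paths.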