Automatic Detection of Discrimination Actions from Social Images
Total Page:16
File Type:pdf, Size:1020Kb
electronics Article Automatic Detection of Discrimination Actions from Social Images Zhihao Wu 1 , Baopeng Zhang 1,*, Tianchen Zhou 1, Yan Li 1 and Jianping Fan 2 1 School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China; [email protected] (Z.W.); [email protected] (T.Z.); [email protected] (Y.L.) 2 AI Lab, Lenovo Research, Beijing 100094, China; [email protected] * Correspondence: [email protected] Abstract: In this paper, we developed a practical approach for automatic detection of discrimination actions from social images. Firstly, an image set is established, in which various discrimination actions and relations are manually labeled. To the best of our knowledge, this is the first work to create a dataset for discrimination action recognition and relationship identification. Secondly, a practical approach is developed to achieve automatic detection and identification of discrimination actions and relationships from social images. Thirdly, the task of relationship identification is seamlessly integrated with the task of discrimination action recognition into one single network called the Co- operative Visual Translation Embedding++ network (CVTransE++). We also compared our proposed method with numerous state-of-the-art methods, and our experimental results demonstrated that our proposed methods can significantly outperform state-of-the-art approaches. Keywords: discrimination action recognition; relationship prediction; representation learning Citation: Wu, Z.; Zhang, B.; Zhou, T.; 1. Introduction Li, Y.; Fan, J. Automatic Detection of In line with the popularity of smart devices, most social platforms, such as Facebook Discrimination Actions from Social and Twitter, have incorporated image sharing into their important functions. Many people, Images. Electronics 2021, 10, 325. particularly young users of social networks, may unintentionally share their discriminatory https://doi.org/10.3390/ images (i.e., social images with discriminatory actions or attitudes) with the public, and electronics10030325 may not be aware of the potential impact on their future lives and negative effects on others or individual groups. Unfortunately, social images and their tags, and text-based Academic Editor: Byung Cheol Song discussions, may frequently reveal and spread discriminatory attitudes (such as race Received: 27 December 2020 discrimination, age discrimination, gender discrimination, etc.). Control over the sharing Accepted: 26 January 2021 Published: 30 January 2021 of these images and discussions has become one of the most significant challenges on social networks. Numerous application domains leverage artificial intelligence to improve Publisher’s Note: MDPI stays neutral application utility and automation level, such as communication systems [1], production with regard to jurisdictional claims in systems [2,3] and human–machine interaction systems [4,5]. Many social network sites published maps and institutional affil- may potentially abuse the technologies of artificial intelligence and computer vision in iations. automatically tagging human faces [6,7], recognizing users’ activity and sharing their attitudes with others [8], identifying their associates, and, in particular, sorting people into various social categories. However, most social platforms have some images that contain discriminatory information, intentionally or unintentionally, and these discriminatory images can significantly affect people’s normal life [9]. Therefore, to avoid the problem Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. of users widely revealing their data to the public, discriminatory image recognition is This article is an open access article an attractive approach for preventing direct image discrimination during social sharing. distributed under the terms and In particular, capturing various types of image discrimination and their most relevant conditions of the Creative Commons discrimination-sensitive object classes and attributes, such as discriminative gestures, is Attribution (CC BY) license (https:// an enabling technology for avoiding discrimination propagation. However, there are no creativecommons.org/licenses/by/ relevant research works on discriminatory image recognition tasks, including necessary 4.0/). datasets and related algorithms. Electronics 2021, 10, 325. https://doi.org/10.3390/electronics10030325 https://www.mdpi.com/journal/electronics Electronics 2021, 10, x FOR PEER REVIEW 2 of 19 Electronics 2021, 10, 325 2 of 19 relevant research works on discriminatory image recognition tasks, including necessary datasets and related algorithms. TheThe discrimination-sensitive discrimination-sensitive object object classes classes and and attributes attributes appearing appearing in in social social images images cancan be be used used to to effectively effectively characterize characterize their their sensitivity sensitivity (discrimination). (discrimination). Thus, Thus, supporting supporting thethe understanding understanding of of deep deep images images plays plays an an important important role role in in achieving achieving more more accurate accurate characterizationcharacterization and and identification identification ofof discriminatorydiscriminatory attitudes in in social social images images (i.e., (i.e., image im- agediscrimination). discrimination). In addition In addition to todiscrimination discrimination gestures, gestures, emotional emotional interaction interaction exists exists be- betweentween persons, persons, that that is, is, the discriminatory relationshiprelationship hashas aa subjectsubject and and an an object. object. The The semanticssemantics of of the the comparison comparison of of facial facial expressions expressions between between the the subject subject and and object object may may strengthenstrengthen the the judgement judgement of of the the discriminatory discriminatory relationship. relationship. We We regard regard discriminatory discriminatory detectiondetection as as a classificationa classification problem, problem, but abut person’s a person’s gestures gestures and interactions and interactions are not suitedare not to the use of the classical image processing method [10]. It can be seen from the above suited to the use of the classical image processing method [10]. It can be seen from the analysis that discriminatory image recognition can adopt both the relational model and the above analysis that discriminatory image recognition can adopt both the relational model traditional action recognition model. Therefore, a recognition scheme combining the action and the traditional action recognition model. Therefore, a recognition scheme combining and the relationship is proposed in this paper. We optimized the existing relationship the action and the relationship is proposed in this paper. We optimized the existing rela- model by exploiting semantic transformation and incorporated the action model to propose tionship model by exploiting semantic transformation and incorporated the action model the Co-operative Visual Translation Embedding network++(CVTransE++). to propose the Co-operative Visual Translation Embedding network++(CVTransE++). According to our investigation, no dataset on discriminatory action has been compiled According to our investigation, no dataset on discriminatory action has been com- to date. We propose a data set named the Discriminative Action Relational Image dataset piled to date. We propose a data set named the Discriminative Action Relational Image (DARIdata), which contains a total of 12,287 images, including six scenes. To make the dataset (DARIdata), which contains a total of 12,287 images, including six scenes. To make dataset applicable to both the relational model and the action model, we defined the persons the dataset applicable to both the relational model and the action model, we defined the and the relationships according to the discriminatory actions, 11 person characteristic categories,persons and and the 17 relationships relationship categories. according to the discriminatory actions, 11 person charac- teristicIn this categories, study, a and discriminatory 17 relationship image categories. dataset was constructed, and a network model was builtIn this onthe study, dataset a discriminatory to achieve discriminatory image dataset image wasrecognition constructed, (as and shown a network in Figure model1 ). Towas summarize, built on the the dataset main contributions to achieve discrimina are three-fold.tory image recognition (as shown in Figure 1). To summarize, the main contributions are three-fold. Figure 1. With a discriminatory or friendly image as input, the model outputs the action category Figure 1. With a discriminatory or friendly image as input, the model outputs the action category of persons in the image and the relationship category of the persons. - indicates be discriminated, + of persons in the image and the relationship category of the persons. − indicates be discriminated, indicates discriminatory. + indicates discriminatory. First, we propose a fully annotated dataset (DARIdata) to assist with the study of dis- First, we propose a fully annotated dataset (DARIdata) to assist with the study of criminatory action recognition and discriminatory relation prediction. To the best of our dis-criminatory action recognition and discriminatory relation prediction. To the best of our knowledge, this is the first benchmark dataset used for discriminatory emotional relationships.