Deep Object Co-Segmentation Via Spatial-Semantic Network Modulation

Total Page:16

File Type:pdf, Size:1020Kb

Deep Object Co-Segmentation Via Spatial-Semantic Network Modulation The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20) Deep Object Co-Segmentation via Spatial-Semantic Network Modulation Kaihua Zhang,1 Jin Chen,1 Bo Liu,2∗ Qingshan Liu1 1B-DAT and CICAEET, Nanjing University of Information Science and Technology, Nanjing, China 2JD Finance America Corporation {zhkhua, kfliubo}@gmail.com Abstract Object co-segmentation is to segment the shared objects in (a) multiple relevant images, which has numerous applications in computer vision. This paper presents a spatial and se- mantic modulated deep network framework for object co- segmentation. A backbone network is adopted to extract multi-resolution image features. With the multi-resolution (b) features of the relevant images as input, we design a spa- tial modulator to learn a mask for each image. The spatial modulator captures the correlations of image feature descrip- tors via unsupervised learning. The learned mask can roughly localize the shared foreground object while suppressing the Figure 1: Object co-segmentation examples by our ap- background. For the semantic modulator, we model it as a proach. (a) Horse group; (b) Horse group co-segmentation supervised image classification task. We propose a hierar- maps. chical second-order pooling module to transform the image features for classification use. The outputs of the two mod- ulators manipulate the multi-resolution features by a shift- features cannot well handle the challenging cases in co- and-scale operation so that the features focus on segmenting segmentation such as background clutter and large-scale co-object regions. The proposed model is trained end-to-end appearance variations of the co-objects in images. In re- without any intricate post-processing. Extensive experiments cent years, deep-learning-based co-segmentation methods on four image co-segmentation benchmark datasets demon- strate the superior accuracy of the proposed method com- have attracted much attention. For example, (Li et al. 2018; pared to state-of-the-art methods. The codes are available at Chen et al. 2018) leverage a Siamese network architecture http://kaihuazhang.net/. for object co-segmentation and an attention mechanism is used to enhance the co-object feature representations. These methods have shown superior performance compared to 1 Introduction the traditional methods (Yuan, Lu, and Shivakumara 2014; As a special case of image object segmentation, object co- Dai et al. 2013; Lee et al. 2015), which inspire us to explore segmentation refers to the task of jointly discovering and a deep-learning-based solution to object co-segmentation. segmenting the objects shared in a group of images. It has One critical property of object co-segmentation is that the been widely used to support various computer vision appli- co-objects in images belong to the same semantic category. cations, such as interactive image segmentation (Kamranian Those co-objects usually occupy part of each image. One il- et al. 2018), 3D reconstruction (Mustafa and Hilton 2017) lustrative example is shown in Figure 1. It is desirable that and object co-localization (Wei et al. 2019; Han et al. 2018), the deep convolutional network layers, being used as a fea- to name a few. ture extractor, are targeted on modelling the co-objects. To Image features that characterize the co-objects in the im- this end, we propose a spatial-semantic modulated network age group are vital for a co-segmentation task. Conven- structure to model this property. The two modulators are tional approaches use the hand-crafted cues such as color achieved by the designed group-wise mask learning branch histograms, Gabor filter outputs and SIFT descriptors as and co-category classification branch, respectively. We sum- feature representations (Yuan, Lu, and Shivakumara 2014; marize the technical contributions of this work as follows: Dai et al. 2013; Lee et al. 2015). Those hand-crafted • We propose a spatial-semantic modulated deep network ∗Corresponding author for object co-segmentation. Image features extracted by a Copyright c 2020, Association for the Advancement of Artificial backbone network are used to learn a spatial modulator Intelligence (www.aaai.org). All rights reserved. and a semantic modulator. The outputs of the modulators 12813 guide the image features up-sampling to generate the co- by each other, and benefit each other synergistically. The segmentation results. The network parameter learning is idea of joint processing can exploit the inherent interdepen- formulated into a multi-task learning task, and the whole dencies of two tasks to achieve better results jointly. Re- network is trained in an end-to-end manner. cently, (Li et al. 2018; Chen et al. 2018) respectively pro- • For the spatial modulation branch, an unsupervised learn- pose an end-to-end deep-learning-based method for object ing method is proposed to learn a mask for each image. co-segmentation using a Siamese encoder-decoder architec- With the fused multi-resolution image features as input, ture and a semantic-aware attention mechanism. we formulate the mask learning as an integer program- 2.2 Network Modulation ming problem. Its continuous relaxation has a closed- form solution. The learned parameter indicates whether Modulation module has been proved to be an effective way the corresponding image pixel corresponds to foreground to manipulate network parameter learning. The modulator or background. can be modelled as parameters or output of the auxiliary branch that are used to guide the main branch parameter • In the semantic modulation branch, we design a hierar- learning. In the segmentation method (Dai, He, and Sun chical second-order pooling (HSP) operator to transform 2015), an image mask is used as a modulator for background the convolutional features for object classification. Spatial removal. In the Mask R-CNN model (He et al. 2017), a clas- pooling (SP) is shown to be able to capture the high-order sification branch is used to guide the segmentation branch feature statistical dependency (Gao et al. 2019). The pro- learning. Feature-wise linear modulation is a widely-used posed HSP module has a stack of two SP layers. They are scheme, which has been applied to object detection (Lin et dedicated to capturing the long-range channel-wise de- al. 2017) and graph neural networks learning (Brockschmidt pendency of the holistic feature representation. The out- 2019). In visual reasoning problem, network modulation is put of the HSP layer is fed into a fully-connected layer for used to encode the language information (De Vries et al. object classification and used as the semantic modulator. 2017; Perez et al. 2018). The attention module in the image We conduct extensive evaluations on four object co- caption model (Chen et al. 2017) can be viewed as a mod- segmentation benchmark datasets (Faktor and Irani 2013; ulator. (Yang et al. 2018) proposes to model the visual and Rubinstein et al. 2013), including the sub-set of MSRC, In- spatial information by a modulator for video object segmen- ternet, the sub-set of iCoseg and PASCAL-VOC datasets. tation. In (Flores et al. 2019), a saliency detection branch is The proposed model achieves a significantly higher accuracy added to an existing CNN architecture as a modulator for than state-of-the-art methods. Especially, on the most chal- fine-grained object recognition. A cross-modulation mecha- lenging PASCAL-VOC dataset, our method outperforms the nism is proposed in (Prol, Dumoulin, and Herranz 2018) for second best-performing state-of-the-art approach (Li et al. few-shot learning. 2018) by 6% in terms of average Jaccard index J . The rest of this work is organized as follows: In § 2, 3 Proposed Approach we introduce the related works of our study. § 3 describes 3.1 Problem Formulation the proposed framework and its main components. Af- Figure 2 presents an overview of our model. Given a group terwards, comprehensive experimental evaluations are pre- I { n}N of N images = I n=1 containing co-objects of a spe- sented in § 4. Finally, we conclude this work in § 5. cific category, our objective is to learn a feed-forward net- work f that produces a set of object co-segmentation masks M {Mn}N 2 Related Work = n=1: M I θ 2.1 Object Co-segmentation = f( ; ), (1) where θ denotes the network parameters to be optimized. A more comprehensive literature review about image co- The network f is composed of three sub-networks: spa- segmentation can be found in (Zhu et al. 2016). Existing tial modulation sub-net f , semantic modulation sub-net object co-segmentation methods can be roughly grouped spa fsem and segmentation sub-net fseg. The renowned SPP- into four categories including graph-based model, saliency- Net (He et al. 2015) has shown that the convolutional fea- based model, joint processing model and deep learning ture maps (CFMs) for object recognition encode both spa- model. Conventional approaches such as (Yuan, Lu, and tial layouts of objects (by their positions) and the semantics Shivakumara 2014; Collins et al. 2012; Lee et al. 2015) (by strengths of their activations). Inspired by this model, assume the pixels or superpixels in the co-objects can be we design fspa and fsem to encode the spatial and seman- grouped together and then they formulate co-segmentation tic information of the co-objects in I, respectively. The two as a clustering task to search for the co-objects. Saliency- modulators guide the convolution layers learning in fseg to detection-based methods assume regions of interest in the focus on the co-objects in the images. Specifically, the sub- images are usually the co-objects to be segmented. They net f is to learn a mask for each image to coarsely local- conduct image co-segmentation through detecting the re- spa ize the co-object in it.
Recommended publications
  • Mimicking Very Efficient Network for Object Detection
    Mimicking Very Efficient Network for Object Detection Quanquan Li1, Shengying Jin2, Junjie Yan1 1SenseTime 2Beihang University [email protected], [email protected], [email protected] Abstract Method MR−2 Parameters test time (ms) Inception R-FCN 7.15 2.5M 53.5 Current CNN based object detectors need initialization 1/2-Inception 7.31 625K 22.8 from pre-trained ImageNet classification models, which are Mimic R-FCN usually time-consuming. In this paper, we present a fully 1/2-Inception finetuned 8.88 625K 22.8 convolutional feature mimic framework to train very effi- from ImageNet cient CNN based detectors, which do not need ImageNet pre-training and achieve competitive performance as the Table 1: The parameters and test time of large and small mod- els. Tested on TITANX with 1000×1500 input. The 1/2-Inception large and slow models. We add supervision from high-level model trained by mimicking outperforms that fine-tuned from Im- features of the large networks in training to help the small ageNet pre-trained model. Moreover, it obtains similar perfor- network better learn object representation. More specifi- mance as the large Inception model with only 1/4 parameters and cally, we conduct a mimic method for the features sampled achieves a 2.5× speed-up. from the entire feature map and use a transform layer to map features from the small network onto the same dimen- sion of the large network. In training the small network, we AlexNet gets 56.8% AP. Due to this phenomenon, nearly optimize the similarity between features sampled from the all the modern detection methods can only train networks same region on the feature maps of both networks.
    [Show full text]
  • Detection-Based Object Tracking Applied to Remote Ship Inspection
    sensors Article Detection-Based Object Tracking Applied to Remote Ship Inspection Jing Xie 1,* , Erik Stensrud 1 and Torbjørn Skramstad 2 1 Group Technology and Research, DNV GL, Veritasveien 1, 1363 Høvik, Norway; [email protected] 2 Department of Computer Science, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway; [email protected] * Correspondence: [email protected] Abstract: We propose a detection-based tracking system for automatically processing maritime ship inspection videos and predicting suspicious areas where cracks may exist. This system consists of two stages. Stage one uses a state-of-the-art object detection model, i.e., RetinaNet, which is customized with certain modifications and the optimal anchor setting for detecting cracks in the ship inspection images/videos. Stage two is an enhanced tracking system including two key components. The first component is a state-of-the-art tracker, namely, Channel and Spatial Reliability Tracker (CSRT), with improvements to handle model drift in a simple manner. The second component is a tailored data association algorithm which creates tracking trajectories for the cracks being tracked. This algorithm is based on not only the intersection over union (IoU) of the detections and tracking updates but also their respective areas when associating detections to the existing trackers. Consequently, the tracking results compensate for the detection jitters which could lead to both tracking jitter and creation of redundant trackers. Our study shows that the proposed detection-based tracking system has achieved a reasonable performance on automatically analyzing ship inspection videos. It has proven the feasibility of applying deep neural network based computer vision technologies to automating remote ship inspection.
    [Show full text]
  • Machine Learning for Blob Detection in High-Resolution 3D Microscopy Images
    DEGREE PROJECT IN COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS STOCKHOLM, SWEDEN 2018 Machine learning for blob detection in high-resolution 3D microscopy images MARTIN TER HAAK KTH ROYAL INSTITUTE OF TECHNOLOGY SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE Machine learning for blob detection in high-resolution 3D microscopy images MARTIN TER HAAK EIT Digital Data Science Date: June 6, 2018 Supervisor: Vladimir Vlassov Examiner: Anne Håkansson Electrical Engineering and Computer Science (EECS) iii Abstract The aim of blob detection is to find regions in a digital image that dif- fer from their surroundings with respect to properties like intensity or shape. Bio-image analysis is a common application where blobs can denote regions of interest that have been stained with a fluorescent dye. In image-based in situ sequencing for ribonucleic acid (RNA) for exam- ple, the blobs are local intensity maxima (i.e. bright spots) correspond- ing to the locations of specific RNA nucleobases in cells. Traditional methods of blob detection rely on simple image processing steps that must be guided by the user. The problem is that the user must seek the optimal parameters for each step which are often specific to that image and cannot be generalised to other images. Moreover, some of the existing tools are not suitable for the scale of the microscopy images that are often in very high resolution and 3D. Machine learning (ML) is a collection of techniques that give computers the ability to ”learn” from data. To eliminate the dependence on user parameters, the idea is applying ML to learn the definition of a blob from labelled images.
    [Show full text]
  • Object Detection in 20 Years: a Survey
    1 Object Detection in 20 Years: A Survey Zhengxia Zou, Zhenwei Shi, Member, IEEE, Yuhong Guo, and Jieping Ye, Senior Member, IEEE Abstract—Object detection, as of one the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today’s object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of cold weapon era. This paper extensively reviews 400+ papers of object detection in the light of its technical evolution, spanning over a quarter-century’s time (from the 1990s to 2019). A number of topics have been covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed up techniques, and the recent state of the art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, text detection, etc, and makes an in-deep analysis of their challenges as well as technical improvements in recent years. Index Terms—Object detection, Computer vision, Deep learning, Convolutional neural networks, Technical evolution. F 1 INTRODUCTION BJECT detection is an important computer vision task O that deals with detecting instances of visual objects of a certain class (such as humans, animals, or cars) in digital images. The objective of object detection is to develop computational models and techniques that provide one of the most basic pieces of information needed by computer vision applications: What objects are where? As one of the fundamental problems of computer vision, object detection forms the basis of many other computer vision tasks, such as instance segmentation [1–4], image captioning [5–7], object tracking [8], etc.
    [Show full text]
  • Cyclesegnet: Object Co-Segmentation with Cycle Refinement and Region Correspondence
    1 CycleSegNet: Object Co-Segmentation with Cycle Refinement and Region Correspondence Chi Zhang∗, Guankai Li∗, Guosheng Lin, Qingyao Wu, Rui Yao Abstract—Image co-segmentation is an active computer vision task that aims to segment the common objects from a set of images. Recently, researchers design various learning-based algorithms to undertake the co-segmentation task. The main difficulty in this task is how to effectively transfer information between images to make conditional predictions. In this paper, we present CycleSegNet, a novel framework for the co-segmentation task. Our network design has two key components: a region correspondence module which is the basic operation for exchanging information between local image regions, and a cycle refinement module, which utilizes ConvLSTMs to progressively update image representations and exchange information in a cycle and iterative manner. Extensive experiments demonstrate that our proposed method significantly outperforms the state-of-the-art methods on four popular benchmark datasets — PASCAL VOC dataset, MSRC dataset, Internet dataset, and iCoseg dataset, by 2.6%, 7.7%, 2.2%, and 2.9%, respectively. Index Terms—deep learning, co-segmentation, cycle refinement, attention F 1 INTRODUCTION Image co-segmentation is an active computer vision topic with a long research history, which aims to segment the common objects jointly from a set of images. Image co- segmentation algorithms have shown their usages in var- ious computer vision tasks, such as image retrieval [47], 3D reconstruction [37], photo collections [41], image match- ing [6], [58], [59], and video object tracking [32], [33], [41]. Recently, data-driven deep neural networks based meth- ods attract wide interest in the literature.
    [Show full text]
  • Towards Efficient Video Detection Object Super-Resolution with Deep Fusion Network for Public Safety
    Hindawi Security and Communication Networks Volume 2021, Article ID 9999398, 14 pages https://doi.org/10.1155/2021/9999398 Research Article Towards Efficient Video Detection Object Super-Resolution with Deep Fusion Network for Public Safety Sheng Ren , Jianqi Li , Tianyi Tu , Yibo Peng , and Jian Jiang School of Computer and Electrical Engineering, Hunan University of Arts and Science, Changde 415000, China Correspondence should be addressed to Jianqi Li; [email protected] Received 22 March 2021; Revised 14 April 2021; Accepted 14 May 2021; Published 24 May 2021 Academic Editor: David Meg´ıas Copyright © 2021 Sheng Ren et al. *is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Video surveillance plays an increasingly important role in public security and is a technical foundation for constructing safe and smart cities. *e traditional video surveillance systems can only provide real-time monitoring or manually analyze cases by reviewing the surveillance video. So, it is difficult to use the data sampled from the surveillance video effectively. In this paper, we proposed an efficient video detection object super-resolution with a deep fusion network for public security. Firstly, we designed a super-resolution framework for video detection objects. By fusing object detection algorithms, video keyframe selection al- gorithms, and super-resolution reconstruction algorithms, we proposed a deep learning-based intelligent video detection object super-resolution (SR) method. Secondly, we designed a regression-based object detection algorithm and a key video frame selection algorithm.
    [Show full text]
  • Deepco3: Deep Instance Co-Segmentation by Co-Peak Search and Co-Saliency Detection
    DeepCO3: Deep Instance Co-segmentation by Co-peak Search and Co-saliency Detection Kuang-Jui Hsu1,2 Yen-Yu Lin1 Yung-Yu Chuang1,2 1Academia Sinica, Taiwan 2National Taiwan University, Taiwan Abstract In this paper, we address a new task called instance co- segmentation. Given a set of images jointly covering object instances of a specific category, instance co-segmentation aims to identify all of these instances and segment each of them, i.e. generating one mask for each instance. This task is important since instance-level segmentation is preferable for humans and many vision applications. It is also chal- lenging because no pixel-wise annotated training data are available and the number of instances in each image is un- known. We solve this task by dividing it into two sub-tasks, co-peak search and instance mask segmentation. In the for- mer sub-task, we develop a CNN-based network to detect Figure 1. Two examples of instance co-segmentation on categories bird and sheep, respectively. An instance here refers to an object the co-peaks as well as co-saliency maps for a pair of im- appearing in an image. In each example, the top row gives the ages. A co-peak has two endpoints, one in each image, that input images while the bottom row shows the instances segmented are local maxima in the response maps and similar to each by our method. The instance-specific coloring indicates that our other. Thereby, the two endpoints are potentially covered by method produces a segmentation mask for each instance.
    [Show full text]
  • A Survey of Deep Learning-Based Object Detection
    1 A Survey of Deep Learning-based Object Detection Licheng Jiao, Fellow, IEEE, Fan Zhang, Fang Liu, Senior Member, IEEE, Shuyuan Yang, Senior Member, IEEE, Lingling Li, Member, IEEE, Zhixi Feng, Member, IEEE, and Rong Qu, Senior Member, IEEE Abstract—Object detection is one of the most important and understanding, object detection has been widely used in many challenging branches of computer vision, which has been widely fields of modern life, such as security field, military field, applied in peoples life, such as monitoring security, autonomous transportation field, medical field and life field. Furthermore, driving and so on, with the purpose of locating instances of semantic objects of a certain class. With the rapid development many benchmarks have played an important role in object of deep learning networks for detection tasks, the performance detection field so far, such as Caltech [1], KITTI [2], ImageNet of object detectors has been greatly improved. In order to [3], PASCAL VOC [4], MS COCO [5], and Open Images V5 understand the main development status of object detection [6]. In ECCV VisDrone 2018 contest, organizers have released pipeline, thoroughly and deeply, in this survey, we first analyze a novel drone platform-based dataset [7] which contains a large the methods of existing typical detection models and describe the benchmark datasets. Afterwards and primarily, we provide a amount of images and videos. comprehensive overview of a variety of object detection methods • Two kinds of object detectors in a systematic manner, covering the one-stage and two-stage Pre-existing domain-specific image object detectors usually detectors.
    [Show full text]
  • Yolomask, an Instance Segmentation Algorithm Based on Complementary Fusion Network
    mathematics Article YOLOMask, an Instance Segmentation Algorithm Based on Complementary Fusion Network Jiang Hua 1, Tonglin Hao 2, Liangcai Zeng 1 and Gui Yu 1,3,* 1 Key Laboratory of Metallurgical Equipment and Control Technology, Ministry of Education, Wuhan University of Science and Technology, Wuhan 430081, China; [email protected] (J.H.); [email protected] (L.Z.) 2 School of Automation, Wuhan University of Science and Technology, Wuhan 430081, China; [email protected] 3 School of Mechanical and Electrical Engineering, Huanggang Normal University, Huanggang 438000, China * Correspondence: [email protected] Abstract: Object detection and segmentation can improve the accuracy of image recognition, but traditional methods can only extract the shallow information of the target, so the performance of algorithms is subject to many limitations. With the development of neural network technology, semantic segmentation algorithms based on deep learning can obtain the category information of each pixel. However, the algorithm cannot effectively distinguish each object of the same category, so YOLOMask, an instance segmentation algorithm based on complementary fusion network, is proposed in this paper. Experimental results on public data sets COCO2017 show that the proposed fusion network can accurately obtain the category and location information of each instance and has good real-time performance. Keywords: image segmentation; deep learning; instance segmentation; fusion network Citation: Hua, J.; Hao, T.; Zeng, L.; Yu, G. YOLOMask, an Instance Segmentation Algorithm Based on Complementary Fusion Network. 1. Introduction Mathematics 2021, 9, 1766. https:// Object detection and segmentation based on RGB images are the basis of 6D pose doi.org/10.3390/math9151766 estimation, and it is also the premise of successful robot grasping.
    [Show full text]
  • Towards Real-Time Object Detection on Edge with Deep Neural Networks
    TOWARDS REAL-TIME OBJECT DETECTION ON EDGE WITH DEEP NEURAL NETWORKS A Dissertation presented to the Faculty of the Graduate School at the University of Missouri In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy by ZHI ZHANG Dr. Zhihai He, Dissertation Supervisor December 2018 The undersigned, appointed by the Dean of the Graduate School, have examined the dissertation entitled: TOWARDS REAL-TIME OBJECT DETECTION ON EDGE WITH DEEP NEURAL NETWORKS presented by Zhi Zhang, a candidate for the degree of Doctor of Philosophy and hereby certify that, in their opinion, it is worthy of acceptance. Dr. Zhihai He Dr. Guilherme DeSouza Dr. Dominic Ho Dr. Jianlin Cheng ACKNOWLEDGMENTS This is the perfect time to replay my memories during my doctorate life, and I suddenly recall so many people to thank. First and foremost, I would like to sincerely thank my advisor Dr. Zhihai He, who guided me into the research community. I am fortunate enough to have Dr. He’s professionalism backing my research and study. Dr. He is an awesome friend and tutor as well, which is warm and nice considering I am studying ten thousand miles away from home. I would also like to express my deep gratitude to Dr Guilherme DeSouza, Dr. Dominic Ho, Dr Jianlin Chen, for being so supportive committee members. Besides, it is a good time to sincerely thank professors and department faculties Dr. Tony Han, Dr. Michela Becchi, Dr. James Keller and so many more, who teach me knowledge and help me walk through the doctoral degree. And of course, how can I forget my dear colleagues and friends in Mizzou and Columbia: Xiaobo Ren, Yifeng Zeng, Chen Huang, Guanghan Ning, Zhiqun Zhao, Yang Li, Hao Sun and Hayder Yousif.
    [Show full text]
  • Content-Based Image Copy Detection Using Convolutional Neural Network
    electronics Article Content-Based Image Copy Detection Using Convolutional Neural Network Xiaolong Liu 1,2,* , Jinchao Liang 1, Zi-Yi Wang 3, Yi-Te Tsai 4, Chia-Chen Lin 3,5,* and Chih-Cheng Chen 6,* 1 College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China; [email protected] 2 Digital Fujian Institute of Big Data for Agriculture and Forestry, Fujian Agriculture and Forestry University, Fuzhou 350002, China 3 Department of Computer Science and Information Engineering, National Chin-Yi University of Technology, Taichung 41170, Taiwan; [email protected] 4 Department of Computer Science and Communication Engineering, Providence University, Taichung 43301, Taiwan; [email protected] 5 Department of Computer Science and Information Management, Providence University, Taichung 43301, Taiwan 6 Department of Aeronautical Engineering, Chaoyang University of Technology, Taichung 413310, Taiwan * Correspondence: [email protected] (X.L.); [email protected] (C.-C.L.); [email protected] (C.-C.C.) Received: 27 October 2020; Accepted: 23 November 2020; Published: 1 December 2020 Abstract: With the rapid development of network technology, concerns pertaining to the enhancement of security and protection against violations of digital images have become critical over the past decade. In this paper, an image copy detection scheme based on the Inception convolutional neural network (CNN) model in deep learning is proposed. The image dataset is transferred by a number of image processing manipulations and the feature values in images are automatically extracted for learning and detecting the suspected unauthorized digital images. The experimental results show that the proposed scheme takes on an extraordinary role in the process of detecting duplicated images with rotation, scaling, and other content manipulations.
    [Show full text]
  • Multiple Object Recognition Using Opencv
    Multiple Object Recognition Using OpenCV D. Kavitha1; B.P. Rishi Kiran2; B. Niteesh3; S. Praveen4 1Assistant Professor, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Ramapuram Campus, Chennai, India. [email protected] 2Department of Computer Science and Engineering, SRM Institute of Science and Technology, Ramapuram Campus, Chennai, India. [email protected] 3Department of Computer Science and Engineering, SRM Institute of Science and Technology, Ramapuram Campus, Chennai, India. [email protected] 4Department of Computer Science and Engineering, SRM Institute of Science and Technology, Ramapuram Campus, Chennai, India. [email protected] Abstract For automatic vision systems used in agriculture, the project presents object characteristics analysis using image processing techniques. In agriculture science, automatic object characteristics identification is important for monitoring vast areas of crops, and it detects signs of object characteristics as soon as it occurs on plant leaves. Image content characterization and supervised classifier type neural network are used in the proposed deciding method. Pre-processing, image segmentation, and detection are some of the image processing methods used in this form of decision making. An image data will be rearranged and, if necessary, a region of interest will be selected during preparation. For network training and classification, colour and texture features are extracted from an input. Colour characteristics such as mean and variance in the HSV colour space, as well as texture characteristics such as energy, contrast, homogeneity, and correlation. The device will be trained to automatically identify test images in order to assess object characteristics. With some training samples of that type, an automated classifier NN could be used for classification supported learning in this method.
    [Show full text]