

Anabranch Network for Camouflaged Object Segmentation

Trung-Nghia Le (a,∗∗), Tam V. Nguyen (b,∗∗), Zhongliang Nie (b), Minh-Triet Tran (c), Akihiro Sugimoto (d)

a Department of Informatics, The Graduate University for Advanced Studies (SOKENDAI), Tokyo, Japan
b Department of Computer Science, University of Dayton, Ohio, 45469, United States of America
c University of Science, VNU-HCM, Ho Chi Minh, Vietnam
d National Institute of Informatics, Tokyo, Japan

∗∗ Trung-Nghia Le ([email protected]) contributed to this work while he interned at the University of Dayton. He is currently a postdoc at the Institute of Industrial Science, the University of Tokyo, Japan. Tam V. Nguyen ([email protected]) is the corresponding author.

ABSTRACT

Camouflaged objects attempt to conceal their texture into the background, and discriminating them from the background is hard even for human beings. The main objective of this paper is to explore the camouflaged object segmentation problem, namely, segmenting the camouflaged object(s) in a given image. This problem has not been well studied in spite of a wide range of potential applications, including the preservation of wild animals and the discovery of new species, surveillance systems, and search-and-rescue missions in the event of natural disasters such as earthquakes, floods, or hurricanes. This paper addresses this new and challenging problem of camouflaged object segmentation. To address it, we provide a new image dataset of camouflaged objects for benchmarking purposes. In addition, we propose a general end-to-end network, called the Anabranch Network, that leverages both classification and segmentation tasks. Different from existing networks for segmentation, our proposed network possesses a second branch for classification that predicts the probability that an image contains camouflaged object(s); this prediction is then fused into the main segmentation branch to boost the segmentation accuracy. Extensive experiments conducted on the newly built dataset demonstrate the effectiveness of our network using various fully convolutional networks. © 2019 Elsevier Ltd. All rights reserved.

1. Introduction

Camouflage is an attempt to conceal the texture of a foreground object into the background (Singh et al., 2013). The term “camouflage” was first coined from nature, where animals hide themselves from predators by changing their body pattern, texture, or color. Unlike salient objects, camouflaged objects are hard to detect by their nature, even for human beings. Autonomously detecting/segmenting camouflaged objects is thus a challenging task in which discriminative features no longer play an important role, since we have to ignore objects that capture our attention. While detecting camouflaged objects is technically difficult on the one hand, it is beneficial in various practical scenarios on the other hand, including surveillance systems and search-and-rescue missions. For example, detecting and segmenting camouflaged objects in images definitely helps us search for camouflaged animals for the preservation of wild animals and the discovery of new species. Nevertheless, camouflaged object segmentation has not been well explored in the literature.

There exist two types of camouflaged objects in reality: naturally camouflaged objects such as animals or insects hiding themselves from their predators, and artificially camouflaged objects such as soldiers and weapons disguised by artificial texture patterns.

Fig. 1. A few examples from our Camouflaged Object (CAMO) dataset with corresponding pixel-level annotations. Camouflaged objects attempt to conceal their texture into the background.

These objects first evaluate their surrounding environment and then change to camouflaged textures via the colorization of clothes or covers. For either type of camouflaged object, it is not obvious to identify them in images. Figure 1 shows a few examples of camouflaged objects in real life. Based on this figure, we can easily see how challenging camouflaged object segmentation is.

Camouflaged objects have naturally evolved to exploit weaknesses in the visual system of their prey or predators, and thus understanding such a mechanism will provide some insights for segmenting camouflaged objects. Without any prior, we human beings easily miss camouflaged objects; however, once we are informed that a camouflaged object exists in an image, we can carefully scan the entire image to detect it. This also comes up in nature: once a predator has an awareness of camouflaged animals in the scene, it makes efforts to localize them for hunting. Therefore, utilizing this awareness as a prior can be an essential cue for camouflaged object segmentation. The challenge is then what awareness should be incorporated, and in what way. We consider that a classification scheme can be used as the awareness of the existence of camouflaged objects in an image, to be combined with a detection/segmentation scheme.

As reviewed in Singh et al. (Singh et al., 2013), there are a few methods that study camouflaged object detection in simple contexts. These methods use hand-crafted low-level features such as color, edge, or texture to detect camouflaged objects. Such hand-crafted low-level features are designed to be as discriminative as possible for detecting and segmenting objects, which is the opposite of camouflage. Therefore, their performance is limited for camouflaged object segmentation. Furthermore, these methods can be applied only to relatively low-resolution images with a uniform background. Deep features, on the other hand, are known to outperform low-level features in many computer vision tasks such as image classification (Krizhevsky et al., 2012), semantic segmentation (Shelhamer et al., 2016), action recognition (Tran et al., 2015), and face recognition (Wen et al., 2016). We thus expect that deep features can replace low-level features even for camouflaged object segmentation by incorporating them into a new network. To use them, however, a large amount of camouflaged object data is required. Nevertheless, there is no public dataset (for either training or testing) for the camouflaged object segmentation problem. This obviously poses serious problems to (1) train deep learning models and (2) evaluate the performance of the proposed network.

The overall contribution of this paper is two-fold:

• We provide a new image dataset of camouflaged objects to promote new methods for camouflaged object segmentation. Our newly constructed Camouflaged Object (CAMO) dataset consists of 1250 images, each of which contains at least one camouflaged object. Pixel-wise ground-truths are manually annotated for each image. Furthermore, images in the CAMO dataset involve a variety of challenging scenarios such as object appearance, background clutter, shape complexity, small object, object occlusion, multiple objects, and distraction. We emphasize that this is the very first dataset for camouflage segmentation.

• We propose a novel end-to-end network, called the Anabranch Network (ANet), which is general, conceptually simple, flexible, and efficient for camouflaged object segmentation. (An anabranch is a section of a river or stream that diverts from the main channel or stem of the watercourse and rejoins it downstream.) Our proposed ANet leverages the advantages of two different network streams, namely, the classification stream and the segmentation stream, in order to segment camouflaged objects. The classification stream is used as the awareness of the existence of camouflaged object(s) in an image. This design is motivated by the observation that there is no guarantee that a camouflaged object is always present in an image; thus we first identify whether a camouflaged object is present, and only if one exists should the object(s) be accurately segmented; nothing is segmented otherwise. Extensive experiments verify the effectiveness of our proposed ANet when the network is applied to various fully convolutional networks (FCNs) for camouflaged object segmentation.

The datasets, evaluation scripts, models, and results are publicly available at our websites: https://sites.google.com/view/ltnghia/research/camo and https://sites.google.com/site/vantam/research.

The remainder of this paper is organized as follows. Section 2 summarizes the related work. Next, Section 3 and Section 4 introduce the constructed dataset and the proposed network, respectively. Section 5 then reports extensive experiments over the proposed method and baselines on the newly constructed dataset. Section 5.5.2 discusses the joint training of ANet. Finally, Section 6 draws the conclusion and paves the way for future work.

2. Related Work

This section first reviews existing work on camouflaged object segmentation and salient object segmentation, which share some similarities. We also clarify why camouflaged object segmentation is more challenging than salient object segmentation. Then, we briefly introduce some advancements of two-stream networks in several computer vision tasks.

2.1. Camouflaged Object Segmentation

As addressed in the intensive survey compiled by Singh et al. (Singh et al., 2013), most of the related work (Kavitha et al., 2011; Pan et al., 2011; Siricharoen et al., 2010; Song and Geng, 2010; Yin et al., 2011) uses handcrafted, low-level features for camouflaged region detection/segmentation, where the similarity between the camouflaged region and the background is evaluated using low-level features, e.g., color, shape, orientation, and brightness (Galun et al., 2003; Song and Geng, 2010; Xue et al., 2016). We note that there is little work that directly deals with camouflaged object detection; most of the work is dedicated to detecting the foreground region even when some of its texture is similar to the background. Pan et al. (Pan et al., 2011) proposed a 3D convexity based method to detect camouflaged objects in images.

Liu et al. (Liu et al., 2012) integrated top-down information based on the expectation-maximization framework for foreground object detection. Sengottuvelan et al. (P. Sengottuvelan and Shanmugam, 2008) applied a gray-level co-occurrence matrix method to identify the camouflaged object in images with a simple background. In the case where some parts of a moving object and the background share a similar texture, optical flow is combined with color to detect/segment the moving object (Yin et al., 2011; Gallego and Bertolino, 2014). All these methods use hand-crafted low-level features and work for only a few cases where the images/videos have a simple, uniform background. Their performance is also unsatisfactory in camouflage detection/segmentation when there is a strong similarity between the foreground and the background.

One of the main reasons why there has been very little work on camouflaged object segmentation is the lack of a standard dataset for this problem. A benchmarking dataset is thus mandatory to develop advanced techniques for camouflaged object segmentation. Note that the dataset has to include both training data and testing data. To the best of our knowledge, there is so far no standard dataset for camouflaged object segmentation with a sufficient number of samples for training and evaluating deep networks. A few datasets related to camouflaged animals have been proposed, but they have a limited number of samples. Bideau et al. (Pia Bideau, 2016) proposed a camouflaged animal video dataset, but the dataset has only 9 videos, and camouflaged animals actually appear in only a third of them. The unpublished dataset proposed by Skurowski et al. (http://zgwisk.aei.polsl.pl/index.php/en/research/other-research/63-animal-camouflage-analysis) has only 76 images of camouflaged animals. Therefore, we collect a new camouflaged object dataset. Note that our work is the first to address the real camouflage segmentation problem in diverse images. Different from the two above datasets, our constructed dataset consists of both camouflaged animals and humans.

2.2. Salient Object Segmentation

Salient object predictors aim to detect and segment salient objects in images. Though “saliency” is opposed to “camouflage”, techniques developed for salient object segmentation may be useful for camouflaged object segmentation. This is because the two tasks highlight image regions with certain characteristics.

Early work on salient object segmentation is based on biologically-inspired approaches (Itti et al., 1998; Koch and Ullman, 1987), where the used features are the contrast of low-level features such as the orientation of edges or the direction of movement. Since human vision is sensitive to color, different approaches using color were proposed, where the contrast of color features is locally or globally analyzed (Achanta et al., 2009; Cheng et al., 2011). Local analysis methods estimate the saliency of a particular image region against its neighborhoods based on color histogram comparison (Cheng et al., 2011); global analysis methods achieve globally consistent results by computing color dissimilarities to the mean image color (Achanta et al., 2009). Various patch-based methods were also proposed that estimate dissimilarity between image patches (Cheng et al., 2011; Margolin et al., 2013; Zhang et al., 2015). To leverage the advantage of deep learning, recent methods first conduct superpixel segmentation and then feed the segmented regions into a convolutional neural network (CNN) individually to obtain saliency scores (Le and Sugimoto, 2018; Li and Yu, 2015; Wang et al., 2015). More recent methods modified fully convolutional networks (FCNs) to compute a pixel-wise saliency map (Hou et al., 2017; Le and Sugimoto, 2017; Li et al., 2017a; Liu and Han, 2016; Wang et al., 2017b,a). Skip-layer structures are usually employed in FCNs to obtain multi-scale feature maps, which are critically needed to segment a salient object (Le and Sugimoto, 2017; Liu and Han, 2016; Hou et al., 2017). Li et al. (Li and Yu, 2016) modified the FCN to combine pixel-wise and segment-wise feature maps for saliency value prediction. Wang et al. (Wang et al., 2017b) integrated a pyramid pooling module and a multi-stage refinement mechanism into an FCN for salient object segmentation.

Wang et al. (Wang et al., 2017a) proposed a two-stage training method to compute saliency maps using image-level tags. Object proposal based networks, which were originally developed for category-specific object detection, are also utilized to identify salient objects (Han et al., 2018b; Le and Sugimoto, 2019). In addition, salient objects from common categories can be co-segmented by exploiting the correspondence relationship among multiple relevant images (Han et al., 2018a; Yao et al., 2017).

It is worth noting that camouflaged object segmentation is more challenging than salient object segmentation. In fact, salient objects tend to be outstanding and discriminative from the background. We thus only need to focus on identifying such outstanding and discriminative regions to segment salient objects. Camouflaged objects, however, tend to conceal themselves into the background environment by decreasing their discriminativeness as much as possible. This makes it hard to identify the boundaries of camouflaged objects. In particular, when the background is cluttered, camouflaged objects blend into the background too much, and discriminating them from the background becomes even harder.

We remark that the same object can be either a camouflaged object or a salient object, meaning that the differences between camouflaged objects and salient objects are ambiguous (cf. Fig. 2). This suggests that some salient object detection methods may work for the camouflaged object segmentation task; however, this is not the case. Indeed, existing methods for salient object segmentation are designed to segment objects in any case, because a salient object is assumed to always exist in a training/testing image. On the contrary, there is no guarantee that a camouflaged object is always present in an image. The approach to camouflaged object segmentation should be completely different so that it allows us to handle practical scenarios where we do not know whether or not a target object exists.

Fig. 2. Ambiguous differences between camouflaged objects and salient objects: (a) camouflaged object, (b) ambiguous, (c) salient object.

2.3. Two-Stream Network

There are several works that use two-stream architectures (Simonyan and Zisserman, 2014a; Feichtenhofer et al., 2016; Wang et al., 2016; Jain et al., 2017; He et al., 2017; Li et al., 2017b).

Fusion networks were proposed for action recognition (Simonyan and Zisserman, 2014a; Feichtenhofer et al., 2016; Wang et al., 2016) and object segmentation (Jain et al., 2017) in videos. Such a network consists of an appearance stream and a motion stream, and the final result is fused from the outputs of the two streams. The two streams, however, have the same architecture for the same task despite the fact that they receive different inputs (i.e., video frame and optical flow). The two streams (i.e., a classification stream and a segmentation stream) of region-based networks for instance segmentation (He et al., 2017; Li et al., 2017b) have different architectures for different tasks. The two streams are built on top of the region proposal network (Ren et al., 2015) and then work on specific regions.

In contrast with the above mentioned two-stream networks, the two streams (the classification stream and the segmentation stream) of our proposed ANet have different architectures and mutually reinforce each other on the whole image for different tasks. The output of the classification stream is fused with that of the segmentation stream to enhance the segmentation result. In this sense, the role of the classification stream is complementary to that of the segmentation stream.

3. Camouflaged Object Dataset

3.1. Motivation

In the literature, no dataset is available for camouflaged object segmentation. To promote camouflaged object segmentation, a publicly available dataset with pixel-wise ground-truth annotation is mandatory. Note that the dataset has to include both training data and testing data. The training data is suitable for training a deep neural network, whereas the testing data is used for evaluation. Therefore, we aim to construct such a dataset, the Camouflaged Object (CAMO) dataset, to promote advancements in camouflaged object segmentation and its evaluation. We stress that our CAMO dataset is the very first dataset for camouflaged object segmentation.
Fig. 3. Categories in the Camouflaged Object Dataset. Natural camouflaged objects are shown in red while artificial camouflaged objects are shown in blue. Category distribution over the 1250 images: Mammal 21% (265), Insect 17% (210), Bird 16% (194), Underwater 14% (181), Reptile 9% (118), Amphibian 5% (59), Body Art 13% (161), Soldier 5% (62).


Fig. 4. Some examples of challenging attributes and the attribute distribution over the CAMO dataset: object appearance 62% (775 images), background clutter 57% (711), shape complexity 40% (506), small object 35% (438), object occlusion 29% (359), multiple objects 8% (104), distraction 3% (33).

3.2. Dataset Construction

To build the dataset, we initially collected 3000 images in which at least one camouflaged object exists. We collected the images of camouflaged objects from the Internet with keywords such as “camouflaged objects”, “camouflaged animals”, “concealed objects”, “hidden animals”, “camouflaged soldier”, and “human body painting”. Note that we consider both types of camouflaged objects, namely, natural and artificial camouflage, as mentioned in Section 1. Then, we manually discarded images with low resolution or duplication. Finally, we ended up with 1250 images. To avoid inconsistency in labeling ground-truth, we asked three people to annotate the camouflaged objects in all 1250 images individually, using a custom-designed interactive segmentation tool. On average, each person takes 2-5 minutes to annotate one image, depending on its complexity. The annotation stage spanned about one month.

3.3. Dataset Description

We describe in this section the Camouflaged Object (CAMO) dataset specifically designed for the task of camouflaged object segmentation. Some examples are shown in Fig. 1 with the corresponding ground-truth label annotations. We focus on two categories, i.e., naturally camouflaged objects and artificially camouflaged objects, which usually correspond to animals and humans in the real world, respectively. Camouflaged animals consist of amphibians, birds, insects, mammals, reptiles, and underwater animals in various environments, i.e., ground, underwater, desert, forest, mountain, and snow. Camouflaged humans fall into soldiers on battlefields and human body painting art. The ratios for each category are shown in Fig. 3.

In our CAMO dataset, it is also noteworthy that multiple objects, including separate single objects and spatially connected/overlapping objects, possibly exist in some images (8%). This also makes our dataset more challenging for camouflaged object segmentation. The challenge of the CAMO dataset is also enhanced by the following attributes, i.e., object appearance, background clutter, shape complexity, small object, object occlusion, and distraction (cf. Fig. 4):

• Object appearance: The object has a similar color appearance with the background, causing large ambiguity in segmentation.

• Background clutter: The background is not uniform but contains small-scale structures or is composed of several complex parts.

• Shape complexity: The object has complex boundaries such as thin parts and holes, which is usually the case for the legs of insects.

• Small object: The ratio between the camouflaged object area and the whole image area is smaller than 0.1 (a simple mask-based check of this attribute is sketched after this list).

• Object occlusion: The object is occluded, resulting in disconnected parts or in touching the image border.

• Distraction: The image contains distracting objects, resulting in losing attention to the camouflaged objects. This makes camouflaged objects more difficult to discover even by human beings.
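For illustration, the "small object" and "multiple objects" attributes above can be derived directly from a binary ground-truth mask. The following is a minimal NumPy sketch: the 0.1 area-ratio threshold comes from the definition above, while counting connected components to detect multiple objects is our own assumption rather than a detail stated in the paper.

```python
import numpy as np
from scipy import ndimage

def mask_attributes(mask: np.ndarray) -> dict:
    """Derive simple CAMO-style attributes from a binary ground-truth mask.

    mask: 2D array with 1 for camouflaged-object pixels and 0 for background.
    """
    area_ratio = mask.sum() / float(mask.size)
    # "Small object": object area / image area < 0.1 (definition from the text).
    small_object = area_ratio < 0.1
    # "Multiple objects": more than one connected component (our assumption).
    _, num_components = ndimage.label(mask)
    multiple_objects = num_components > 1
    return {"area_ratio": area_ratio,
            "small_object": small_object,
            "multiple_objects": multiple_objects}

# Example: a 100x100 mask with two small square objects.
example = np.zeros((100, 100), dtype=np.uint8)
example[10:20, 10:20] = 1
example[60:70, 60:70] = 1
print(mask_attributes(example))  # area_ratio=0.02 -> small_object=True, multiple_objects=True
```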

4. Anabranch Network

In this section, we give a detailed description of our Anabranch Network (ANet). Specifically, we first discuss the motivation of the network design. Then, we introduce the architecture of ANet.

4.1. Network Design

As mentioned above, there is no guarantee that a camouflaged object is always present in the scene. Therefore, a method that systematically segments objects in any image will not work. Moreover, directly applying discriminative features from segmentation models (i.e., semantic segmentation, salient object segmentation, etc.) to camouflaged object segmentation is not effective because camouflaged objects conceal their texture into the surrounding environment. In order to segment camouflaged objects, we need an additional function that identifies whether a camouflaged object exists in an image. To do so, each pixel needs to be classified as a part of camouflaged objects or not. Such classification can be used not only to enhance segmentation accuracy but also to segment multiple camouflaged objects. Indeed, this classification strengthens features extracted from the camouflaged part and weakens features extracted from a non-camouflaged part. Accordingly, the segmentation and classification tasks should be closely combined in the network, with different architectures, for camouflaged object segmentation.

4.2. Network Architecture

Figure 5 depicts the overview of our proposed ANet, a general network for camouflaged object segmentation. The ANet leverages the advantages of two different network models, namely, the classification network model based on a convolutional neural network (CNN) and the segmentation network model based on a fully convolutional network (FCN). The CNN reveals object hypotheses in the image (Zhou et al., 2014), yielding a cue on whether or not the input image contains camouflaged object(s). We thus aim to train the CNN model so that it classifies two classes: the camouflaged image class and the non-camouflaged image class. The FCN, on the other hand, provides us with pixel-wise semantic information in the image. We consider the output of the FCN as the semantic information of different objects in the image. The two outputs from the two network streams are finally fused to produce a pixel-wisely accurate camouflage map which uniformly covers the camouflaged objects. We note that our classification stream output, i.e., the probability (a scalar), is multiplied (⊗) with each pixel value of the 2D map produced by the segmentation stream.

Segmentation Stream: The input and the output of the segmentation stream are both images, and thus any end-to-end FCN for segmentation can be employed. In the sense that we can employ any FCN for segmentation, ANet can be considered a general network.

Though, as addressed above, methods for salient object detection alone may not work for accurate camouflaged object segmentation, effective compensation by the classification stream can be expected to boost the performance of ANet. Recent state-of-the-art methods for salient object detection have demonstrated good performance. In addition, as we observed in Fig. 2, there is ambiguity in the difference between camouflaged objects and salient objects. For these reasons, we consider a saliency model suitable for use in ANet, and we can easily swap the model with any other saliency model. We remark that it is also interesting to utilize saliency models in a similar but different domain such as camouflaged object segmentation.

Classification Stream: We build the classification stream on top of the convolution layers of the FCN, which play the role of a feature extraction module. In particular, we use three fully connected layers with 4096, 4096, and 2 filters, respectively (cf. Fig. 5 and Table 1). We note that each of the first two fully connected layers is followed by a Rectified Linear Unit (ReLU) activation layer (Krizhevsky et al., 2012). The last layer is the soft-max layer. To train this stream, we use the soft-max cross-entropy loss. During the training process, we use dropout layers with a dropout rate of 50% after the first two fully connected layers to avoid over-fitting.
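To make the fusion step concrete: the classification stream outputs a scalar probability that the image contains a camouflaged object, and that scalar multiplies every pixel of the 2D map from the segmentation stream. Below is a minimal, framework-level illustration in PyTorch; the tensor shapes and the convention that column 1 of the soft-max output is the "camouflage" class are assumptions made for this sketch, not the paper's released code.

```python
import torch

def anet_fuse(segmentation_map: torch.Tensor, camo_probability: torch.Tensor) -> torch.Tensor:
    """Fuse the two ANet streams by pixel-wise multiplication.

    segmentation_map: (N, 1, H, W) map in [0, 1] from the segmentation stream.
    camo_probability: (N, 2) soft-max output of the classification stream;
                      column 1 is assumed to be the 'camouflage' class.
    """
    p_camo = camo_probability[:, 1].view(-1, 1, 1, 1)  # one scalar per image
    return segmentation_map * p_camo                    # broadcast over H x W

# Toy usage with random tensors standing in for the two stream outputs.
seg_out = torch.rand(2, 1, 224, 224)
cls_out = torch.softmax(torch.randn(2, 2), dim=1)
camouflage_map = anet_fuse(seg_out, cls_out)
print(camouflage_map.shape)  # torch.Size([2, 1, 224, 224])
```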


Fig. 5. Overview of our proposed Anabranch Network (ANet). The proposed network leverages the strength of both image classification and semantic segmentation tasks for camouflaged object segmentation.

Fig. 6. Examples of non-camouflaged objects from MS-COCO.

As seen above, our introduced ANet is conceptually simple. Each stream in ANet has its own task, and thus each stream is expected to be individually trained using data suitable for that task to enhance its ability. Then, the fusion of the two streams into one boosts the segmentation accuracy for camouflaged objects. Furthermore, ANet is flexible in the sense that we can employ any end-to-end FCN and easily switch it with another. As we see in our experimental results, ANet is robust since it maintains good segmentation accuracy in both cases, with and without a camouflaged object in the input image. We also see its computational efficiency.

Table 1. Architecture of the classification stream.
No. / Layer / Output / Dropping rate
1 / Fully connected / 2048 / -
2 / ReLU / 2048 / -
3 / Dropout / 2048 / 50%
4 / Fully connected / 2048 / -
5 / ReLU / 2048 / -
6 / Dropout / 2048 / 50%
7 / Fully connected / 2 / -
8 / Soft-max / 2 / -
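The classification stream in Table 1 is a plain stack of fully connected, ReLU, dropout, and soft-max layers placed on top of the FCN's convolutional features. A minimal PyTorch sketch of that head is given below; the input feature dimension `in_features` depends on the chosen backbone FCN and is an assumption here, and the paper's actual implementation is in Caffe, so this is only an illustrative re-statement of Table 1.

```python
import torch
import torch.nn as nn

def build_classification_head(in_features: int) -> nn.Sequential:
    """Classification stream following Table 1:
    FC(2048) -> ReLU -> Dropout(0.5) -> FC(2048) -> ReLU -> Dropout(0.5) -> FC(2) -> Soft-max.
    """
    return nn.Sequential(
        nn.Linear(in_features, 2048),
        nn.ReLU(inplace=True),
        nn.Dropout(p=0.5),
        nn.Linear(2048, 2048),
        nn.ReLU(inplace=True),
        nn.Dropout(p=0.5),
        nn.Linear(2048, 2),
        nn.Softmax(dim=1),  # at training time the soft-max cross-entropy loss is used instead
    )

# Example: pooled backbone features of dimension 512 (an assumed value).
head = build_classification_head(in_features=512)
probs = head(torch.randn(4, 512))
print(probs.shape)  # torch.Size([4, 2])
```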

5. Experiments

In this section, we first introduce the evaluation datasets and evaluation criteria used in the experiments. Then we describe implementation details as well as the network training of our proposed ANet and other baselines. In the experiments, we show the potential ability of transferring the domain from salient object segmentation to camouflaged object segmentation. We next compare instances of our ANet with the FCN models fine-tuned on two evaluation datasets, to demonstrate that classifying camouflaged objects and non-camouflaged objects can boost camouflaged object segmentation. We also present the efficiency of our general ANet through short network training and fast running time. These results can be considered as the first baselines for the camouflaged object segmentation problem. Finally, we discuss the challenge of the proposed CAMO dataset to show the room for further research on this problem.

Table 2. CAMO-COCO dataset used in our experiments.
(Training / Testing / Total)
Camouflaged object images (CAMO): 1000 / 250 / 1250
Non-camouflaged object images (MS-COCO): 1000 / 250 / 1250
Total: 2000 / 500 / 2500

5.1. Datasets and Experimentation Setup

Any image in the CAMO dataset contains at least one camouflaged object. There is thus prior information that camouflaged objects are always present. In a realistic scenario, however, there is no guarantee that a camouflaged object is always present in an image. Therefore, we set up another dataset, called CAMO-COCO (the link to the dataset will be made available along with the publication of this paper), consisting of camouflaged object images and non-camouflaged object images. We used all images in the CAMO dataset for the camouflaged object images and collected an additional 1250 images from the MS-COCO dataset (Lin et al., 2014) for the non-camouflaged object images (cf. Fig. 6). We note that we created zero-mask ground-truth labels (all pixels have zero values) for the non-camouflaged object images. For each of the camouflaged and non-camouflaged object image sets, we randomly chose 80% for training (1000 images) and used the remaining 20% for testing (250 images). Table 2 shows the number of images in the CAMO-COCO dataset used for the experiments.

5.2. Evaluation Criteria

We used the F-measure (Fβ) (Achanta et al., 2009), Intersection Over Union (IOU) (Long et al., 2015), and Mean Absolute Error (MAE) as the metrics to evaluate the obtained results. The first metric, F-measure, is a balanced measurement between precision and recall as follows:

Fβ = ((1 + β²) × Precision × Recall) / (β² × Precision + Recall).   (1)

Note that we set β² = 0.3, as used in (Achanta et al., 2009), to put an emphasis on precision. IOU is the area ratio of the overlap against the union between the predicted camouflage map and the ground-truth map. Meanwhile, MAE is the average of the pixel-wise absolute differences between the predicted camouflage map and the ground-truth.

For MAE, we used the raw grayscale camouflage map. For the other metrics, we binarized the results depending on two contexts. In the first context, we assume that camouflaged objects are always present in every image, like salient objects; we used an adaptive threshold (Jia and Han, 2013) θ = µ + η, where µ and η are the mean value and the standard deviation of the map, respectively. In the second context, which is much closer to a real-world scenario, we assume that the existence of camouflaged objects is not guaranteed in each image; we used the fixed threshold θ = 0.5.
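For clarity, the three metrics and the two binarization contexts above can be written out directly. The following is a small NumPy sketch under the stated settings (β² = 0.3, adaptive threshold θ = µ + η, fixed threshold θ = 0.5); it is an illustration of the definitions, not the authors' evaluation script.

```python
import numpy as np

BETA_SQ = 0.3  # beta^2 = 0.3, emphasizing precision (Achanta et al., 2009)

def binarize(pred: np.ndarray, adaptive: bool = True) -> np.ndarray:
    """Binarize a grayscale camouflage map with values in [0, 1]."""
    theta = pred.mean() + pred.std() if adaptive else 0.5
    return (pred >= theta).astype(np.uint8)

def f_measure(pred_bin: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    tp = float(np.logical_and(pred_bin == 1, gt == 1).sum())
    precision = tp / (pred_bin.sum() + eps)
    recall = tp / (gt.sum() + eps)
    return (1 + BETA_SQ) * precision * recall / (BETA_SQ * precision + recall + eps)

def iou(pred_bin: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    inter = float(np.logical_and(pred_bin == 1, gt == 1).sum())
    union = float(np.logical_or(pred_bin == 1, gt == 1).sum())
    return inter / (union + eps)

def mae(pred: np.ndarray, gt: np.ndarray) -> float:
    # MAE is computed on the raw grayscale map, without binarization.
    return float(np.abs(pred.astype(np.float64) - gt.astype(np.float64)).mean())

# Toy example.
gt_mask = np.zeros((64, 64), dtype=np.uint8)
gt_mask[20:40, 20:40] = 1
pred_map = np.clip(gt_mask + 0.1 * np.random.rand(64, 64), 0, 1)
pb = binarize(pred_map, adaptive=True)
print(f_measure(pb, gt_mask), iou(pb, gt_mask), mae(pred_map, gt_mask))
```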

5.3. Implementation Details

To demonstrate the generality and flexibility of our proposed ANet, we employ various recent state-of-the-art FCN-based salient object segmentation models for the segmentation stream: DHS (Liu and Han, 2016), DSS (Hou et al., 2017), SRM (Wang et al., 2017b), and WSS (Wang et al., 2017a).

For each employed model, we train the two streams sequentially. We first fine-tuned the segmentation stream on the CAMO dataset starting from the publicly available pre-trained model, and then trained the classification stream on CAMO-COCO with the parameters of the segmentation stream fixed. The final model is considered as our baseline for camouflaged object segmentation.
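The sequential regime above (fine-tune the segmentation stream on CAMO first, then train the classification stream on CAMO-COCO while the segmentation parameters stay fixed) can be outlined as follows. This is only a PyTorch-style sketch: the loaders, criteria, optimizers, and the `extract_features` hook into the backbone are placeholders we introduce for illustration, and the paper's own implementation is in Caffe.

```python
import torch

def train_anet_sequentially(segmentation_stream, classification_head,
                            camo_loader, camo_coco_loader,
                            seg_criterion, cls_criterion,
                            seg_optimizer, cls_optimizer):
    """Two-step ANet training: segmentation stream first, classification stream second."""
    # Step 1: fine-tune the segmentation stream on CAMO (camouflaged images only).
    segmentation_stream.train()
    for image, mask in camo_loader:
        seg_optimizer.zero_grad()
        loss = seg_criterion(segmentation_stream(image), mask)
        loss.backward()
        seg_optimizer.step()

    # Step 2: train the classification stream on CAMO-COCO with the
    # segmentation stream frozen (its parameters are not updated).
    for p in segmentation_stream.parameters():
        p.requires_grad_(False)
    classification_head.train()
    for image, label in camo_coco_loader:  # label: 1 = camouflaged image, 0 = non-camouflaged
        cls_optimizer.zero_grad()
        with torch.no_grad():
            # Hypothetical hook returning the shared convolutional features.
            features = segmentation_stream.extract_features(image)
        loss = cls_criterion(classification_head(features), label)
        loss.backward()
        cls_optimizer.step()
```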

In both training steps, we set the size of each mini-batch to 2 and used Stochastic Gradient Descent (SGD) optimization (Rumelhart et al., 1988) with a momentum of 0.9 and a weight decay of 0.0005. We trained the segmentation stream for 10 epochs (corresponding to 5k iterations) with a learning rate of 10−4 and trained the classification stream for 3 epochs (3k iterations) with a learning rate of 10−6. During the fine-tuning process, a simple data augmentation technique was used to avoid over-fitting: images were rescaled to the input resolution of the specific FCN employed in the segmentation stream and randomly flipped horizontally.

We note that, for comparison, our employed FCN models (original) were trained on the CAMO (or CAMO-COCO) dataset for 10 epochs with a learning rate of 10−4, and the other parameters were set similarly to our ANet. We also remark that we implemented our method in C/C++ using the Caffe (Jia et al., 2014) toolbox and conducted all the experiments on a computer with a Core i7 3.6 GHz processor, 32 GB of RAM, and two GTX 1080 GPUs.
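The optimization settings listed above translate into a standard SGD configuration and a simple flip augmentation. The snippet below restates them in PyTorch form purely as an illustration; `model` stands for whichever stream is being trained, and `input_size` is a placeholder since the real input resolution depends on the chosen FCN.

```python
import torch
import torchvision.transforms as T

# Hyperparameters reported in Section 5.3.
BATCH_SIZE = 2
MOMENTUM = 0.9
WEIGHT_DECAY = 0.0005
LR_SEGMENTATION = 1e-4    # 10 epochs (~5k iterations)
LR_CLASSIFICATION = 1e-6  # 3 epochs (~3k iterations)

def make_optimizer(model: torch.nn.Module, lr: float) -> torch.optim.SGD:
    return torch.optim.SGD(model.parameters(), lr=lr,
                           momentum=MOMENTUM, weight_decay=WEIGHT_DECAY)

def make_augmentation(input_size: int) -> T.Compose:
    # Rescale to the FCN's input resolution and randomly flip horizontally.
    return T.Compose([
        T.Resize((input_size, input_size)),
        T.RandomHorizontalFlip(p=0.5),
        T.ToTensor(),
    ])

# Example usage with a dummy module standing in for the classification stream.
optimizer = make_optimizer(torch.nn.Linear(8, 2), LR_CLASSIFICATION)
```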

Table 3. Experimental results on two datasets: the CAMO dataset (left part) and the CAMO-COCO dataset (right part). The evaluation is based on F-measure (Achanta et al., 2009) (the higher the better), IOU (Long et al., 2015) (the higher the better), and MAE (the smaller the better). The 1st and 2nd places are shown in blue and red, respectively.
Columns, for each test dataset (CAMO first, then CAMO-COCO): MAE ⇓ | Adaptive Threshold: Fβ ⇑, IOU ⇑ | Fixed Threshold: Fβ ⇑, IOU ⇑

DHS (Liu and Han, 2016) (pre-trained): 0.173 0.548 0.351 0.562 0.332 | 0.204 0.746 0.571 0.750 0.549
DHS (fine-tuned with CAMO): 0.129 0.640 0.459 0.643 0.444 | 0.169 0.794 0.633 0.793 0.618
DHS (fine-tuned with CAMO-COCO): 0.138 0.596 0.388 0.614 0.367 | 0.072 0.796 0.679 0.808 0.681
ANet-DHS (baseline): 0.130 0.626 0.437 0.631 0.423 | 0.072 0.812 0.712 0.814 0.705

DSS (Hou et al., 2017) (pre-trained): 0.157 0.564 0.320 0.563 0.333 | 0.176 0.757 0.559 0.756 0.570
DSS (fine-tuned with CAMO): 0.141 0.622 0.425 0.631 0.420 | 0.152 0.791 0.633 0.795 0.630
DSS (fine-tuned with CAMO-COCO): 0.145 0.582 0.385 0.584 0.381 | 0.076 0.790 0.686 0.792 0.687
ANet-DSS (baseline): 0.132 0.587 0.404 0.607 0.390 | 0.067 0.795 0.701 0.804 0.694

SRM (Wang et al., 2017b) (pre-trained): 0.171 0.448 0.258 0.425 0.213 | 0.191 0.699 0.535 0.685 0.502
SRM (fine-tuned with CAMO): 0.120 0.683 0.507 0.688 0.498 | 0.176 0.815 0.651 0.812 0.634
SRM (fine-tuned with CAMO-COCO): 0.127 0.663 0.454 0.656 0.421 | 0.067 0.830 0.717 0.831 0.708
ANet-SRM (baseline): 0.126 0.654 0.475 0.662 0.466 | 0.069 0.826 0.732 0.830 0.727

WSS (Wang et al., 2017a) (pre-trained): 0.178 0.559 0.323 0.531 0.265 | 0.197 0.754 0.567 0.740 0.528
WSS (fine-tuned with CAMO): 0.145 0.658 0.477 0.661 0.465 | 0.174 0.807 0.649 0.810 0.637
WSS (fine-tuned with CAMO-COCO): 0.149 0.642 0.439 0.638 0.382 | 0.085 0.811 0.678 0.820 0.687
ANet-WSS (baseline): 0.140 0.661 0.459 0.643 0.407 | 0.078 0.826 0.710 0.820 0.697

5.4. Experimental Results

5.4.1. Baseline Evaluation
We evaluated our model (baseline) on two datasets: CAMO (camouflaged object images only) and CAMO-COCO (both camouflaged and non-camouflaged object images), where only the test images in each dataset were used for this evaluation. We employed an FCN (DHS (Liu and Han, 2016), DSS (Hou et al., 2017), SRM (Wang et al., 2017b), or WSS (Wang et al., 2017a)) as the segmentation stream. For comparison, we also evaluated each of the used FCNs to see how it works for camouflaged object segmentation. Each was evaluated with three different models: the pre-trained model on saliency datasets (e.g., MSRA (Cheng et al., 2015), DUT-OMRON (Yang et al., 2013), and DUTS (Wang et al., 2017a)) and the models fine-tuned with CAMO or CAMO-COCO (the model is specified by the word inside the parentheses). Figures 7, 8, and Table 3 illustrate the obtained results of our experiments. We note that ANet-DHS (baseline), for example, denotes our proposed model employing DHS as the segmentation stream.

Figures 7 and 8 show that ANet yields better results than its employed original FCN (pre-trained), independently of the choice of the FCN. In particular, for non-camouflaged object images, the superiority of ANet over its original FCN is clear. This suggests that the fusion with the classification stream indeed works effectively to boost the segmentation accuracy.

5.4.2. Performance of Camouflaged Object Segmentation
It is interesting for the saliency community to see the capabilities of saliency systems to segment camouflaged objects (which should actually not be possible due to the different properties of those objects). Therefore, we show the results of FCNs using pre-trained models for salient object segmentation in Table 3. Although the results are moderate (the worst results among the compared methods for each FCN), they also illustrate the potential ability to transfer from salient object segmentation to camouflaged object segmentation due to the ambiguous border between these kinds of objects. This shows the possibility of using existing saliency models to tackle the camouflaged object segmentation problem.
Table 3 shows that the fine-tuned FCN models achieve around 60+% accuracy in Fβ and IOU (around 0.1 in MAE). Naturally, the model fine-tuned with the dataset used for testing performs better than the pre-trained model or the model fine-tuned with the other dataset. We also observe that the model fine-tuned with CAMO-COCO performs better than the one fine-tuned with CAMO. This is because CAMO is a subset of CAMO-COCO. These results mean that salient object segmentation methods are, to some extent, capable of segmenting even a non-target object such as a camouflaged object. This may come from the ambiguity of the differences between salient objects and camouflaged objects. This, simultaneously, also supports the use of an FCN developed for salient object segmentation as the segmentation stream in our ANet.

We then draw our attention to the performance of ANet using an FCN. Since the segmentation stream of ANet is trained using CAMO and its output is multiplied by the output of the classification stream, the segmentation ability of ANet is limited by that of the employed FCN fine-tuned with CAMO. In other words, the performance of ANet on CAMO is expected to be comparable with that of the employed FCN (fine-tuned with CAMO), which is confirmed in Table 3. When we evaluate ANet on CAMO-COCO, however, the performance of ANet can be better than that of the employed FCN (fine-tuned with CAMO). This can be explained by the nature of CAMO-COCO, where an image may not contain any camouflaged object, and thus the role of the classification stream becomes more crucial; in other words, the accuracy of the classification stream boosts the segmentation accuracy of ANet. The segmentation accuracy of the FCN, in contrast, is degraded because it is not trained on CAMO-COCO. A similar conclusion can be derived from the comparison between ANet and the employed FCN fine-tuned with CAMO-COCO. Accordingly, ANet works with comparable accuracy on both CAMO and CAMO-COCO, while each FCN works well only on one of the two datasets. Indeed, this can be observed in Table 3. This clearly demonstrates that ANet can flexibly maintain a good segmentation accuracy regardless of whether a camouflaged object is present in the input image or not.

We also statistically evaluated the significance of the differences using the t-test with a significance level of 90%. For the CAMO-COCO dataset, we confirmed that, on all the metrics, ANet either significantly outperforms the other methods or is at least similar to the best methods. For the CAMO dataset, we also confirmed that ANet is significantly better than the FCNs fine-tuned on CAMO-COCO, and the FCNs fine-tuned on CAMO are significantly better than ANet. This means that ANet's ability on the CAMO dataset lies between that of the FCNs fine-tuned on CAMO and that of the FCNs fine-tuned on CAMO-COCO. This is reasonable because the performance of the classification stream is not perfect, leading to a slight reduction of the performance on CAMO.

To summarize, ANet has good performance on both the CAMO and CAMO-COCO datasets. On the other hand, FCNs fine-tuned with CAMO-COCO have good performance only on CAMO-COCO but not on CAMO, whereas FCNs fine-tuned with CAMO have good performance only on CAMO but not on CAMO-COCO (not on both).
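The significance test mentioned above can be reproduced with a standard paired t-test over per-image scores, under the assumption that the compared methods are evaluated on the same test images; the 90% significance level corresponds to α = 0.1. The SciPy sketch below uses synthetic arrays as placeholders, not the paper's actual per-image results.

```python
import numpy as np
from scipy import stats

def paired_significance(scores_a: np.ndarray, scores_b: np.ndarray, alpha: float = 0.1) -> bool:
    """Return True if method A is significantly better than method B on paired
    per-image scores (paired t-test, significance level 1 - alpha = 90%)."""
    result = stats.ttest_rel(scores_a, scores_b)
    return bool(result.pvalue < alpha and scores_a.mean() > scores_b.mean())

# Synthetic example: per-image F-measure of two methods on 250 test images.
rng = np.random.default_rng(0)
anet_scores = rng.normal(0.81, 0.05, size=250)
fcn_scores = rng.normal(0.79, 0.05, size=250)
print(paired_significance(anet_scores, fcn_scores))
```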

Fig. 7. Visual comparison of our ANet employing different FCNs on natural camouflaged object images (CAMO dataset). From left to right, the input image and ground-truth are followed by the outputs obtained using ANet with DHS (Liu and Han, 2016), DHS (pre-trained), ANet with DSS (Hou et al., 2017), DSS (pre-trained), ANet with SRM (Wang et al., 2017b), SRM (pre-trained), ANet with WSS (Wang et al., 2017a), and WSS (pre-trained), in this order. The results obtained by ANet are surrounded with red rectangles.

Fig. 8. Visual comparison of our ANet employing different FCNs. From left to right, the input image and ground-truth are followed by the outputs obtained using ANet with DHS (Liu and Han, 2016), DHS (pre-trained), ANet with DSS (Hou et al., 2017), DSS (pre-trained), ANet with SRM (Wang et al., 2017b), SRM (pre-trained), ANet with WSS (Wang et al., 2017a), and WSS (pre-trained), in this order. The first eight rows are images with artificial camouflaged objects (CAMO dataset), and the last seven rows are images without camouflaged objects (COCO dataset). The results obtained by ANet are surrounded with red rectangles.

Table 4. The accuracy (%) of camouflaged object classification on the CAMO-COCO dataset.
Method: Accuracy
SVM-BoW (Fei-Fei and Perona, 2005): 54.0
AlexNet (Krizhevsky et al., 2012): 76.0
VGG-16 (Simonyan and Zisserman, 2014b): 88.6
ANet-DHS: 91.4
ANet-DSS: 89.2
ANet-SRM: 90.6
ANet-WSS: 89.6

Table 5. The average wall-clock time in milliseconds per image.
Method: FCN / ANet (baseline)
DHS (Liu and Han, 2016): 63 / 73
DSS (Hou et al., 2017): 76 / 79
SRM (Wang et al., 2017b): 81 / 86
WSS (Wang et al., 2017a): 61 / 65

5.4.3. Performance of Camouflaged Object Classification
In order to evaluate the level of accuracy achieved by our classification stream, we compared its performance with that of an SVM-based classification using the Bag-of-Words model (Fei-Fei and Perona, 2005) with SIFT features (Lowe, 1999) (denoted by SVM-BoW) and CNNs, including AlexNet (Krizhevsky et al., 2012) and VGG-16 (Simonyan and Zisserman, 2014b). For the CNNs, we fine-tuned them on our CAMO-COCO dataset from pre-trained models on the ImageNet dataset (Russakovsky et al., 2015). The results are shown in Table 4, indicating that the accuracy of our classification stream reaches around 90%, which outperforms SVM-BoW. The ANet baseline models are also significantly better than both AlexNet and VGG-16. On closer inspection, the classification stream achieves high accuracy with few training iterations (3 epochs).

5.4.4. Computational Efficiency
We further evaluated the computational efficiency of all the baseline models. We also evaluated the average running time of the original FCN models employed in our ANet. The results are shown in Table 5. We observe that stacking the classification stream does not incur much more processing time, indicating that ANet is able to keep the running time computationally efficient.

5.5. Discussion

5.5.1. Failure Cases
The accuracy of our baselines on CAMO for camouflaged object segmentation is far lower than the state-of-the-art accuracy for salient object segmentation, which can reach up to around 90% in Fβ. This is mainly caused by the insufficient ability of the segmentation stream when we transfer the domain from salient object segmentation to camouflaged object segmentation. Figure 9 shows some failure segmentation results obtained by our used FCNs where the input images involve challenging scenarios in CAMO (object appearance and background clutter in the first row; the shape complexity of insect legs in the second row; distraction in the third row).

5.5.2. Joint Training
The classification stream is expected to boost the segmentation accuracy in ANet; however, fusing the two streams by multiplication is still insufficient. A further, more effective fusion of the two streams should be explored to improve the segmentation accuracy.

For a two-stream network, jointly training the two streams after training each stream separately is known as a strategy to improve performance (Jain et al., 2017). Following this strategy, we jointly trained the ANet (baseline) using CAMO-COCO, where we used 10 epochs (10k iterations) with learning rates of 10−4 (for the segmentation stream) and 10−6 (for the classification stream). The other parameters were set similarly to training the two streams individually. The loss function for joint training is defined as the summation of the loss of the classification stream and that of the segmentation stream:

L = L_seg(ŷ_seg, y_seg) + L_cls(ŷ_cls, y_cls),   (2)

Fig. 9. Failure cases of FCNs on CAMO. Each case is shown in a horizontal triplet of images, from left to right: input image, ground-truth, and result by the best FCN (fine-tuned with CAMO) in this order.

where ŷ_seg, y_seg, ŷ_cls, and y_cls denote the outputs and ground-truth labels of the segmentation stream and the classification stream, respectively. L_cls is the soft-max loss of the classification stream, and L_seg is the specific loss of the corresponding FCN implemented in the segmentation stream.
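Equation (2) is simply the sum of the two stream losses. A minimal PyTorch rendering is given below, with binary cross-entropy standing in for the FCN-specific segmentation loss L_seg, which in the paper depends on the particular FCN that is plugged in.

```python
import torch
import torch.nn.functional as F

def joint_loss(seg_pred: torch.Tensor, seg_gt: torch.Tensor,
               cls_logits: torch.Tensor, cls_gt: torch.Tensor) -> torch.Tensor:
    """L = L_seg(y_hat_seg, y_seg) + L_cls(y_hat_cls, y_cls)  (Eq. 2).

    seg_pred:   (N, 1, H, W) segmentation map in [0, 1].
    seg_gt:     (N, 1, H, W) binary ground-truth mask.
    cls_logits: (N, 2) classification scores before soft-max.
    cls_gt:     (N,) image-level labels (1 = camouflaged, 0 = non-camouflaged).
    """
    l_seg = F.binary_cross_entropy(seg_pred, seg_gt.float())  # stand-in for the FCN-specific loss
    l_cls = F.cross_entropy(cls_logits, cls_gt.long())        # soft-max cross-entropy loss
    return l_seg + l_cls

# Toy usage.
loss = joint_loss(torch.rand(2, 1, 64, 64), (torch.rand(2, 1, 64, 64) > 0.5).float(),
                  torch.randn(2, 2), torch.tensor([1, 0]))
print(loss.item())
```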

Table 6 illustrates the change of performance obtained by introducing joint training to ANet. We see that, differently from other two-stream network cases, jointly training ANet does not lead to better performance for all methods. DHS and WSS reduce accuracy while DSS and SRM slightly increase accuracy (see Table 6). This can be explained as follows. Each stream of ANet has its own task, and thus using training data suitable for that task is mandatory to enhance its ability. During the joint training of ANet, however, the data used for training cannot be suitable for each of the tasks of the two streams, resulting in the streams interfering with each other. This is also facilitated by the branch structure of ANet. Accordingly, joint training potentially brings to ANet some collapse in improving the training of each stream.

Therefore, jointly training the two streams may not bring a gain to ANet, due to the opposite attributes of camouflaged objects and non-camouflaged objects. The two-step training approach is sufficient for ANet for the task of camouflaged object segmentation on the CAMO-COCO dataset. Further investigation of this issue is left for future work.

Table 6. Performance of joint training on the CAMO dataset and the CAMO-COCO dataset. The best results are shown in blue.
Columns, for each test dataset (CAMO first, then CAMO-COCO): MAE ⇓ | Adaptive Threshold: Fβ ⇑, IOU ⇑ | Fixed Threshold: Fβ ⇑, IOU ⇑

ANet-DHS (baseline): 0.130 0.626 0.437 0.631 0.423 | 0.072 0.812 0.712 0.814 0.705
ANet-DHS (+joint training): 0.201 0.509 0.314 0.490 0.130 | 0.109 0.749 0.628 0.758 0.564

ANet-DSS (baseline): 0.132 0.587 0.404 0.607 0.390 | 0.067 0.795 0.701 0.804 0.694
ANet-DSS (+joint training): 0.126 0.644 0.441 0.651 0.417 | 0.064 0.822 0.719 0.825 0.707

ANet-SRM (baseline): 0.126 0.654 0.475 0.662 0.466 | 0.069 0.826 0.732 0.830 0.727
ANet-SRM (+joint training): 0.123 0.680 0.481 0.682 0.454 | 0.063 0.839 0.733 0.841 0.726

ANet-WSS (baseline): 0.140 0.661 0.459 0.643 0.407 | 0.078 0.826 0.710 0.820 0.697
ANet-WSS (+joint training): 0.145 0.626 0.394 0.628 0.324 | 0.074 0.813 0.692 0.816 0.662

6. Conclusion

In this paper, we addressed the interesting yet challenging problem of camouflaged object segmentation by providing a new image dataset for camouflaged object segmentation, where each image is manually annotated with a pixel-wise ground-truth. We believe that our novel dataset will promote new advancements in camouflaged object segmentation. We also aim to explore camouflaged object segmentation in videos in the near future.

We proposed a simple and flexible end-to-end network, namely the Anabranch Network, for camouflaged object segmentation, where the classification stream and the segmentation stream are effectively combined to establish the baseline performance. To show the effectiveness of our proposed framework, which can boost the segmentation using classification, we applied it to different FCNs. Extensive experiments conducted on the newly built dataset demonstrate the superiority of the proposed network. In addition, our method is computationally efficient.

This paper focused only on regions, but the potential idea of utilizing classification for segmentation can be useful when applied to instance segmentation in appropriate ways. For example, instance classification can be combined with instance segmentation, where each instance is classified as to whether it is camouflaged or salient. This is left for future work.

Acknowledgment

This work is funded in part by a University of Dayton SEED Grant (US) and by the SOKENDAI Short-Stay Study Abroad Program (Japan). We also thank Vamshi Krishna Madaram and Zhe Huang (University of Dayton) for their support in the annotation of the CAMO dataset. We gratefully acknowledge NVIDIA for the support of GPUs.

References

Achanta, R., Hemami, S., Estrada, F., Susstrunk, S., 2009. Frequency-tuned salient region detection, in: Conference on Computer Vision and Pattern Recognition, pp. 1597–1604.
Cheng, M.M., Mitra, N.J., Huang, X., Torr, P.H.S., Hu, S.M., 2015. Global contrast based salient region detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 37, 569–582. doi:10.1109/TPAMI.2014.2345401.
Cheng, M.M., Zhang, G.X., Mitra, N., Huang, X., Hu, S.M., 2011. Global contrast based salient region detection, in: Conference on Computer Vision and Pattern Recognition, pp. 409–416.
Fei-Fei, L., Perona, P., 2005. A Bayesian hierarchical model for learning natural scene categories, in: Conference on Computer Vision and Pattern Recognition, pp. 524–531, vol. 2.
Feichtenhofer, C., Pinz, A., Wildes, R., 2016. Spatiotemporal residual networks for video action recognition, in: Advances in Neural Information Processing Systems, pp. 3468–3476.
Gallego, J., Bertolino, P., 2014. Foreground object segmentation for moving camera sequences based on foreground-background probabilistic models and prior probability maps, in: International Conference on Image Processing, pp. 3312–3316.
Galun, M., Sharon, E., Basri, R., Brandt, A., 2003. Texture segmentation by multiscale aggregation of filter responses and shape elements, in: International Conference on Computer Vision, pp. 716–723.
Han, J., Quan, R., Zhang, D., Nie, F., 2018a. Robust object co-segmentation using background prior. IEEE Transactions on Image Processing 27, 1639–1651.
Han, J., Zhang, D., Cheng, G., Liu, N., Xu, D., 2018b. Advanced deep-learning techniques for salient and category-specific object detection: A survey. IEEE Signal Processing Magazine 35, 84–100.
He, K., Gkioxari, G., Dollár, P., Girshick, R., 2017. Mask R-CNN, in: International Conference on Computer Vision, pp. 2980–2988.
Hou, Q., Cheng, M.M., Hu, X., Borji, A., Tu, Z., Torr, P., 2017. Deeply supervised salient object detection with short connections, in: Conference on Computer Vision and Pattern Recognition.
Itti, L., Koch, C., Niebur, E., 1998. A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 1254–1259.
Jain, S.D., Xiong, B., Grauman, K., 2017. FusionSeg: Learning to combine motion and appearance for fully automatic segmentation of generic objects in videos, in: Conference on Computer Vision and Pattern Recognition.
Jia, Y., Han, M., 2013. Category-independent object-level saliency detection, in: International Conference on Computer Vision, pp. 1761–1768.
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T., 2014. Caffe: Convolutional architecture for fast feature embedding, in: International Conference on Multimedia, pp. 675–678.
Kavitha, C., Rao, B.P., Govardhan, A., 2011. An efficient content based image retrieval using color and texture of image subblocks. International Journal of Engineering Science and Technology 3.
Koch, C., Ullman, S., 1987. Shifts in selective visual attention: Towards the underlying neural circuitry. Matters of Intelligence: Conceptual Structures in Cognitive Neuroscience, 115–141.
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. ImageNet classification with deep convolutional neural networks, in: Advances in Neural Information Processing Systems, pp. 1097–1105.
Le, T., Sugimoto, A., 2018. Video salient object detection using spatiotemporal deep features. IEEE Transactions on Image Processing 27, 5002–5015.
Le, T., Sugimoto, A., 2019. Semantic instance meets salient object: Study on video semantic salient instance segmentation, in: IEEE Winter Conference on Applications of Computer Vision, pp. 1779–1788.
Le, T.N., Sugimoto, A., 2017. Deeply supervised 3D recurrent FCN for salient object detection in videos, in: British Machine Vision Conference.
Li, G., Xie, Y., Lin, L., Yu, Y., 2017a. Instance-level salient object segmentation, in: Conference on Computer Vision and Pattern Recognition.
Li, G., Yu, Y., 2015. Visual saliency based on multiscale deep features, in: Conference on Computer Vision and Pattern Recognition, pp. 5455–5463.
Li, G., Yu, Y., 2016. Deep contrast learning for salient object detection, in: Conference on Computer Vision and Pattern Recognition, pp. 478–487.
Li, Y., Qi, H., Dai, J., Ji, X., Wei, Y., 2017b. Fully convolutional instance-aware semantic segmentation, in: Conference on Computer Vision and Pattern Recognition.
Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L., 2014. Microsoft COCO: Common objects in context, in: European Conference on Computer Vision, pp. 740–755.
Liu, N., Han, J., 2016. DHSNet: Deep hierarchical saliency network for salient object detection, in: Conference on Computer Vision and Pattern Recognition, pp. 678–686.
Liu, Z., Huang, K., Tan, T., 2012. Foreground object detection using top-down information based on EM framework. IEEE Transactions on Image Processing 21, 4204–4217.
Long, J., Shelhamer, E., Darrell, T., 2015. Fully convolutional networks for semantic segmentation, in: Conference on Computer Vision and Pattern Recognition, pp. 3431–3440.
Lowe, D.G., 1999. Object recognition from local scale-invariant features, in: International Conference on Computer Vision, pp. 1150–1157.
Margolin, R., Tal, A., Zelnik-Manor, L., 2013. What makes a patch distinct?, in: Conference on Computer Vision and Pattern Recognition, pp. 1139–1146.
P. Sengottuvelan, A.W., Shanmugam, A., 2008. Performance of decamouflaging through exploratory image analysis, in: International Conference on Emerging Trends in Engineering and Technology, pp. 6–10.
Pan, Y., Chen, Y., Fu, Q., Zhang, P., Xu, X., 2011. Study on the camouflaged target detection method based on 3D convexity. Modern Applied Science 5, 152.
Pia Bideau, E.L.M., 2016. It's moving! A probabilistic model for causal motion segmentation in moving camera videos, in: European Conference on Computer Vision.
Ren, S., He, K., Girshick, R., Sun, J., 2015. Faster R-CNN: Towards real-time object detection with region proposal networks, in: Advances in Neural Information Processing Systems, pp. 91–99.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1988. Learning representations by back-propagating errors, in: Neurocomputing: Foundations of Research, pp. 696–699.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L., 2015. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision 115, 211–252.
Shelhamer, E., Long, J., Darrell, T., 2016. Fully convolutional networks for semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence PP, 1–1.
Simonyan, K., Zisserman, A., 2014a. Two-stream convolutional networks for action recognition in videos, in: Advances in Neural Information Processing Systems, pp. 568–576.
Simonyan, K., Zisserman, A., 2014b. Very deep convolutional networks for large-scale image recognition. ILSVRC.
Singh, S., Dhawale, C., Misra, S., 2013. Survey of object detection methods in camouflaged image. IERI Procedia 4, 351–357.
Siricharoen, P., Aramvith, S., Chalidabhongse, T.H., Siddhichai, S., 2010. Robust outdoor human segmentation based on color-based statistical approach and edge combination, in: International Conference on Green Circuits and Systems, pp. 463–468. doi:10.1109/ICGCS.2010.5543017.
Song, L., Geng, W., 2010. A new camouflage texture evaluation method based on WSSIM and nature image features, in: International Conference on Multimedia Technology, pp. 1–4.
Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M., 2015. Learning spatiotemporal features with 3D convolutional networks, in: International Conference on Computer Vision, pp. 4489–4497.
Wang, L., Lu, H., Ruan, X., Yang, M.H., 2015. Deep networks for saliency detection via local estimation and global search, in: Conference on Computer Vision and Pattern Recognition, pp. 3183–3192.
Wang, L., Lu, H., Wang, Y., Feng, M., Wang, D., Yin, B., Ruan, X., 2017a. Learning to detect salient objects with image-level supervision, in: Conference on Computer Vision and Pattern Recognition.
Wang, L., Xiong, Y., Wang, Z., Qiao, Y., Lin, D., Tang, X., Van Gool, L., 2016. Temporal segment networks: Towards good practices for deep action recognition, in: European Conference on Computer Vision.
Wang, T., Borji, A., Zhang, L., Zhang, P., Lu, H., 2017b. A stagewise refinement model for detecting salient objects in images, in: International Conference on Computer Vision.
Wen, Y., Zhang, K., Li, Z., Qiao, Y., 2016. A discriminative feature learning approach for deep face recognition, in: European Conference on Computer Vision, pp. 499–515.
Xue, F., Yong, C., Xu, S., Dong, H., Luo, Y., Jia, W., 2016. Camouflage performance analysis and evaluation framework based on features fusion. Multimedia Tools and Applications 75, 4065–4082.
Yang, C., Zhang, L., Lu, H., Ruan, X., Yang, M.H., 2013. Saliency detection via graph-based manifold ranking, in: Conference on Computer Vision and Pattern Recognition, pp. 3166–3173.
Yao, X., Han, J., Zhang, D., Nie, F., 2017. Revisiting co-saliency detection: A novel approach based on two-stage multi-view spectral rotation co-clustering. IEEE Transactions on Image Processing 26, 3196–3209.
Yin, J., Han, Y., Hou, W., Li, J., 2011. Detection of the mobile object with camouflage color under dynamic background based on optical flow. Procedia Engineering 15, 2201–2205.
Zhang, J., Sclaroff, S., Lin, Z., Shen, X., Price, B., Mech, R., 2015. Minimum barrier salient object detection at 80 fps, in: International Conference on Computer Vision, pp. 1404–1412.
Zhou, B., Khosla, A., Lapedriza, À., Oliva, A., Torralba, A., 2014. Object detectors emerge in deep scene CNNs, in: International Conference on Learning Representations.