Optimization Scheme for Accurate Water Semantic Segmentation
Total Page:16
File Type:pdf, Size:1020Kb
Received September 2, 2019, accepted October 9, 2019, date of publication October 25, 2019, date of current version November 6, 2019. Digital Object Identifier 10.1109/ACCESS.2019.2949635 Multiscale Features Supported DeepLabV3C Optimization Scheme for Accurate Water Semantic Segmentation ZIYAO LI , RUI WANG , WEN ZHANG, FENGMIN HU, AND LINGKUI MENG School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China Corresponding author: Lingkui Meng ([email protected]) This work was supported by the National Key Research and Development Program of China. (No. 2017YFC0405806). ABSTRACT In the task of using deep learning semantic segmentation model to extract water from high-resolution remote sensing images, multiscale feature sensing and extraction have become critical factors that affect the accuracy of image classification tasks. A single-scale training mode will cause one-sided extraction results, which can lead to ``reverse'' errors and imprecise detail expression. Therefore, fusing multiscale features for pixel-level classification is the key to achieving accurate image segmentation. Based on this concept, this paper proposes a deep learning scheme to achieve fine extraction of image water bodies. The process includes multiscale feature perception splitting of images, a restructured deep learning network model, multiscale joint prediction, and postprocessing optimization performed by a fully connected conditional random field (CRF). According to the scale space concept of remote sensing, we apply hierarchical multiscale splitting processing to images. Then, we improve the structure of the image semantic segmentation model DeepLabV3C, an advanced image semantic segmentation model, and adjust the feature output layer of the model to multiscale features after weighted fusion. At the back end of the deep learning model, the water boundary details are optimized with the fully connected CRF. The proposed multiscale training method is well adapted to feature extraction for the different scale images in the model. In the multiscale output fusion, assigning different weights to the output features of each scale controls the influence of the various scale features on the water extraction results. We carried out a large number of water extraction experiments on GF1 remote sensing images. The results show that the method significantly improves the accuracy of water extraction and demonstrates the effectiveness of the method. INDEX TERMS Remote sensing, deep learning, semantic segmentation, water information extraction, multi- scales, DeepLabV3C. I. INTRODUCTION their features. Among the two types of methods, the index Accurate extraction of water information from remote- method (NDWI), which is a spectrum-photometric method, sensing images has always been an important research topic is widely used in the task of water extraction because of its in the field of remote sensing image analysis because it simplicity and universality. In actual appliances, especially in plays a vital role in national land/water resource monitoring the large-scale daily monitoring of water body nationwide, and environmental protection. The traditional remote sensing it is inevitable to process massive remote sensing images. image water body extraction methods are divided into two The traditional methods with more manual intervention are main types. One is based on the spectral characteristics of not able to guarantee the quality of data products. In par- water bodies and involves setting different thresholds and ticular, there is substantial random interference in images, using a water body index to classify and extract water infor- such as clouds, shadows, fog, etc. Even if some models mation [1]–[3]. The other is based on mathematical morphol- have good generality, there will also be an extraction loss ogy [4] and segments water body edges [5], [6] to extract on the details of water. These will greatly affect our mon- itoring and use efficiency. Therefore, it is of far-reaching The associate editor coordinating the review of this manuscript and significance to study a water extraction model that can meet approving it for publication was Jingchang Huang . the requirements of high precision and high generalization This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/ VOLUME 7, 2019 155787 Z. Li et al.: Multiscale Features Supported DeepLabV3C Optimization Scheme for Accurate Water Semantic Segmentation ability and effectively reduce the need for human inter- principle, the neural network model structure optimization vention. As deep learning in the field of computer vision and multifeature fusion and form a set of standardized pro- has developed and improved, intelligent pattern recogni- duction processes with application and promotion value. The tion based on image spatial features has increasingly been experimental results show that the accuracy index of this applied to remote sensing image target detection and pixel method is markedly improved and has good generalizabil- semantic segmentation tasks [7]–[9]. Compared with tradi- ity. This scheme focuses on multiscale features and uses tional methods, semantic segmentation methods based on a hierarchical multiscale splitting method combined with a deep learning can map pixels to semantics [10] and mine stretching algorithm to enrich the detailed features of the spectral features of deep remote sensing images that tradi- samples, thereby preserving the multiscale features of the tional methods cannot extract. Many researchers use semantic samples. We control both the scale of the sample data used segmentation models to achieve water body recognition and for model training and the input and output scales of the extraction tasks [11], [12]. Under the continuous optimization image data during inference. Simultaneously, by changing of complex neural network models, target feature mining the DeepLabV3C network structure, the model can better and learning occurs at deeper levels; thus, image classifi- perceive the global background context characteristics of the cation accuracy has improved continually. Moreover, in the image block. Finally, the extraction results are optimized remote sensing field, with the continuous improvement of with a conditional random field (CRF) to meet the accuracy image resolution, the textural details of surface features have requirements of microscopic and macroscopic features of also been revealed, and the accuracy requirements for water water bodies. extraction have also continually improved. Nonetheless, the The remaining sections of this article are organized as complexity of water background information combined with follows: The second section introduces the problems and interference from complex homologous and heterogeneous reviews related works from the literature. The third section (the same water body with different spectra) phenomena must introduces the DeepLabV3C model application optimization also be considered. Consequently, adopting complex deep scheme, including the image scale division specification, learning networks is very important to learn and identify the the application details of the optimized DeepLabV3C model features of high-resolution images. network structure, and the fully connected CRF. In the fourth Compared with the commonly used images for semantic section, we describe experiments and results comparison segmentation [13], [14], high-resolution satellite images have using a GF1 image as the research object. The experimen- large differences in the number of spectral bands, the extent of tal results and revealed problems are described in the fifth the image range, and the target scale. In particular, the target section, and we present conclusions in the sixth section. range and shape of water bodies are uncertain: a large area of water can cover the entire image segment, while a small water body may be expressed by only one or two pixels in II. RELATED WORKS the image. In [11], a large-scale block size of 512 × 512 is A. LIMITATIONS OF THE SPECTRUM-PHOTOMETRIC used for the experiments because the author believes that METHOD large-scale analysis has a good effect on the remote sensing In this section, we discuss the limitations of traditional meth- image segmentation model, allowing it to maximally per- ods. The spectrum-photometric method combined with the ceive the acceptance domain. However, in our study of water NDWI index is widely used as the main method for water extraction, we found that the model's generalization ability body extraction [15]. The principle is to compute a suitable and extraction accuracy did not achieve the expected results segmentation threshold based on a normalized ratio index, when using a single-scale sample for training. A large sample which is calculated using the green band and the near-infrared scale may reduce the precision of the details extracted during band. Many researchers have conducted extensive research the water extraction process, while small-scale samples are on the selection of thresholds [16]–[18]. Some researchers insufficient for identifying features clearly at the information have also used unsupervised classification methods (Isodata macro level. Thus, scale selection and processing has become or K-means) for water classification based on the NDWI a primary problem that affects the deep learning water extrac- index [19]–[21]. tion process. Consequently, determining how to design a rea- Although this type