
Journal of Marine Science and Engineering

Article

A Novel Cargo Ship Detection and Directional Discrimination Method for Remote Sensing Image Based on Lightweight Network

Pan Wang, Jianzhong Liu *, Yinbao Zhang, Zhiyang Zhi, Zhijian Cai and Nannan Song

School of Geoscience and Technology, Zhengzhou University, Zhengzhou 450001, China; [email protected] (P.W.); [email protected] (Y.Z.); [email protected] (Z.Z.); [email protected] (Z.C.); [email protected] (N.S.)
* Correspondence: [email protected]

Abstract: Recently, cargo ship detection in remote sensing images based on deep learning has been of great significance for cargo ship monitoring. However, the existing detection networks are not only unable to realize autonomous operation on spaceborne platforms, due to the limitation of computing and storage, but their detection results also lack the directional information of the cargo ship. In order to address the above problems, we propose a novel cargo ship detection and directional discrimination method for remote sensing images based on a lightweight network. Specifically, we design an efficient and lightweight feature extraction network called the one-shot aggregation and depthwise separable network (OSADSNet), which is inspired by one-shot feature aggregation modules and depthwise separable convolutions. Additionally, we combine the RPN with the K-Means++ algorithm to obtain the K-RPN, which produces more suitable region proposals for cargo ship detection. Furthermore, without introducing extra parameters, the directional discrimination of the cargo ship is transformed into a classification task, and the directional discrimination is completed when the detection task is completed. Experiments on a self-built remote sensing image cargo ship dataset indicate that our model can provide relatively accurate and fast detection of cargo ships (mAP of 91.96% and prediction time of 46 ms per image) and discriminate the directions (north, east, south, and west) of cargo ships, with fewer parameters (model size of 110 MB), which is more suitable for autonomous operation on spaceborne platforms. Therefore, the proposed method can meet the needs of cargo ship detection and directional discrimination in remote sensing images on spaceborne platforms.

Keywords: deep learning; remote sensing image; cargo ship; directional discrimination; faster-RCNN; lightweight

1. Introduction

With the progress of science and technology and the development of world trade, economic globalization has become the trend of world economic development. The trade links between countries are becoming closer and closer, and maritime transportation has become the main mode of transportation in foreign trade because of its large volume and low cost. Cargo ships are the main means of sea transportation. However, with the diversification of cargo ship types, the increasing size and speed of cargo ships, and the growing number of ports, the navigation environment of cargo ships is becoming more and more complicated, causing the course of cargo ships to deviate from their planned course. This leads to frequent problems such as channel blockage, cargo ship collisions, and increased navigation costs. Therefore, grasping the navigation information of cargo ships in time is of great significance for improving the navigation environment of cargo ships, ensuring their safe navigation, improving navigation efficiency, and shortening navigation time [1–4].


In recent years, with the continuous development of satellite remote sensing technology, a large number of remote sensing images have become available. Facing the vast ocean, remote sensing images have the advantages of wide coverage, high spatial resolution, fast update speed, and low cost, which makes the use of remote sensing images for real-time monitoring of ships in the ocean important [5,6]. However, in the face of massive remote sensing images, relying only on manual interpretation to obtain cargo ship information can no longer meet the needs of modern society because of the huge workload and low efficiency. Therefore, obtaining cargo ship target information quickly and accurately from massive remote sensing images has become an urgent problem to be solved. Traditional ship detection algorithms for remote sensing images rely heavily on manually designed features, which require designers to understand relevant professional knowledge. It is difficult for such algorithms to realize the rapid processing of massive amounts of remote sensing data, and their accuracy for cargo ships under complex backgrounds is low [7–9]. Compared with traditional object detection algorithms, algorithms based on deep learning can independently extract object features, avoiding the complexity of manually designed features, and the extracted features are more robust. Therefore, deep learning makes it possible to intellectualize cargo ship detection in remote sensing images.

Recently, with the substantial improvement of computer hardware performance and the emergence of large-scale training samples, deep learning techniques represented by convolutional neural networks have shown strong performance in object detection applications [10–12]. At present, object detection algorithms based on convolutional neural networks can be divided into two-stage and one-stage object detection algorithms according to whether region proposals are generated. Two-stage object detection algorithms, such as R-CNN [13], fast-RCNN [14], and faster-RCNN [15], first extract region proposals on the feature map where there may be objects, then further classify and locate these areas, and finally obtain the detection results. This type of method achieves high detection accuracy but cannot meet real-time requirements. One-stage object detection algorithms, such as YOLO [16–19] and SSD [20], use regression to establish an object detection framework, which removes the operation of generating region proposals. This type of method has a fast detection speed, but the detection accuracy is lower.

In view of the excellent performance of deep learning in computer vision, deep learning has been widely used in ship detection in remote sensing images [21–23]. However, there are some problems in remote sensing images, such as diversified ship sizes and complex backgrounds, which bring difficulties to ship detection. Many scholars have put forward their own solutions to such problems. To efficiently detect ships of various scales, Li et al. [24] proposed a hierarchical selective filtering layer based on the faster-RCNN algorithm to map features of different scales to the same scale space. Although this method can simultaneously detect inshore and offshore ships ranging from dozens of pixels to thousands of pixels, the detection results lack the directional information of the cargo ships.
Aiming at the problems of the high false alarm ratio and the great influence of the sea surface in traditional ship detection methods, Zhang et al. [25] adopted the idea of deep networks and proposed a fast regional-based convolutional neural network (R-CNN) method to detect ships from high-resolution remote sensing imagery. However, the proposed method contains a complicated pre-processing stage, which cannot meet the requirements of real-time detection. Wang et al. [26] proposed a fast-RCNN method based on an adversary strategy, which adopted the adversarial spatial transformer network (ASTN) module to improve the classification ability for ships in remote sensing images under complex backgrounds. However, it is hard to realize autonomous operation on spaceborne platforms.

Although automatic identification system (AIS) data can provide some information about a cargo ship, the information will be lost if the AIS is not kept switched on normally or fails, and some cargo ships are not even equipped with AIS. However,

the remote sensing satellite does not suffer from a similar situation, so remote sensing satellite images have become an important data source for obtaining cargo ship information. In the face of massive satellite remote sensing images, the transmission bandwidth from the satellite to the ground is limited. Therefore, the real-time intelligent processing of remote sensing images on spaceborne platforms must be the development trend in the future. However, due to the limitation of computing and storage on the spaceborne platform, complex and huge object detection models have not been widely used. In particular, although two-stage object detection algorithms have high accuracy, they have high model complexity and low detection efficiency, resulting in poor availability of such algorithms on the spaceborne platform.

The category and position information of cargo ships can be obtained by real-time detection of cargo ships in remote sensing images. However, due to the unique perspective of remote sensing images, the images also contain rich directional information about cargo ship targets. If the directional information of cargo ship targets can be obtained at the same time, it not only can ensure the safe navigation of cargo ships but can also help cargo ships navigate accurately along their routes and reduce navigation costs.

In this paper, cargo ships in remote sensing images are taken as the research objects, and the faster-RCNN algorithm is used as the basis to propose a novel cargo ship detection and directional discrimination method for remote sensing images based on a lightweight network. This model not only can accurately detect the category and position of cargo ship targets in remote sensing images in real time but can also discriminate the direction of cargo ships. Additionally, its lightweight design improves its availability on spaceborne platforms.

The rest of the paper is organized as follows: Section 2 introduces the proposed method, Section 3 describes the experimental dataset and the experimental results, and Section 4 summarizes the whole paper and puts forward some suggestions for future work.

2. Methodology

2.1. Faster-RCNN Network

Considering the application needs of cargo ship detection and direction recognition in remote sensing images, this paper adopts the faster-RCNN model, which offers a good balance of detection accuracy and detection speed, as the algorithm prototype. As shown in Figure 1, faster-RCNN is composed of three parts: the feature extraction network, the region proposal network (RPN), and the region-based convolutional neural network (R-CNN). The feature extraction network uses convolution to extract the features of the images, and the extracted features are shared by the following RPN and R-CNN. The RPN is a fully convolutional neural network used to generate region proposals for potential cargo ships in the images, and the following ROI pooling layer is used to unify region proposals of different sizes. The R-CNN head performs coordinate regression and classification on the extracted regions of interest to realize the detection of the cargo ships.

Figure 1. The architecture of the faster-RCNN network.
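To make this three-part pipeline concrete, the short sketch below assembles a stock faster-RCNN with torchvision. It is an illustrative stand-in, not the authors' implementation: it uses a ResNet50-FPN backbone rather than the OSADSNet backbone introduced later, and the class count of 13 is our assumption (the paper's 12 ship-direction categories plus background).

```python
# Illustrative sketch only: a stock faster-RCNN standing in for Figure 1.
# The paper replaces the backbone (OSADSNet) and anchor design (K-RPN).
import torch
import torchvision

# Hypothetical class count: 12 ship-direction categories plus background.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None, num_classes=13)
model.eval()

# One 800 x 800 RGB image, matching the input size used in the experiments.
images = [torch.rand(3, 800, 800)]
with torch.no_grad():
    detections = model(images)  # list of dicts: "boxes", "labels", "scores"
print(detections[0]["boxes"].shape, detections[0]["labels"].shape)
```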

2.2. Proposed Model

2.2.1. Model Overview

The model of cargo ship detection and directional discrimination in remote sensing images based on a lightweight network proposed in this paper is shown in Figure 2. In this model, the one-shot aggregation module [27] and depthwise separable convolution [28] are used to build an efficient and lightweight feature extraction network, namely the one-shot aggregation and depthwise separable network (OSADSNet). Second, the K-Means++ clustering algorithm is introduced into the RPN to generate high-quality candidate boxes. Finally, without introducing extra parameters, the direction recognition problem of the cargo ship in remote sensing images is converted into a classification problem, and the directional discrimination is completed while the detection task is accomplished.

Figure 2. The overall model structure.

2.2.2. Feature Extraction Network

The feature extraction network is an important module in the object detection model, and its performance directly influences the performance of the model. In this paper, we propose a novel and efficient feature extraction network, the one-shot aggregation and depthwise separable network (OSADSNet), to meet the specific needs of cargo ship detection in remote sensing images. The network draws on the idea of the one-shot aggregation module (Figure 3). Each layer of convolution has two connections: one is directly connected to the next layer to acquire a larger receptive field, and the other is connected to the last layer to aggregate features. Compared with the residual module [29], the network integrates more low-level features, and the connection is not the point-to-point addition of feature maps but the concatenation of channels, which reduces the number of parameters between layers. Compared with the dense connection module [30], the connections are optimized, and all features are aggregated only once before the last layer, which realizes the efficient aggregation of different convolutional layers, avoids feature redundancy, and runs at a faster speed. At the same time, the network uses depthwise separable convolution (Figure 4b) to replace standard convolution (Figure 4a), which reduces the model parameters while still extracting features effectively, finally realizing a lightweight model. Table 1 shows the implementation details of OSADSNet.


Figure 3. The one-shot aggregation module, where x0, x1, x2, x3, x4, and x5 denote the inputs of each convolution layer; y0 denotes the output; and concat denotes the feature concatenation operation. Rectified linear unit (ReLU) denotes an activation function. Batch normalization (BN) denotes centering and scaling activations within mini-batches. Conv2d denotes a 2D convolutional layer.

Figure 4. Comparison of different convolutions: (a) standard convolution; (b) depthwise separable convolution. Depthwise separable convolutions are special cases of standard convolution.

Table 1. An overview of the implementation details of OSADSNet.

Stage    Structure
Stage 1  3 × 3 conv, 64, stride = 2
         3 × 3 dsconv, 64, stride = 1
         3 × 3 dsconv, 64, stride = 1
         3 × 3 maxpool2d, stride = 2
Stage 2  [3 × 3 dsconv, 128, stride = 1] × 3
         concat & 1 × 1 conv, 256
Stage 3  [3 × 3 dsconv, 160, stride = 1] × 3
         concat & 1 × 1 conv, 512
Stage 4  [3 × 3 dsconv, 192, stride = 1] × 3
         concat & 1 × 1 conv, 768
Stage 5  [3 × 3 dsconv, 224, stride = 1] × 3
         concat & 1 × 1 conv, 1024
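The following PyTorch sketch shows one plausible reading of Figure 3 and Table 1: a depthwise separable block (3 × 3 depthwise followed by 1 × 1 pointwise convolution, each with BN and ReLU) and a stage that concatenates the stage input and every intermediate output once before a 1 × 1 convolution. Layer counts and channel widths follow Stage 2 of Table 1; the exact block composition inside OSADSNet is our assumption.

```python
import torch
import torch.nn as nn

class DSConv(nn.Module):
    """Depthwise separable convolution (Figure 4b): a 3 x 3 depthwise conv
    followed by a 1 x 1 pointwise conv, each with BN and ReLU."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

class OSAStage(nn.Module):
    """One-shot aggregation: run DSConv layers sequentially, then concatenate
    the stage input and all intermediate outputs once, before a 1 x 1 conv."""
    def __init__(self, in_ch, mid_ch, out_ch, num_layers=3):
        super().__init__()
        self.layers = nn.ModuleList(
            [DSConv(in_ch if i == 0 else mid_ch, mid_ch) for i in range(num_layers)])
        self.concat_conv = nn.Conv2d(in_ch + num_layers * mid_ch, out_ch, 1)

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            x = layer(x)
            feats.append(x)
        return self.concat_conv(torch.cat(feats, dim=1))

# Stage 2 of Table 1: three 3 x 3 dsconv,128 layers, then concat & 1 x 1 conv,256.
stage2 = OSAStage(in_ch=64, mid_ch=128, out_ch=256)
print(stage2(torch.rand(1, 64, 200, 200)).shape)  # -> torch.Size([1, 256, 200, 200])
```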

2.2.3. K-RPN

The faster-RCNN uses a region proposal network (RPN) to effectively generate high-quality region proposals, each of which corresponds to the probability and position information of a cargo ship. Meanwhile, the network shares the feature maps with the feature extraction network to shorten the computing time.

In order to locate objects of different sizes effectively, anchors of different scales are used in the RPN, but the original anchors are designed manually without prior information about cargo ship sizes in remote sensing images. Therefore, in this paper, the K-Means++ algorithm [31] is used to cluster the bounding boxes of cargo ships in the dataset to obtain anchors suitable for cargo ships. The structure of K-RPN is shown in Figure 5.

Figure 5. The structure of K-RPN in the proposed model.

In the process of clustering, the Euclidean distance metric leads to a greater error for larger bounding boxes than for smaller ones, resulting in a greater error in the intersection over union (IOU) between the anchors and the bounding boxes. In order to make the IOU of the anchors and the bounding boxes larger and obtain better anchors, the IOU between the bounding boxes and the cluster-center bounding boxes is used as the distance function of the K-Means++ algorithm:

d(box, centroid) = 1 − IOU(box, centroid)  (1)

where centroid represents the bounding boxes of the cluster centers; box represents the bounding boxes of the cargo ships; IOU represents the intersection over union of the cargo ship bounding boxes and the cluster center bounding boxes; and d represents the distance between the bounding boxes of cluster centers and the bounding boxes of cargo ships.
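As an illustration of this clustering step, the NumPy sketch below clusters (width, height) pairs with the 1 − IOU distance of Equation (1) and K-Means++ seeding. It is a minimal sketch, not the authors' code: `boxes` is a random stand-in for the annotated bounding-box sizes, and k = 9 matches the nine anchors reported later in Table 3.

```python
import numpy as np

def iou_wh(boxes, centroids):
    """IOU between (width, height) pairs, with boxes aligned at a common corner."""
    inter_w = np.minimum(boxes[:, None, 0], centroids[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    inter = inter_w * inter_h
    union = (boxes[:, 0] * boxes[:, 1])[:, None] \
          + (centroids[:, 0] * centroids[:, 1])[None, :] - inter
    return inter / union

def kmeanspp_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster box sizes with d = 1 - IOU (Equation (1)), K-Means++ seeding."""
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.integers(len(boxes))][None, :]
    while len(centroids) < k:
        # K-Means++ rule: sample the next seed with probability proportional
        # to the squared distance to the nearest existing center.
        d2 = (1.0 - iou_wh(boxes, centroids)).min(axis=1) ** 2
        idx = rng.choice(len(boxes), p=d2 / d2.sum())
        centroids = np.vstack([centroids, boxes[idx]])
    for _ in range(iters):
        assign = (1.0 - iou_wh(boxes, centroids)).argmin(axis=1)
        centroids = np.array([boxes[assign == j].mean(axis=0)
                              if np.any(assign == j) else centroids[j]
                              for j in range(k)])
    return centroids

# boxes: N x 2 array of annotated (width, height) pairs; random stand-in here.
boxes = np.abs(np.random.default_rng(1).normal(200.0, 80.0, size=(500, 2)))
print(np.round(kmeanspp_anchors(boxes)))  # nine clustered anchor sizes
```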

2.2.4. Directional Discrimination

Since remote sensing images are taken from above the cargo ships, the images contain not only the category and location information of the cargo ships but also their directional information. Mastering the course information of cargo ships is important for cargo ships to sail along the expected route, which not only can ensure the safe navigation of the cargo ships but also can help to reduce the cost of navigation. In this paper, the problem of directional discrimination of cargo ships is transformed into a classification problem. The direction of the cargo ship is divided into four directions: east, south, west, and north (Figure 6). The object detection model learns the directional information as part of the category labels, so it outputs the category and direction of cargo ships at the same time. In this way, the directional information of cargo ships can be obtained without adding extra parameters.


Figure 6. The division of direction. N, E, S, and W, respectively, denote the north, east, south, and west directions of the cargo ship.
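A minimal sketch of this label encoding follows, assuming (as Table 4 later suggests) that each of the three ship types introduced with the dataset is crossed with the four directions to give 12 detector classes; the helper names are hypothetical.

```python
# Fold direction into the class label so detection and directional
# discrimination share one classification head (no extra parameters).
SHIP_TYPES = ["bulk_carrier", "container", "tanker"]
DIRECTIONS = ["north", "east", "south", "west"]

def encode(ship_type: str, direction: str) -> int:
    """Map (type, direction) to one of 3 x 4 = 12 detector classes."""
    return SHIP_TYPES.index(ship_type) * len(DIRECTIONS) + DIRECTIONS.index(direction)

def decode(class_id: int) -> tuple:
    """Recover (type, direction) from the predicted class index."""
    return SHIP_TYPES[class_id // len(DIRECTIONS)], DIRECTIONS[class_id % len(DIRECTIONS)]

assert decode(encode("container", "south")) == ("container", "south")
```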

3. Experiments

3.1. Experimental Dataset

(1) Data collection. As far as we know, there is no publicly available remote sensing image cargo ship dataset with category and directional information. To detect cargo ships and discriminate their directions, it is necessary to collect corresponding remote sensing images. The remote sensing images selected in this experiment are from Google Earth. The image resolutions are at zoom levels 16, 17, and 18. The image bands are red, green, and blue. The data cover different backgrounds and various positional relationships, which can meet the needs of practical tasks. Due to the limitation of computer memory capacity, the collected images are cropped into 800 × 800 pixels, and the images containing three types of cargo ships, bulk carrier, container, and tanker, are filtered out, ensuring that each image contains at least one cargo ship. Examples are shown in Figure 7.

(2) Data annotation. In order to establish a cargo ship dataset of remote sensing images, it is necessary to annotate the collected remote sensing images for the training and testing of the models. In this paper, labeling software (Labelimg) [32] is used to annotate the cargo ships in remote sensing images. Labeled objects include bulk carrier, container, and tanker. According to the directions of the cargo ships, the objects are further divided into four categories, east, south, west, and north, and the angle range of each category is 90 degrees. After the labeling is completed, a corresponding label file is created, which records the location, category, and direction of each cargo ship. The annotated dataset is uniformly processed into the VOC2007 format [33] to provide a standard dataset for the training of models.

(3) Data augmentation and splitting. In the process of model training, the larger and more comprehensive the dataset, the stronger the model recognition ability. Therefore, in this paper, the collected remote sensing images are rotated clockwise by 90 degrees, 180 degrees, and 270 degrees, as well as flipped horizontally and vertically, to expand the data; a sketch of the corresponding label adjustment follows the table below. The size of the dataset is thus expanded six times, and the directional labels are adjusted accordingly. Finally, 15,654 remote sensing images are obtained. Table 2 shows the details of the various cargo ships in the dataset. The dataset is randomly divided into training, validation, and test sets according to the ratio of 8:1:1.

Figure 7. Examples of cargo ships in remote sensing images. (a) bulk carrier; (b) container; (c) tanker.

Table 2. The number of cargo ships used per category after applying augmentation techniques.

Category      East   South  West   North  Total
Bulk carrier  1505   1423   1505   1423   5856
Container     2207   2299   2207   2299   9012
Tanker        2013   1989   2013   1989   8004
Total         5725   5711   5725   5711   22,872
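The direction labels must be remapped consistently with each augmentation. A small sketch of that bookkeeping, under the assumption that a 90-degree clockwise image rotation advances the heading one compass step, a horizontal flip mirrors east/west, and a vertical flip mirrors north/south:

```python
# Sketch of the direction-label adjustment for the augmentations above.
CLOCKWISE = {"north": "east", "east": "south", "south": "west", "west": "north"}

def rotate_cw(direction: str, times: int = 1) -> str:
    """Heading after rotating the image clockwise by times * 90 degrees."""
    for _ in range(times % 4):
        direction = CLOCKWISE[direction]
    return direction

def flip(direction: str, axis: str) -> str:
    """Heading after mirroring the image horizontally or vertically."""
    if axis == "horizontal":  # left-right mirror swaps east and west
        return {"east": "west", "west": "east"}.get(direction, direction)
    return {"north": "south", "south": "north"}.get(direction, direction)

assert rotate_cw("north", 3) == "west"
assert flip("east", "horizontal") == "west"
```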


3.2. The Anchors Clustering

To obtain anchors suitable for cargo ships, this paper introduces the K-Means++ algorithm into the RPN to propose the K-RPN, and the K-Means++ algorithm is used to automatically generate appropriate anchors instead of the manual method. If the anchors are not selected properly, the final detection results will be affected. Figure 8 shows the width and height distribution of cargo ships in the dataset and the clustering results of K-Means++. Table 3 shows the comparison between the original anchors and the K-Means++ anchors.

Figure 8. The distribution of width and height of annotated bounding boxes in the cargo ship dataset. The red dots represent the clustering results.

Table 3. Comparison between the original anchors and the K-Means++ anchors.

                   Width    91   128   181   181   256   362   362   512   724
Original Anchors   Height  181   128    91   362   256   181   724   512   362
                   Width    37   128   101   103   114   210   259   279   474
K-Means++ Anchors  Height   88   128    40   295   109   207   478    97   261

3.3. Implementation Details

The experiments are conducted on a Dell T3640 workstation. The operating system is Ubuntu 16.04 LTS. The model is written in Python and supported by Torch on the backend. The model input size is set to 800 × 800. Stochastic gradient descent is used in training to minimize the loss function. The initial learning rate is set to 0.025, and the learning rate is reduced by a factor of 0.1 at 30,000 and 40,000 iterations. The batch size is two. The value of momentum is set to 0.9. The value of weight decay is set to 0.0005. The model is trained for 50,000 iterations. Figure 9 shows the change of loss during training.
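The sketch below reproduces this optimization schedule in PyTorch, assuming the milestones are applied with an iteration-based step scheduler; `model` is a placeholder module standing in for the detector, and the loss is a stand-in for the detection loss.

```python
import torch

model = torch.nn.Conv2d(3, 8, 3)  # placeholder for the detector
optimizer = torch.optim.SGD(model.parameters(), lr=0.025,
                            momentum=0.9, weight_decay=0.0005)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[30_000, 40_000], gamma=0.1)

for iteration in range(50_000):
    optimizer.zero_grad()
    loss = model(torch.rand(2, 3, 32, 32)).mean()  # stand-in detection loss
    loss.backward()
    optimizer.step()
    scheduler.step()  # lr: 0.025 -> 0.0025 at 30k -> 0.00025 at 40k iterations
```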

Figure 9. The curve of the training loss.

3.4. Evaluation Metrics

This paper uses four evaluation metrics: precision (P), recall (R), average precision (AP), and mean average precision (mAP) to evaluate the performance of the model. Precision indicates the percentage of true positives of cargo ships in the sum of true positives and false positives of cargo ships. Recall indicates the percentage of true positives of cargo ships relative to the ground truths of cargo ships. AP is the area under the precision-recall curve, which is a popular evaluation metric for object detection and is often used to measure the advantages and disadvantages of object detection models. mAP indicates the mean of AP across all classes.

Precision:

P = Tp / (Tp + Fp)  (2)

Recall:

R = Tp / (Tp + Fn)  (3)

AP:

AP = ∫₀¹ P(R) dR  (4)

mAP:

mAP = (1/n) ∑ᵢ₌₁ⁿ AP(i)  (5)

where Tp represents the number of cargo ships correctly identified in the detection results, Fp represents the number of cargo ships falsely identified in the detection results, Fn represents the number of cargo ships that have been missed, and n represents the number of types of cargo ships.
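As a worked check of Equations (2) and (3), the snippet below reproduces the precision and recall of the first row of Table 4 from its counts; the precision-recall points used for the AP integral are hypothetical, shown only to illustrate the numerical form of Equation (4).

```python
import numpy as np

# Bulk_Carrier_North in Table 4: GT = 159, Tp = 150, Fp = 14, Fn = 9.
Tp, Fp, Fn = 150, 14, 9
P = Tp / (Tp + Fp)  # 0.9146 -> 91.46%, matching Table 4
R = Tp / (Tp + Fn)  # 0.9434 -> 94.34%, matching Table 4
print(f"P = {P:.2%}, R = {R:.2%}")

# Equation (4) as a discrete integral over a hypothetical PR curve:
recall = np.array([0.0, 0.5, 0.9434])
precision = np.array([1.0, 0.96, 0.9146])
AP = np.trapz(precision, recall)  # illustration only, not a reported value
```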

3.5. Experimental Results

3.5.1. Performance of Different Models

To quantitatively analyze the effectiveness of the proposed model, the self-built dataset is used to train and test the model, and the evaluation metrics P, R, and AP are used to evaluate the test results. Table 4 lists the test results of our model on the test set.

Table 4. The number of cargo ships in the test set results and the evaluation of our model.

Class               GT   Tp   Fp  Fn  P (%)   R (%)   AP (%)
Bulk_Carrier_North  159  150  14   9  91.46   94.34   93.46
Bulk_Carrier_East   135  130  11   5  92.20   96.30   94.94
Bulk_Carrier_South  122  117  26   5  81.82   95.90   95.20
Bulk_Carrier_West   142  133   6   9  95.68   93.66   93.39
Container_North     192  181  27  11  87.02   94.27   93.17
Container_East      203  190  31  13  85.97   93.60   93.00
Container_South     226  207  34  19  85.89   91.59   90.10
Container_West      222  207  24  15  89.61   93.24   92.26
Tanker_North        207  185  41  22  81.86   89.37   88.10
Tanker_East         192  176  19  16  90.26   91.67   91.26
Tanker_South        205  184  26  21  87.62   89.76   88.77
Tanker_West         202  183  31  19  85.51   90.59   89.87

Table 5 gives statistics on the model size, training duration, mAP, and single-image prediction duration of different models. As can be seen from the experimental results, in terms of model size, our model has fewer parameters (model size of 110 MB) and a shorter training time (79.5 min), so it is more suitable for deployment on the spaceborne platform. In terms of detection accuracy, our model achieves a high detection accuracy (mAP of 91.96%), close to that of the two-stage object detection algorithm faster-RCNN (mAP of 94.41%). In terms of detection speed, our model maintains a fast detection speed (prediction time of 46 ms per image) and meets the requirements of real-time detection.

Table 5. Comparison of test performance between different models.

Model        Feature Extraction  Region Proposal  Weight   Training  mAP     Input      Prediction
             Network             Network          Size/MB  Time/min          Size       Time/ms
Faster-RCNN  ResNet50            RPN              319.5    100.4     94.41%  800 × 800  56
YOLOv3       DarkNet53           None             235.0    758.0     83.98%  800 × 800  41
Ours         OSADSNet            K-RPN            110.0    79.5      91.96%  800 × 800  46

Figure 10 shows the detection results of different models on the test set. It can be seen from the figure that our model and faster-RCNN complete the classification, positioning, and directional discrimination of cargo ships in remote sensing images well, while the YOLOv3 model shows varying degrees of missed detections.



(a) Faster-RCNN

(b) YOLOv3


(c) Our Model

Figure 10. The experiment results for cargo ships. (a–c) are the detection results of cargo ships by faster-RCNN, YOLOv3, and our model, respectively.

3.5.2. Performance of Other Remote Sensing Images

Given that Google Earth images are the integration of multiple satellite images and aerial images, this paper conducts tests on high-resolution remote sensing images of multiple satellites, including Skysat 1.0 m/pixel resolution images (Figure 11), Deimos-2 0.5~2 m/pixel resolution images (Figure 12), and QuickBird 0.5~2 m/pixel resolution images (Figure 13), to verify the performance of our model on remote sensing images of a single source.

Figure 11. The Skysat image of Tianjin Port.

Figure 12. The Deimos-2 image of Dalian Port.


Figure 13. The QuickBird image of Pescara Port.

4. Discussion

Using remote sensing technology to obtain remote sensing images of cargo ships, combined with deep learning object detection algorithms to obtain the classification, position, and directional information of the cargo ship in real time, is of great significance to cargo ship monitoring. In this paper, to further excavate the directional information of cargo ships in remote sensing images, by using the unique perspective of remote sensing images, the directional recognition problem of cargo ships is transformed into a classification problem, so that the model can complete the directional discrimination while completing the detection task. By extracting the heatmap of the feature map corresponding to the detection box, it is found that the model pays more attention to the head of the cargo ship, which is consistent with the actual judgment of the direction of the cargo ship (Table 6). At the same time, real-time processing of remote sensing images on the spaceborne platform is the future trend. However, due to the limited computing and storage resources of the spaceborne platform, complex and huge target detection models have not been widely used. Therefore, we propose a novel cargo ship detection and directional discrimination method for remote sensing images based on a lightweight network, which has important practical significance.
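The paper does not specify how its heatmaps are produced; the sketch below shows one common visualization, given purely as an assumption: average the backbone feature map over channels, upsample it to image size, and normalize it to [0, 1], so high values mark the regions the model attends to.

```python
import torch
import torch.nn.functional as F

def feature_heatmap(feature_map: torch.Tensor, image_hw: tuple) -> torch.Tensor:
    """Channel-mean heatmap of a backbone feature map (illustrative stand-in)."""
    heat = feature_map.mean(dim=1, keepdim=True)        # (1, C, h, w) -> (1, 1, h, w)
    heat = F.interpolate(heat, size=image_hw, mode="bilinear", align_corners=False)
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)  # normalize
    return heat[0, 0]  # (H, W) attention-like map

print(feature_heatmap(torch.rand(1, 1024, 25, 25), (800, 800)).shape)
```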


Table 6. Detection results and corresponding heatmaps.

Detection Result  Region Proposal  Heatmap  Class Prediction  Direction Prediction
(image)           (image)          (image)  Bulk carrier      East
(image)           (image)          (image)  Container         South
(image)           (image)          (image)  Tanker            North

5. Conclusions

This paper proposes a novel cargo ship detection and directional discrimination method for remote sensing images based on a lightweight network, which can efficiently and accurately classify, locate, and discriminate the direction of cargo ships in remote sensing images. Aiming at the problem that a complex and large object detection network is not conducive to real-time autonomous operation on the spaceborne platform, we use one-shot feature aggregation modules and depthwise separable convolutions to design an efficient and lightweight feature extraction network, namely the one-shot aggregation and depthwise separable network (OSADSNet). At the same time, the K-Means++ algorithm is introduced into the RPN to form the K-RPN and generate more suitable region proposals for cargo ship detection. The current detection results of cargo ships in remote sensing images output the category and location of the cargo ship, but the directional information of the cargo ship is lacking. To solve this problem, we transform the direction recognition problem of the cargo ship in the remote sensing image into a classification problem without introducing additional parameters, which completes the directional estimation while completing the detection task. Finally, a comparative experiment is conducted based on the self-built dataset. Experimental results show that our model can meet the requirements of spaceborne platform cargo ship detection and directional discrimination in terms of model size (110 MB), detection accuracy (mAP of 91.96%), and detection speed (prediction time of 46 ms per image).
(110 dataset.(110 MB), MB)platform MB)platform MB)platform Experimental Experimental, detection Experimental,detection ,detection detection cargo cargo cargoaccuracy accuracy ship accuracy shipaccuracy results ship resultsresults detection detection detection ((mAPshow mAP( show(mAPshowmAP of andofthatofandthatofthatand 91.96%),91.96%) 91.96%) directional91.96%)our directional ourdirectionalour model model,model and,a ,a ndandnddetection discriminationcan detection discrimination can detectiondiscriminationcandetection meet meetmeet speed the speed thespeed thespeed requirements requirements(predictionin requirementsin in(prediction terms (predictionterms(prediction terms of of of timemodel model of modeltime ofof time timespaceborne of spacebornespaceborne size46 of size of sizeof ms46 46 (110 46(110 per(110ms msms MB)platform image).perMB) platform perplatformMB)per, image) ,detection image),detectionimage) detection Considering cargo cargocargo. .Considering . Considering Considering accuracy ship accuracy shipaccuracyship detectionthat detectiondetection ( GooglemAP ( thatmAP ( thatmAPthat ofGoogleandGoogleofandGoogleand ofEarth91.96%) 91.96%) directional91.96%) directional directional Earth imagesEarth Earth, ,a ,and aimagesnd imagesnd images discrimination aredetection discrimination detectiondiscrimination detection the are are are integration the the speed thespeed speedintegration integration integrationin in in(prediction terms(prediction terms (prediction ofterms multiple of of ofof of model multiple modelmultiple modelmultipletime time satellitetime size of size ofsize ofsatellite 46 satellite 46satellite(110 46images(110 (110ms ms ms MB) perMB) MB)imagesper imagesper images and , image) ,detection image), detection image)detection aerial and and and. 
.Considering .aerialConsidering images, aerial Consideringaccuracyaerial accuracy accuracy images, images, images, to ( test mAP( that(mAP mAPthat tothat to theto GoogletestoftestGoogleoftestofGoogle effectiveness91.96%) 91.96%)the 91.96%)the the Eartheffectiveness effectivenessEarth effectivenessEarth, ,a , and aimagesnd ndimages images ofdetection detectiondetection the ofare of are modelof arethe the the thespeed themodel speed speedmodel integrationmodel forintegration integration (predictionremote (predictionfor (predictionfor for remote remote remote of sensingof ofmultiple multiple timemultiple sensing timesensing timesensing imagesof of ofsatellite 46 satelliteimages 46satelliteimages 46images ms ms thatms per imagesper thatper images arethat imagesthat image) image) image)are from are are and fromand fromand .from a.Considering .aerial ConsideringsingleaerialConsidering aaerial a singlea single single images, spaceborneimages, images, space- space- space-that that thatto to to testbGooglebtestGoogleornebGoogletestorneorne the the theplatform, platform,effectivenessEarth platform,effectivenessEarth Eartheffectiveness images images images remote remoteremote of are of are ofarethe sensingthe the sensingthesensingthe themodel model integrationmodel integration integration images imagesforimages for for remote remote remote offrom of fromoffrom multiple multiple multiplesensing sensingdifferent sensing differentdifferent satellite satelliteimages satelliteimages imagessatellites satellitessatellites imagesthat images thatimages that are are areare and fromandselected andfrom selected selectedfrom aerial aerial aeriala a singlea single forsingle images, for forimages, images, testing, testing,space- testing,space- space- to to to bandtestandbtestorneandtestborneorne thethe the the thetheplatform, platform, effectivenessexperimental platform,effectiveness experimental effectivenessexperimental remote remote remote of resultsof resultsof resultsthe sensingthe thesensing sensing model model modelshow show show images images for imagesthefor forthe the remote remoteeffectiveness remoteeffectiveness effectivenessfrom from from sensing sensingdifferent sensingdifferent different of of imagesof images the imagesthesatellites thesatellites satellites model. model. model. that that that are are are are fromselected fromselected fromselected a a singlea single forsingle for for testing, testing,space- testing,space- space- andbandbornebandorneorne the the theplatform, platform,experimentalplatform, experimental experimental remote remoteremote results results resultssensing sensingsensing show show show images images imagesthe the the effectiveness effectiveness effectivenessfrom fromfrom different differentdifferent of of ofthe thesatellites the satellitessatellites model. model. model. are areare selected selectedselected for forfor testing, testing,testing, andandand the the the experimental experimental experimental results results results show show show the the the effectiveness effectiveness effectiveness of of of the the the model. model. model.
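To make the backbone design concrete, the following is a minimal PyTorch sketch of the two building blocks named above, a depthwise separable convolution and a one-shot aggregation module; the class names, channel settings, and layer counts are illustrative assumptions, not the released OSADSNet implementation.

```python
# A minimal sketch, assuming illustrative names and channel sizes;
# not the released OSADSNet implementation.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """A 3x3 depthwise convolution followed by a 1x1 pointwise convolution."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))

class OSAModule(nn.Module):
    """One-shot aggregation: run a chain of convolutions, concatenate the
    input and every intermediate output once, then fuse with a 1x1 conv."""
    def __init__(self, in_ch, stage_ch, out_ch, num_layers=5):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(num_layers):
            self.layers.append(DepthwiseSeparableConv(ch, stage_ch))
            ch = stage_ch
        self.fuse = nn.Conv2d(in_ch + num_layers * stage_ch, out_ch, 1, bias=False)

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            x = layer(x)
            feats.append(x)
        return self.fuse(torch.cat(feats, dim=1))

# Example: a 64-channel feature map passes through one OSA module.
x = torch.randn(1, 64, 56, 56)
print(OSAModule(64, 32, 128)(x).shape)  # torch.Size([1, 128, 56, 56])
```

The parameter savings come from the factorization itself: a standard k x k convolution needs k^2 * Cin * Cout weights, whereas the depthwise separable version needs only k^2 * Cin + Cin * Cout.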
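The K-RPN idea of clustering ground-truth box sizes into anchor priors can likewise be sketched; the 1 - IoU distance, the helper names, and the Lloyd-style refinement below are our assumptions about how K-Mean++ seeding might be applied, not the paper's exact procedure.

```python
# A sketch of K-Mean++-seeded clustering of ground-truth box sizes into
# anchor priors. The 1 - IoU distance and all names are our assumptions.
import numpy as np

def iou_wh(box, clusters):
    """IoU between one (w, h) pair and k (w, h) clusters, with all boxes
    axis-aligned and sharing a common corner."""
    inter = np.minimum(box[0], clusters[:, 0]) * np.minimum(box[1], clusters[:, 1])
    union = box[0] * box[1] + clusters[:, 0] * clusters[:, 1] - inter
    return inter / union

def kmeanspp_anchors(boxes, k, iters=100, seed=0):
    """boxes: (n, 2) array of ground-truth (w, h); returns k anchor sizes."""
    rng = np.random.default_rng(seed)
    centers = boxes[rng.integers(len(boxes))][None, :]
    # K-Mean++ seeding: pick each new center with probability proportional
    # to its squared distance (here 1 - IoU) from the nearest chosen center.
    while len(centers) < k:
        d2 = np.array([np.min(1 - iou_wh(b, centers)) for b in boxes]) ** 2
        centers = np.vstack([centers, boxes[rng.choice(len(boxes), p=d2 / d2.sum())]])
    # Standard Lloyd refinement under the same distance.
    for _ in range(iters):
        assign = np.array([np.argmax(iou_wh(b, centers)) for b in boxes])
        centers = np.array([boxes[assign == j].mean(axis=0) if np.any(assign == j)
                            else centers[j] for j in range(k)])
    return centers

# Example with synthetic box sizes (pixels).
boxes = np.abs(np.random.default_rng(1).normal([80, 30], [20, 8], (200, 2)))
print(kmeanspp_anchors(boxes, k=9))
```

The resulting k (width, height) pairs would replace the hand-tuned anchor scales and aspect ratios of the standard RPN.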


This study has some limitations. The proposed method completes the classification, positioning, and directional discrimination of cargo ships in remote sensing images well, but the direction of the cargo ship is divided into only four classes: east, south, west, and north. Remote sensing images of cargo ships contain rich information, and this paper mines only part of it. Therefore, the direction of the cargo ship can be further refined to obtain more detailed directional information. In the future, we will consider refining the direction of the cargo ship into north, northeast, east, southeast, south, southwest, west, and northwest, or even into specific degrees.
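As a sketch of how such a refinement could work without adding parameters, the snippet below maps a heading angle to a direction-labeled class name at annotation time, so the ordinary classification head carries the directional label; the binning layout and class-name scheme are hypothetical, not the paper's annotation tool output.

```python
# A hypothetical label-construction helper for direction-as-classification;
# the bin layout and class-name scheme are illustrative assumptions.
def direction_class(heading_deg, num_bins=4):
    """Map a heading angle (0 = north, clockwise, degrees) to a
    direction-labeled class name, so the ordinary classification head
    carries the directional information without any extra parameters."""
    names = {4: ["north", "east", "south", "west"],
             8: ["north", "northeast", "east", "southeast",
                 "south", "southwest", "west", "northwest"]}[num_bins]
    width = 360.0 / num_bins
    # Shift by half a bin so headings near 0/360 all fall into "north".
    idx = int(((heading_deg + width / 2.0) % 360.0) // width)
    return f"cargo-ship-{names[idx]}"

print(direction_class(87))               # cargo-ship-east
print(direction_class(200, num_bins=8))  # cargo-ship-south
```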

Author Contributions: Conceptualization, P.W., J.L., and Y.Z.; methodology, P.W. and J.L.; software, P.W.; validation, P.W., J.L., and Y.Z.; data curation, P.W., J.L., Y.Z., Z.Z., Z.C., and N.S.; writing—original draft preparation, P.W. and J.L.; writing—review and editing, P.W., J.L., and Y.Z.; visualization, P.W.; supervision, P.W. and J.L.; project administration, P.W., J.L., Y.Z., Z.Z., Z.C., and N.S. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by High-Level Talent Research of Zhengzhou University, grant number 32310216.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: The data are available on request from the corresponding author ([email protected]).

Acknowledgments: The authors would like to thank the anonymous reviewers and editors for their useful comments and suggestions, which were of great help in improving this paper.

Conflicts of Interest: The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AP      Average precision
AIS     Automatic identification system
BN      Batch normalization
CNN     Convolutional neural network
Fn      False negatives
Fp      False positives
GT      Ground truth
IOU     Intersection over union
mAP     Mean average precision
P       Precision
R       Recall
R-CNN   Region-based convolutional neural networks
ReLU    Rectified linear unit
ROI     Regions of interest
RPN     Region proposal network
SSD     Single shot multibox detector
Tp      True positives
YOLO    You only look once

References

1. Tang, J.X.; Deng, C.W.; Huang, G.B.; Zhao, B.J. Compressed-Domain Ship Detection on Spaceborne Optical Image Using Deep Neural Network and Extreme Learning Machine. IEEE Trans. Geosci. Remote Sens. 2015, 53, 1174–1185. [CrossRef]
2. Cheng, G.; Han, J.W. A survey on object detection in optical remote sensing images. ISPRS-J. Photogramm. Remote Sens. 2016, 117, 11–28. [CrossRef]
3. Yang, X.; Sun, H.; Fu, K.; Yang, J.R.; Sun, X.; Yan, M.L.; Guo, Z. Automatic Ship Detection in Remote Sensing Images from Google Earth of Complex Scenes Based on Multiscale Rotation Dense Feature Pyramid Networks. Remote Sens. 2018, 10, 132. [CrossRef]
4. Liu, T.; Pang, B.; Zhang, L.; Yang, W.; Sun, X.Q. Sea Surface Object Detection Algorithm Based on YOLO v4 Fused with Reverse Depthwise Separable Convolution (RDSC) for USV. J. Mar. Sci. Eng. 2021, 9, 753. [CrossRef]
5. Li, X.G.; Li, Z.X.; Lv, S.S.; Cao, J.; Pan, M.; Ma, Q.; Yu, H.B. Ship detection of optical remote sensing image in multiple scenes. Int. J. Remote Sens. 2021, 29. [CrossRef]
6. Wang, Q.; Shen, F.Y.; Cheng, L.F.; Jiang, J.F.; He, G.H.; Sheng, W.G.; Jing, N.F.; Mao, Z.G. Ship detection based on fused features and rebuilt YOLOv3 networks in optical remote-sensing images. Int. J. Remote Sens. 2021, 42, 520–536. [CrossRef]
7. Yokoya, N.; Iwasaki, A. Object localization based on sparse representation for remote sensing imagery. In Proceedings of the 2014 IEEE International Geoscience and Remote Sensing Symposium, Quebec City, QC, Canada, 13–18 July 2014; pp. 2293–2296.
8. Li, Z.M.; Yang, D.Q.; Chen, Z.Z. Multi-Layer Sparse Coding Based Ship Detection for Remote Sensing Images. In Proceedings of the 2015 IEEE International Conference on Information Reuse and Integration, San Francisco, CA, USA, 13–15 August 2015; pp. 122–125.
9. Zhou, H.T.; Zhuang, Y.; Chen, L.; Shi, H. Ship Detection in Optical Satellite Images Based on Sparse Representation. In Signal and Information Processing, Networking and Computers; Sun, S., Chen, N., Tian, T., Eds.; Springer, 2018; Volume 473, pp. 164–171.
10. Bhagya, C.; Shyna, A. An Overview of Deep Learning Based Object Detection Techniques. In Proceedings of the 2019 1st International Conference on Innovations in Information and Communication Technology (ICIICT), Chennai, India, 25–26 April 2019.
11. Wu, X.W.; Sahoo, D.; Hoi, S.C.H. Recent advances in deep learning for object detection. Neurocomputing 2020, 396, 39–64. [CrossRef]
12. Kwak, N.-J.; Kim, D. Object detection technology trend and development direction using deep learning. Int. J. Adv. Cult. Technol. 2020, 8, 119–128.
13. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
14. Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision, Las Condes, Chile, 11–18 December 2015; IEEE: New York, NY, USA, 2015; pp. 1440–1448.
15. Ren, S.Q.; He, K.M.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [CrossRef] [PubMed]
16. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE: New York, NY, USA, 2016; pp. 779–788.
17. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; IEEE: New York, NY, USA, 2017; pp. 6517–6525.
18. Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767.
19. Bochkovskiy, A.; Chien-Yao, W.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934.
20. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Computer Vision—ECCV 2016, Part I; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer International Publishing: Cham, Switzerland, 2016; Volume 9905, pp. 21–37.
21. Lin, H.N.; Shi, Z.W.; Zou, Z.X. Fully Convolutional Network With Task Partitioning for Inshore Ship Detection in Optical Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1665–1669. [CrossRef]
22. Sun, Y.J.; Lei, W.H.; Ren, X.D. Remote sensing image ship target detection method based on visual attention model. In Lidar Imaging Detection and Target Recognition 2017; Lv, D., Lv, Y., Bao, W., Eds.; SPIE: Bellingham, WA, USA, 2017; Volume 10605.
23. Wei, S.H.; Chen, H.M.; Zhu, X.J.; Zhang, H.S. Ship Detection in Remote Sensing Image based on Faster R-CNN with Dilated Convolution. In Proceedings of the 39th Chinese Control Conference, Shenyang, China, 27–29 July 2020; pp. 7148–7153.
24. Li, Q.P.; Mou, L.C.; Liu, Q.J.; Wang, Y.H.; Zhu, X.X. HSF-Net: Multiscale Deep Feature Embedding for Ship Detection in Optical Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2018, 56, 7147–7161. [CrossRef]
25. Zhang, S.M.; Wu, R.Z.; Xu, K.Y.; Wang, J.M.; Sun, W.W. R-CNN-Based Ship Detection from High Resolution Remote Sensing Imagery. Remote Sens. 2019, 11, 631. [CrossRef]
26. Yun, W. Ship Detection Method for Remote Sensing Images via Adversary Strategy. In Proceedings of the 2019 IEEE International Conference on Signal, Information and Data Processing (ICSIDP), Chongqing, China, 11–13 December 2019; p. 4. [CrossRef]
27. Lee, Y.W.; Hwang, J.W.; Lee, S.; Bae, Y.; Park, J. An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019; pp. 752–760.
28. Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; IEEE: New York, NY, USA, 2017; pp. 1800–1807.
29. He, K.M.; Zhang, X.Y.; Ren, S.Q.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [CrossRef]
30. Huang, G.; Liu, Z.; van der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; IEEE: New York, NY, USA, 2017; pp. 2261–2269.
31. Olukanmi, P.O.; Nelwamondo, F.; Marwala, T. k-Means-Lite++: The Combined Advantage of Sampling and Seeding. In Proceedings of the 2019 6th International Conference on Soft Computing & Machine Intelligence, Johannesburg, South Africa, 30 June 2019; pp. 223–227.
32. Tzutalin, D. LabelImg. 2019. Available online: https://github.com/tzutalin/labelImg (accessed on 14 April 2020).
33. Everingham, M.; Van Gool, L.; Williams, C.K.I.; Winn, J.; Zisserman, A. The Pascal Visual Object Classes (VOC) Challenge. Int. J. Comput. Vis. 2010, 88, 303–338. [CrossRef]