<<

sensors

Article A Machine Learning Approach to Argo Data Analysis in a

Yu Jiang *, Yu Gou, Tong Zhang, Kai Wang and Chengquan Hu *

College of Computer Science and Technology, Jilin University, Changchun 130012, China; [email protected] (Y.G.); [email protected] (T.Z.); [email protected] (K.W.) * Correspondence: [email protected] (Y.J.); [email protected] (C.H.); Tel.: +86-431-8516-6269 (Y.J. & C.H.)

Received: 10 August 2017; Accepted: 21 September 2017; Published: 28 September 2017

Abstract: With the rapid development of sensor networks, big marine data arises. To efficiently use these data to predict , we propose a machine learning approach. We firstly focus on analyzing how , , and geographic location features affect the formation of thermocline. Then, an improved model based on entropy value method for the thermocline selection is demonstrated. The experiments adopt BOA Argo data sets and the experimental results show that our novel model can predict thermoclines and related data effectively.

Keywords: thermocline; machine learning; statistical learning; entropy value calculation

1. Introduction The marine environment is important to living and the exploration of the marine environment has become necessary. Thermoclines, an important marine phenomenon, helps a lot in detecting area. The past decades have seen much research on thermoclines, most of which adopt traditional statistical methods to analyze thermoclines. Representative works are as follows. Zhang et al. demonstrated the distribution characteristics of thermoclines and rules of its variety via the mass observation method [1]. They analyzed the ocean by investigating data from 1900 to 2004 of the South China and the sea area around Taiwan, and finally got the distribution of thermoclines in coastal areas in China. Jiang et al. obtained the law of thermoclines’ changes monthly through sampling analysis [2]. Wang also analyzed how the main marine characteristics affected underwater vehicles using the method of mass observation. All methods above commonly used statistical methods when analyzing and processing data. With the augmentation of modern detection methods, an increased magnitude and complexity of hydrologic data available brings higher capacity of predicting. Our demands for a data processing model are not only for mathematical statistics processing but also for prediction. Statistical methods can obtain the developing trend of the future situation at the premise of the existing situation through analyzing data. The correctness of this trend depends on the stability of the environment. Natural disasters and human activities may bring unpredictable impacts on the open marine environment. In this case, methods based on statistics do not have enough robustness [3]. The underwater environment is tough and it is hard for to undertake activities in deep . Relying on technological ways to perceive the sea is the best [4]. In order to better study the thermocline and get access to its properties and application scenarios, we use an underwater sensor network to perceive the marine environment. The sensor network has three functions, which are data acquisition, data processing and data transmission. Underwater sensor networks are composed of stationary or mobile self-organized sensors. At present, the underwater sensor network is applied to environmental monitoring, geographic information acquisition, underwater research, disaster prevention and auxiliary navigation, etc. The first country to carry out underwater

Sensors 2017, 17, 2225; doi:10.3390/s17102225 www.mdpi.com/journal/sensors Sensors 2017, 17, 2225 2 of 16 sensor networks was the United States. In the 1950s, the United States had deployed acoustic Sensors 2017, 17, 2225 2 of 16 monitoring system—SOSUS in the Atlantic and Pacific . Ocean Network Canada and the Universitynetworks of Victoriawas the Uniteddeployed States. the In VENUSthe 1950s, and the Un NEPTUNEited States withhad deployed underwater acoustic sensor monitoring networks to collectsystem—SOSUS and transmit in the data. Atlantic China and Pacific deployed oceans the. Ocean first Network underwater Canada surveillance and the University system of of the SouthVictoria China Seadeployed in May the 2013. VENUS and NEPTUNE with underwater sensor networks to collect and Astransmit the underwater data. China environment deployed the isfirst more underwater complicated surveillance than expected, system currentof the South sensor China networks Sea in may May 2013. not fully meet the needs. Traditional sensor networks are limited to collect temperature, humidity, As the underwater environment is more complicated than expected, sensor networks position, intensity, , biochemical and other scalar data; however, in reality we need more media may not fully meet the needs. Traditional sensor networks are limited to collect temperature, informationhumidity, such position, as video intensity, and audio pres image.sure, biochemical Under such and demands, other scalar a new data; sensor however, network—Multimedia in reality we Sensorneed Networks—appears. more media information MSN such introduces as video video,and audio audio image. and Under image such sensors demands, into traditional a new sensor sensor networks.network—Multimedia The new networks Sensor offer Netw us aorks—appears. better perception MSN ofintroduces the ocean. video, In addition, audio and multiple image sensors platforms’ data fusioninto traditional offers us sensor access networks. to the marineThe new environment networks offer in us all a situations. better perception We will of notthe ocean. only limitIn to sensoraddition, networks. multiple An integrated platforms’ platformdata fusion with offers buoys us access data, to remote the marine sensing environment data, AUV in all data situations. and sensor networkWe will information not only limit helps to mankind sensor networks. to know An the inte ocean,grated protect platform the with ocean buoys and data, make remote use ofsensing the ocean. Thedata, traditional AUV data thermoclineand sensor network selection information rule is “the helps stronger, mankind the better”,to know i.e., the toocean, choose protect a layer the by its ocean and make use of the ocean. definition concerning temperature, depth and strength. In this article, we propose a new thermocline The traditional thermocline selection rule is “the stronger, the better”, i.e., to choose a layer by selection method. On the basis of the traditional method, we add more characteristics including the its definition concerning temperature, depth and strength. In this article, we propose a new informationthermocline on geography, selection method. salinity, On year the and basis month, of the andtraditional then get method, a new we sample add more criteria-score. characteristics In order to ensureincluding thermocline-defining the information on consistency geography, salinity, between year the and traditional month, and selection then get method a new and sample the new method,criteria-score. we set up In a order one-to-one to ensure mapping thermocline-defining relationship from consistency strength between to score. the Withtraditional mapping selection strength beingmethod critical and value the tonew score method, criteria, we set we up could a one-to-one analyze mapping thermoclines relationship qualitatively from strength and quantitatively. to score. The applicabilityWith mapping of strength this method being has critical been value proved to byscore the criteria, regional we (79.5 could◦ S, analyze 79.5◦ N, thermoclines 180◦ W–80 ◦ E) experimentalqualitatively results. and quantitatively. The applicability of this method has been proved by the regional Besides,(79.5° S, 79.5° we propose N, 180° W–80° a process E) experimental to tackle marine results. data-related problems. From Figure1, we can see Besides, we propose a process to tackle marine data-related problems. From Figure 1, we can that the first step of the research is to define the exact type of the problem, which can be generally see that the first step of the research is to define the exact type of the problem, which can be generally divided into regression, classification and clustering. After that, we should preprocess the data into divided into regression, classification and clustering. After that, we should preprocess the data into properproper forms forms for analysis.for analysis. According According to to the the characteristicscharacteristics of of the the data, data, we we then then select select an algorithm, an algorithm, whichwhich is used is used for datafor data visualizations. visualizations. The The visualized visualized results then then generate generate feedback feedback to the to algorithm the algorithm model,model, which which improves improves the the model. model.

FigureFigure 1. 1.Marine Marine datadata analytical process. process.

The rest of this paper is organized as follows. In Section 2, we perform a brief review of existing Theresearch rest works. of this In paper Section is 3, organized we introduce as follows.the implementation In Section steps2, we of the perform methods a briefin detail. review In of existing research works. In Section3, we introduce the implementation steps of the methods in detail.

Sensors 2017, 17, 2225 3 of 16

In Section4, experimental evaluation and results are given. Finally, we conclude this work and point out an employment prospect.

2. Related Work In this section, we demonstrate the forming reasons and classification of the thermocline. Through the research development at home and abroad, we here introduce the significance of thermoclines.

2.1. Thermocline A thermocline is a thin but distinct layer in a large body of fluid (e.g., water, such as an ocean or ; air, such as an atmosphere) in which temperature changes more rapidly with depth than it does in the layers above or below. In the ocean, the thermocline divides the upper from the calm deep water below [5]. Salinity is almost stable in the open ocean and pressure only has a slight effect on , so temperature has become one of the most important factors in density. The temperature of the sea surface is higher than that of the deep ocean; accordingly the sea surface has smaller density compared with deep cold water. Temperature and density in thermoclines change quickly which makes thermoclines an important interface to sea circulation [6]. The reasons for the formation of thermoclines are very complex. In general, temperature is the main factor in water delamination, followed by salinity. For most sea areas, the main power of water delamination comes from intrusion of outland high-density water. Water layering and the change of layers’ thickness are closely related to the size and the direction of wind, along with the rainfall strength of the surface layer and low salt water affected by runoff. TIW (Tropical Instability Wave), EUC (Equatorial Under Current) together with ocean currents and atmosphere make come into layers and then form thermoclines. The upper bound of thermoclines mainly mixes inertia gravity wave, Kelvin wave and tropical unstable wave. The sheer of TIW and EUC leads to fusion events caused by the lower Richardson number. Lower space appears unstable, which may be ascribed to the absorbing and saturating of waves’ energy. Solar radiation can only maintain the temperature of upper seawater. With the increase of depth, the differences in temperature are getting bigger and bigger. In a certain depth of water, a great temperature gradient is formed, which is called a thermocline. Thus, the cause of thermoclines is very complicated [7–9].

2.2. Thermocline Classification Thermocline can be divided into two categories. (1) Seasonal thermoclines are mainly distributed in the mid-latitude area, and most of this kind form and develop in and summer. In general, the depth is between 50 and 100 m. The depth and intensity have obvious seasonal changes. In low-latitude areas, the seasonal temperature difference is not big enough to form a layer, so the main thermocline is relatively close to the ocean surface. The depth of the upper bound is roughly between 100 m and 150 m. (2) Main thermoclines (permanent thermoclines) are an important part of the ocean thermal structure, and its strength changes with latitude and longitude. In latitude, the main thermocline near the equator has stronger strength and thinner thickness. The upper bound tends to be shallower. With latitude increased, the main thermocline in the mid-latitude becomes weaker, and the depth of the upper bound gets deeper. The thickness of the thermocline gets a slight increase. At higher latitudes, the main thermocline gets shallow again, and the thickness decreases while the intensity increases. Finally, in the very front area, the thermocline appears in the ocean surface. Along with meridians, the thermoclines around the equator gradually become shallow from west to east. At the same time, the thermocline can be divided into positive thermocline, reverse thermocline, thermocline and mixed thermocline according to its form. Extracting thermocline kinds and forms from marine data is the foundation for all subsequent studies [10]. Figures2–5 show the four thermocline forms mentioned above respectively. Sensors 2017, 17, 2225 4 of 16 Sensors 2017, 17, 2225 4 of 16 Sensors 2017,, 17,, 22252225 4 of 16

Figure 2. Thermocline. FigureFigure 2.2. Thermocline.Thermocline.Thermocline.

Figure 3. Inverse Thermocline. FigureFigure 3.3. InverseInverseInverse Thermocline.Thermocline.Thermocline.

Figure 4. Multi-Thermocline. FigureFigure 4.4. Multi-Thermocline.Multi-Thermocline.Multi-Thermocline.

Figure 5. MixMix Thermocline.Thermocline. FigureFigure 5.5. MixMixMix Thermocline.Thermocline.Thermocline. Figures 4 and 5 also point out that ocean temperature changes with depth in a complicated way, Figures 4 and 5 also point out that ocean temperature changes with depth in a complicated way, whichFigures makes4 anda temperature-changing5 also point out that oceanprediction temperature model a changes tougher with task. depth The inresult a complicated analyzed in which makes a temperature-changing prediction model a tougher task. The result analyzed in way,traditional which statistical makes a temperature-changingmethods is actually a predictiontrend representing model a tougherthe correlation task. The between result values. analyzed In traditionaltraditional statisticalstatistical methodsmethods isis actuallyactually aa trentrend representing the correlation between values. In incontrast, traditional machine-learning statistical methods method iss usually actually get a trendthe inner representing correlation the among correlation water features, between which values. is contrast, machine-learning methods usually get the inner correlation among water features, which is Inobviously contrast, closer machine-learning dependency. methods Hydrological usually getcharacteristics the inner correlation may become among abnormal water features, under which certain is obviously closer dependency. Hydrological characteristics may become abnormal under certain obviouslyconditions, closer which dependency. will lead to a Hydrologicaldeviation from characteristics previous predicting may become trends and abnormal thus form under a relatively certain conditions, which will lead to a deviation from previous predicting trends and thus form a relatively conditions,big error. Machine which will learning lead to amethods deviation seem from more previous suit predictingable for this trends situation. and thus In forma machine a relatively learning big big error. Machine learning methods seem more suitable for this situation. In a machine learning method, we can get the interior correlation, and values may change while correlation is stable. method, we can get the interior correlation, and values may change while correlation is stable.

Sensors 2017, 17, 2225 5 of 16 error. Machine learning methods seem more suitable for this situation. In a machine learning method, we can get the interior correlation, and values may change while correlation is stable. According to the correlation, predicted values change with characteristic values and then get a prediction model closer to the reality.

2.3. Significance of Thermocline Research Thermocline research is of great significance not only in the academic field but also the production, living and military field. In the military field, for example, waves in seawater are affected by the longitudinal and vertical section in a large degree, so it is vital for warfare and anti-submarine warfare to figure out how thermoclines are influenced by temperature and density, and how thermoclines influences sound propagation. Thermoclines are also known as the liquid seabed. Around the bottom of the sea exists an invisible layer, which makes the sound waves’ spread reflect. The sound waves ripping into the liquid seabed can also produce refraction. Therefore, in the bottom of the liquid seabed, a hidden submarine can avoid enemies’ effectively. Only by figuring out the changing rules of the sea temperature can we take advantage of this factor better in the future.

2.4. Thermocline Research Actuality In the past, researchers often used sensor nodes and buoys to collect information of hydrological characteristics, such as temperature, density, pressure and underwater. The information at first was transferred to the ground station through cable and then sent to the server. In recent years, researchers started to collect data using AUVs, which can collect data dynamically. There are several famous thermocline-researching institutions, such as MIT Research institutions, the Woods Hole Oceanographic Institute (http://www.whoi.edu/), the Monterey Bay Aquarium Research Institute (http://www.mbari.org/), Russia’s Far East Ocean Research Institute and the University of Porto. The University of Porto used simplified models of thermocline with the vertical gradient method. The vertical gradient method is a common observation analysis method for thermoclines. Research institutions such as MIT vertically analyzed the thermocline data (https://auvlab.mit.edu/) collected with the vertical gradient method. Then, they integrated time and space, and put forward the “Adaptive Environmental Assessment” (Autonomous the Adaptive Environmental Assessment, AAEA) algorithm, in the case of updating AUVs’ navigation information in an unsupervised judgment area [11]. The use of Argo data for marine research has become a hot spot internationally. The heat budget in the upper ocean (0–300 m) of the North Atlantic from 20–60◦ N is studied using data from Argo profiling floats from 1999 to 2005 and the NCEP/NCAR and NOC surface flux datasets [12]. It is investigated that the Argo profiling float dataset can be utilized to estimate the temperature and heat storage of the upper ocean in the North Atlantic [13]. In situ observations from the autonomous Argo float array are used to assess the heat content change of the basin-averaged ocean driven by tropical cyclones (TCs) in the North Pacific for 2000–2008 [14]. The means, variances, and three-dimensional spatial covariances of the ocean temperature and salinity anomalies in the upper 1400 m layer have been estimated using data of the Argo profiling floats from 2005 to 2007 [15]. It is demonstrated how to use unsupervised classification of the Argo temperature profiles to achieve coherent thermal patterns [16]. In recent years, Chinese researchers from Harbin Engineering University, Shanghai Jiao Tong University, Shenyang Institute of Automation, Chinese Academy of Sciences and the Chinese Academy of Sciences Institute of Marine and so on mainly aimed at thermocline characteristics of Chinese sea areas. Professor Ge used the step function approximation method to calculate the shelf sea thermocline characteristics—step function, research standard thermocline and inverse thermocline—which can fit the temperature curve perfectly [10]. Zhang from the Navy Marine Hydro-Meteorological Center compared the non-equidistant differential method, the vertical gradient method and a 7.2x smoothing algorithm in ocean thermocline eigenvalue analysis and calculation [17]. He aimed at excluding and merging multi-thermoclines as long as thick ones in complicated . Zhang from the Dalian Naval Academy expounded the vertical gradient method and optimal segmentation. He compared and Sensors 2017, 17, 2225 6 of 16 analyzed the performance of these two approaches when determining a thermocline boundary [18]. Thermocline changes from season to season. From winter to fall, these are non-leap period, growth period, strong period and fade period. This means time will also be an important factor in our study. Existing research works are basically according to quarter or month. Given that our research mainly depends on thermoclines, we should try to refine time to form the most accurate thermocline-changing map. At the same time, due to the large number of data, data analysis and processing become the first step in our study. Common methods to interpolate data are as follows: inverse distance weighted interpolation, the Kriging interpolation method, the minimum curvature method, the multiple regression method, the natural adjacent points interpolation method, the nearest neighbor interpolation method, local polynomial method and so on. In this paper, we use the k nearest neighbor interpolation method to deal with the data. After that we will classify the collected data through the machine learning method. After analyzing data, we need to give each group of data a tag, which represents whether the point is in the thermocline. Then we obtain thermocline distribution by calculation and statistical analysis. Further research in the study area aims at setting correlation among time, temperature and depth and acquiring thermocline distribution. With all the information obtained above, we can finally draw a 3D map of temperature, depth and intensity. With the research done thoroughly, the value of the thermocline will become larger than we expect.

3. Methodology

3.1. Determine the Thermocline To study the thermocline, we need to judge whether the thermocline exists in the area. Then we will determine the upper bound and lower bound of the thermocline, and find out the exact location of the thermocline. There are many calculating methods to determine upper and lower bounds for the thermocline. The common ones are the S-T method, vertical gradient method and maximum curvature point method [19]. The so-called depth of the upper bound of the thermocline is the depth from surface to the thermocline intensity threshold. We use the maximum curvature point method and vertical gradient method together to calculate the upper thermocline bound. Here, we briefly explain the concept of thermocline strength, which refers to the quotient of temperature difference divided by depth difference. The current thermocline strength evaluation standard follows the Standard of Marine Survey, in which shallow water depth (less than 200 m) intensity threshold is 0.2 ◦C/m and deep depth (more than 200 m) intensity threshold is 0.05 ◦C/m [20]. As the temperature of the study area changes little after 200~300 m, called a static area, we use 0.2 ◦C/m as the critical value of judging a thermocline for unified calculation so that we can filter out a thermocline of better quality. We can know the maximum regional strength through the vertical gradient method and the maximum curvature point method. If the maximum value cannot reach the critical value, no thermocline exists and the position can be excluded from consideration, which can reduce the sample space to some extent. It is not advisable to determine the upper bound by a single method, and only after being integrated with multiple methods can the best result be found. We here present how to determine a thermocline. First of all, we need to input the matrix of ocean temperature and depth. Combining the vertical gradient method with the maximum curvature point method, we use this matrix to calculate each point’s temperature strength according to the definition. With the calculated strength, we then screen out those points meeting the standard of thermocline (the temperature is more than 0.2 ◦C/m). If the point is in the thermocline, we write the data into a specific file or the data is discarded. After determining the thermocline, we tag all samples. For the convenience of the following work, we will introduce all the characteristics used in this paper and all the samples having these characteristics. Table1 is a table of characteristics and their brief descriptions. Sensors 2017, 17, 2225 7 of 16

Table 1. Characteristics and their descriptions.

Characteristics Brief Descriptions Latitude/Longitude Where the data were collected Year/Month When the data were collected Depth Distance from sea surface to the point Temperature Temperature of the point Strength Temperature difference/depth difference Salinity Hydrological characteristics

3.2. Merge the Thermocline There usually exist 3–5 thermoclines at the same position vertically. These thermoclines need to be merged. It is a fact that thin thermoclines cannot meet the needs of underwater vehicles or ’s diving movement range solely, and a bigger thickness means a higher temperature span. It is good to see more temperature changes, as it is momentous for our research. Note that instruments and some hydrological characteristics will introduce disturbances to data we use. The fluctuations can produce many thinner thermoclines [17]. Their thickness is commonly between 1–2 m. Such thermoclines have all the features of a thermocline such as large span, small thickness, big strength and so on. Although the features are obvious, such layers cannot be viewed as real thermoclines. In order to eliminate fluctuations in our experiments, we decided to interpolate data vertically to a 1 m-interval data, and filter out the thermoclines whose thickness are less than 5 m. The process of merging thermoclines is as follows. After determining a thermocline, we get several untreated thermoclines, and we need to separate them according to their thicknesses. Those of thickness greater than 5 m are needed. If the nearby thermoclines have an interval between current lower bound and next upper bound more than 5 m, the current thermocline is set as an independent layer; if not they are merged with the adjacent thermoclines. After this, the thermoclines produced by fluctuations are filtered out.

3.3. Choose the Thermocline In traditional ocean field, thermocline-related problems are always computed with statistical knowledge. In previous studies, thermocline selection follows the principle of “the stronger, the better”, i.e., the thermocline with the maximum intensity value is optimal at a single vertical gradient [17]. If the intensity of two adjacent thermoclines is close and thickness difference is bigger, the larger thickness one is selected. If the intensity and thicknesses are all the same, the one whose strength is slightly larger is chosen. It is not hard to see that traditional statistical methods sort several thermocline characteristics by importance. However, this approach is only a rough one. When considering strength, other hydrological characteristics may be neglected. Hence, in this paper we are trying to use a machine learning method—“entropy value method”—to analyze and compare several characteristics, and sort them by their importance. Thermocline has a lot of characteristics such as strength, depth, salinity, etc. However, they are not equally important towards thermocline. Our purpose is to comprehensively consider these factors in each sample to get its quality. The advantage is that we can comprehensively evaluate each sample and each attribute, reduce the experimental error caused by statistical error and select a better thermocline. While using “entropy value method” to calculate, the following concepts are needed. Here is a brief introduction. Information entropy: an indicator to measure sample collection’s purity. Assume that the proportion of the first k classes in the data set D sample is pk (k = 1, 2, 3, . . . , |y|)

|y| pk Ent(D) = ∑ pk log2 k=1 Sensors 2017, 17, 2225 8 of 16

The more uncertain of the variables, the greater the entropy values are. Information gain: The bigger the information gain, the greater the purity gained by using attributes “a” to divide. The steps of “entropy value method” are shown in Algorithm 1.

Algorithm 1 Entropy Value Method

STEP 1. Select n samples attributes, then xij shall be the value of the i-th sample’s the j-th attribute (i = 1, 2, ... , n; j = 1, 2, ... , m). STEP 2. Normalize the index and make the homogeneity data a homogeneity. Make

xij = xij n o xij − min xij, ··· , xnj xij = n o n o max x1j, jxnj − min xij, ··· , xnj

then xij shall be the value of the i-th sample’s j-th attribute (i = 1, 2, ..., n; j = 1, 2, ... , m). STEP 3. Calculate the proportion of the i-th sample in j-th attribute.

xij pij = n (i = 1, . . . , n; j = 1, . . . , m) ∑i=1 xij

STEP 4. Calculate the entropy value of the j-th attribute.

n ej = −k∑ pij ln(pij) i=1 1 k = > 0, e ≥ 0 ln(n) j STEP 5. Calculate the information gain.

dj = 1 − ej STEP 6. Calculate the of each index.

dj wj = m ∑j=1 dj STEP 7. Calculate the comprehensive score of each sample.

m si = ∑ wj pij j=1 where, s is the comprehensive score of each sample when it comes to form a thermocline. w is the important degree of each attribute for the formation of thermocline.

Compared with the traditional principle “the stronger, the better”, we propose a new method to evaluate thermoclines. Each point that consists in a thermocline has its own composite score “s”. We add all these “s” values and then calculate the average for each thermocline. Compared with the traditional principle “the stronger, the better”, the machine learning method of “entropy value method” combines all the attribute values. Apparently, such method is more comprehensive, and more accurate. The “entropy value method” can remedy the defects caused by only considering strength without other factors, and make the result more objective. Sensors 2017, 17, 2225 9 of 16 moreSensors accurate.2017, 17, 2225 The “entropy value method” can remedy the defects caused by only considering9 of 16 strength without other factors, and make the result more objective.

3.4.3.4. Use Use Machine Machine Leaning Leaning Method Method to to Predict Predict CollectingCollecting historical historical data data is isto to summarize summarize rules rules and and get get the the predictions predictions for for future future situations. situations. TheseThese goals goals need need us to useuse machinemachine learning learning methods. methods. Machine Machine learning learning is anis an inevitable inevitable outcome outcome with withthe developmentthe development of artificial of artificial intelligence. intelligence. As the As core the of core artificial of artificial intelligence, intelligence, constructing constructing machine machinelearning learning frameworks frameworks has the followinghas the following steps. (1) steps. Divide (1) data Divide into data training into set training and testing set and set. testing (2) Use set.the (2) training Use the set training and its set characteristic and its characteristic vectors to ve trainctors algorithm. to train algorithm. (3) Use the(3) testingUse the set testing to evaluate set to evaluatealgorithm algorithm learnt earlier learnt [ 4earlier]. This [4]. paper This involves paper involves the following the following terms of terms machine of machine learning. learning. DataData set set: :A A batch batch of datadata collectedcollected about about the the object. object. In In this this paper, paper, the the data data sets sets are fromare from the global the globalocean ocean Argo gridArgo data grid set data (BOA set Argo),(BOA recordingArgo), recording the temperature the temperature from 2012~2015 from 2012~2015 about the about the hydrologyfactors’ changing factors’ suchchanging as temperature such as temperature and salinity. and salinity. SampleSample: :A A record record in in the the data data set. set. A A sample sample is is a description ofof anan eventevent oror an an object, object, also also known known as asa characteristica characteristic vector, vector, such such as (latitude as (latitude = 16.5 ◦=, longitude16.5°, longitude = 115.5◦ ,= year 115.5°, = 2015, year month = 2015, = 7, depthmonth = = 97 7, m, depthtemperature = 97 m, =temperature 29.334381 ◦ =C, 29.334381 temperature °C, temper strengthature = 0.295814, strength salinity = 0.295814, = 33.688723). salinity = 33.688723). CharacteristicsCharacteristics: Reflect: Reflect event event or or object’s object’s performance performance and and character, character, such such as as “depth”, “depth”, “strength” “strength” “temperature”,“temperature”, etc. etc. AttributeAttribute value value: Values: Values of of attributes. attributes. SampleSample space space: :The The space space consists consists of of attributes. attributes. Fo Forr example, example, set set “strength”, “strength”, “depth” “depth” and and “temperature”“temperature” as threethree axes axes in in Figure Figure6, and 6, thenand theythen form they a form three-dimensional a three-dimensional state space state describing space describingunderwater underwater thermocline. thermocline.

FigureFigure 6. 6. Three-dimensionalThree-dimensional state state space space de describingscribing underwater underwater thermocline. thermocline.

Every sample in thermocline can find its corresponding coordinates in the sample space, which Every sample in thermocline can find its corresponding coordinates in the sample space, which means that each point in the sample space corresponds to a sample. means that each point in the sample space corresponds to a sample. Training set: Data set used to train model. Training set: Data set used to train model. Testing set: Data set used to test learning model’s performance. Testing set: Data set used to test learning model’s performance. According to the features of the predictive content, machine learning tasks can be divided into According to the features of the predictive content, machine learning tasks can be divided into the following categories. (1) Classification: predict the discrete value. (2) Regression: predict the the following categories. (1) Classification: predict the discrete value. (2) Regression: predict the continuous value. The main task of this paper is to classify the data and separate the thermocline continuous value. The main task of this paper is to classify the data and separate the thermocline area area from other no jump areas. from other no jump areas. The Argo data are unmarked and our task is to mark these data according to whether they are The Argo data are unmarked and our task is to mark these data according to whether they are in the thermocline. The water data collected are the basis of our rule summary work and future in the thermocline. The water data collected are the basis of our rule summary work and future predictions. That is to say, we are going to use this large data set for both training and testing at the predictions. That is to say, we are going to use this large data set for both training and testing at the same time. Because the sample’s data structure is relatively simple in our research, we will divide same time. Because the sample’s data structure is relatively simple in our research, we will divide the the data set into training set S and testing set T through “set aside method”. At first, we separate data set into training set S and testing set T through “set aside method”. At first, we separate these

Sensors 2017, 17, 2225 10 of 16

Sensorsthese 2017data, 17 into, 2225 two parts according to whether their strength reaches the critical temperature10 of 16 strength value. In addition, then we take 3/5 as the training set, 2/5 as a testing set from positive cases and negative cases respectively. This ensures the consistency of positive and negative cases’ datadistribution into two [4]. parts according to whether their strength reaches the critical temperature strength value. In addition,When forming then we the take state 3/5 space, as the trainingwe obtain set, the 2/5 rule as amodel testing containing set from positivelocal changing cases and rules negative at the casessame respectively.time. Where Thisthere ensures is a new the data consistency set, there ofis positivean instant and result negative of the cases’ thermoclines’ distribution distribution. [4]. This Whenwill prevent forming us thefrom state mean space,ingless we and obtain time-consuming the rule model work. containing local changing rules at the same time. Where there is a new data set, there is an instant result of the thermoclines’ distribution. This3.5. Map will Thermocline prevent us fromStrength meaningless to Score and time-consuming work.

3.5. MapFrom Thermocline the definition Strength of a tothermocline, Score we use temperature and depth to calculate each vertical position’s strength. Through the entropy value method, we get each sample’s scores comprehensivelyFrom the definition considering of a thermocline,all the relative we factor use . As mentioned and depthbefore, to the calculate score contains each vertical more position’sinformation strength. than strength. Through At the first, entropy we need value to method, make sure we that get eachthe two sample’s methods scores can comprehensively select the same consideringsamples, namely all the the relative thermocline. factors. AsThen mentioned we will before,map the the strength score contains to score. more The information thermocline than is strength.quantitatively At first, analyzed we need on to makethe premise sure that of thesame two select methods criteria. can select The thethermocline same samples, with namelythe higher the thermocline.score will be chosen. Then we will map the strength to score. The thermocline is quantitatively analyzed on the premise of same select criteria. The thermocline with the higher score will be chosen. 4. Experimental Evaluation 4. Experimental Evaluation

4.1. Data Sets The datadata usedused in in this this paper paper is fromis from the the 2012~2015 2012~2015 global global Marine Marine Argo Argo grid datagrid setdata (BOA set Argo).(BOA TheArgo). chosen The chosen data is data from is the from China the ArgoChina real-time Argo real-time data center data (http://www.argo.org.cn/center (http://www.argo.org.cn/)) from Januaryfrom January 2004 to 2004 December to December 2015 global 2015 ocean global (79.5 oc◦eanS–79.5 (79.5°◦ N, S–79.5° 180◦ W–180 N, 180°◦ E) Argo’sW–180° temperature E) Argo’s andtemperature salinity. Theseand salinity. data sets These have data passed sets profile have pa datassed real-time profile anddata part-timereal-time delayand part-time quality control delay  inquality all countries. control in The all data countries. is recorded The data monthly, is recorded and horizontal monthly, resolution and horizontal is 1◦ × 1resolution◦. From sea is level11× to. 2000From m sea beneath level to water, 2000 m the beneath sea is divided water, the into sea 26 is layers divided vertically. into 26 Thislayers production vertically. isThis from production the State Oceanicis from the Administration State Oceanic of Administration Marine Environment of Ma Keyrine Laboratory Environment ofInformation Key Laboratory Security of Information Technology andSecurity National Technology Marine and Information National Center.Marine OnInformatio this basis,n Center. we reprocess On this thebasis, data we using reprocess the k-nearest the data interpolationusing the k-nearest method interpolation vertically. method As a result, vertically. we make As a result, the data we setmake a 1 the m-interval data set a data 1 m-interval product. Ourdata studiesproduct. focus Our on studies the thermocline; focus on the however, thermocl a largeine; however, number of a visualizationlarge number experiments of visualization have shownexperiments that seawater have shown beneath that 500 seawater m entered beneath a static 500 state m entered and no longera static producestate and thermoclines. no longer produce Hence, thethermoclines. area under Hence, 500 m the is meaningless area under to500 our m study.is mean Wheningless reprocessing to our study. data, When we dividereprocessing the sea data, into 500we layersdivide withthe sea a 1 m-intervalinto 500 layers from seawith level a 1 tom-interval 500 m underwater. from sea Ourlevel present to 500 studym underwater. is in its infancy, Our andpresent we willstudy deal is in with its theinfancy, data toand in we the will further deal study, with whichthe data will to bein releasedthe further on study, the website. which will be releasedFigure on7 the describes website. the distribution map of 100,000 samples selected in BOA Argo. Table2 is the resultFigure of 7 the describes accuracy the analysis distribution of 100,000 map of samples 100,000 selected samples in selected BOA Argo. in BOA It is Argo. demonstrated Table 2 is that the theresult accuracy of the accuracy increases analysis with the of increase 100,000 of samples the training selected set samples,in BOA Argo. which It isis indemonstrated accordance withthat the characteristicsaccuracy increases of the with accuracy the increase and training of the set traini in theng machine set samples, learning. which is in accordance with the characteristics of the accuracy and training set in the machine learning.

25˚N

20˚N

15˚N Ocean Data View Ocean 110˚E 115˚E 120˚E Figure 7. The samples distribution map.

Sensors 2017, 17, 2225 11 of 16

Table 2. Accuracy vs size of training data. Sensors 2017, 17, 2225 11 of 16 Train Set Test Set Accuracy Table 2. Accuracy vs size of training data. 5000 95,000 0.99245 10,000Train Set 90,000Test Set Accuracy 0.99486 15,0005000 85,00095,000 0.99245 0.99570 20,00010,000 80,00090,000 0.99486 0.99623 25,00015,000 75,00085,000 0.99570 0.99671 30,00020,000 70,00080,000 0.99623 0.99718 40,00025,000 60,00075,000 0.99671 0.99741 30,000 70,000 0.99718 50,000 50,000 0.99763 40,000 60,000 0.99741 70,000 30,000 0.99801 50,000 50,000 0.99763 70,000 30,000 0.99801 4.2. Determine the Thermocline 4.2. Determine the Thermocline We start from the definition of temperature strength to determine the thermocline. Make the We start from the definition of temperature strength to determine the thermocline. Make the temperature t, depth d, layer number n, and temperature strength of l. temperature t, depth d, layer number n, and temperature strength f l. t − t ntt++−1 − n−1 l =l = nn11 n = 2,3,2, 3, 4, 4,··· dn+1 − dn−1 ddnn+−11 There exists no thermocline in the surface, and the ignorance of the first layer will not influence There exists no thermocline in the surface, and the ignorance of the first layer will not influence our test. From the previous thermocline in figures (Figures2–5), we can observe that there may exist our test. From the previous thermocline in figures (Figures 2–5), we can observe that there may thermoclineexist thermocline and reverse and reverse thermocline thermocline at the at same the same time, time, which which means means temperature temperature strength strength may may be positivebe positive or negative. or negative. To simplify To simplify calculation, calculation, here here we we regulate regulate ll=>||0 l =|l|> 0 According to the judging standard of temperature strength, we filter the “l > 0.2” samples and According to the judging standard of temperature strength, we filter the “l > 0.2” samples and write them into data.txt file. The file will be used when merging and selecting thermoclines. write them into data.txt file. The file will be used when merging and selecting thermoclines. Figure 8 shows the correlation among temperature, depth and month. Red represents higher Figuretemperature8 shows while the correlation stands for amonglower temperatur temperature,e. From depth Figure and 9 we month. can find Red that represents temperature higher temperatureis changing while with blue depth. stands From for surface lower to temperature.100 m underwater, From the Figure temperature9 we can stays find stable. that temperatureBetween is changing100 m and with 300 depth. m, temperature From surface falls sharply. to 100 Within m underwater, the depth of the 200 temperature m or so there stays are more stable. than Between 20 100 m°C and temperature 300 m, temperature difference. fallsAfter sharply. 300 m in Within depth, the the depth water of gets 200 minto or a so stable there state are more again than and 20 ◦C temperaturetemperature difference. changes Afterlittle. Points 300 m in in Figure depth, 10 the illu waterstrate the gets correlation into a stable between state depth again and and strength. temperature changesDiamond little. is Points on behalf in Figure of a certain 10 illustrate depth’s the thermocl correlationine strength. between x = depth 0.2 in andthe black strength. line represents Diamond is on behalfthe of thermocline a certain depth’s intensity thermocline threshold. The strength. points on x the = 0.2 left in side the of black the line line are represents not in the thermocline the thermocline area, while those points on the right side are in the interior of a thermocline. Similar to Figure 10’s intensity threshold. The points on the left side of the line are not in the thermocline area, while those observation, points satisfied with the critical strength are more likely at 100 m to 300 m underwater. pointsFigure on the 11 rightis the side attribute are in space the interiormade up of of a thermocline.temperature, strength Similar toand Figure depth. 10 Each’s observation, hollow circle points satisfiedrepresents with thea sample. critical strength are more likely at 100 m to 300 m underwater. Figure 11 is the attribute space made up of temperature, strength and depth. Each hollow circle represents a sample.

Figure 8. Three-dimensional Correlation among temperature, depth and time. Figure 8. Three-dimensional Correlation among temperature, depth and time.

Sensors 2017, 17, 2225 12 of 16 Sensors 2017, 17, 2225 12 of 16 SensorsSensors 2017 2017, ,17 17, ,2225 2225 1212 of of 16 16

Figure 9. Correlation between depth and temperature. FigureFigure 9.9. Correlation Correlation betweenbetween depth depth and and temperature. temperature.

Figure 10. Correlation between depth and strength. FigureFigure 10.10. Correlation Correlation between between depth depth and and strength. strength.

0.7 0.70.7 0.6 0.60.6 0.5 0.50.5 0.4 0.40.4 0.3 0.30.3 0.2

0.2strngth 0.2 strngth strngth 0.1 0.10.1 0 00 -0.1 -0.1-0.1 -0.2 -0.2-0.2500 500500 400 30 400400 300 25 3030 300300 25 200 20 25 20 200200 15 20 100 15 100 10 15 depth 100 10 depth 0 5 10 temperature depth 00 5 temperaturetemperature 5 Figure 11. Attribute space made up of temperature, strength and depth. FigureFigure 11.11. Attribute Attribute space space mademade upup ofof temperature,tetemperature,mperature, strengthstrength and and depth.depth. depth.

Sensors 2017, 17, 2225 13 of 16

4.3. ChooseSensors 2017 the, Thermocline17, 2225 13 of 16

4.3.Merge Choose the the samples, Thermocline whose depth differences are within 5 m, to one thermocline and get a certain number of thermoclines. Then we will take the “entropy value method” to choose among these Merge the samples, whose depth differences are within 5 m, to one thermocline and get a mergedcertain thermoclines. number of thermoclines. Then we will take the “entropy value method” to choose among theseWe use merged MATLAB_R2015b thermoclines. as the experiment platform and finally get the result. TableWe3 showsuse MATLAB_R2015b the weights changing as the experiment of each platform attribute and when finally forming get the theresult. thermocline in the experimentalTable area3 shows monthly. the weights Figure changing12 represents of each weights attributes of when each forming point from the thermocline sea surface in to the 500 m underwater.experimental As area we mentionedmonthly. Figure earlier 12 represents in this paper, weights temperature s of each point strength from issea the surface most to important 500 m factorunderwater. when forming As we a thermocline.mentioned earlier Temperature in this pape gradientr, temperature is consisted strength of is temperature the most important and depth. The gradientfactor when isinfluenced forming a thermocline. more by temperature Temperature than gradient depth. is As consisted another of hydrological temperature characteristic,and depth. salinityThe also gradient affects is theinfluenced formation more of by the temperature thermocline than slightly. depth. Although As another from hydrological the previous characteristic, experiments we cansalinity conclude also thataffects the the thermoclines formation areof distributedthe thermocline in the slightly. region ofAlthough 100–300 from m in depth,the previous Figure 12 experiments we can conclude that the thermoclines are distributed in the region of 100–300 m in shows good-quality samples distributed between 100 m and 150 m in depth. It is worth noting that depth, Figure 12 shows good-quality samples distributed between 100 m and 150 m in depth. It is the longitude and latitude occupy a considerable proportion in the formation of the thermocline. worth noting that the longitude and latitude occupy a considerable proportion in the formation of Thisthe conclusion thermocline. perfectly This conclusion proved that perfectly the “the proved stronger, that the the “the better” stronger, principle the better” is not thatprinciple rigorous is not and machinethat rigorous learning and can machine make an learning improvement. can make an improvement.

TableTable 3. Weights 3. Weights of of each each attribute attribute when when formingforming thermocline thermocline monthly. monthly.

w Latitude LongitudeDepth Temperature Strength Salinity w Latitude Longitude Depth Temperature Strength Salinity January 0.1266 0.1266 0.0752 0.1737 0.4867 0.0112 JanuaryFebruary 0.12660.1360 0.1266 0.1360 0. 0.07520808 0.1810 0.1737 0.48670.4554 0.0108 0.0112 February 0.1360 0.1360 0.0808 0.1810 0.4554 0.0108 March 0.1391 0.1391 0.0826 0.1827 0.4454 0.0112 March 0.1391 0.1391 0.0826 0.1827 0.4454 0.0112 April 0.1388 0.1388 0.0825 0.1867 0.4388 0.0144 April 0.1388 0.1388 0.0825 0.1867 0.4388 0.0144 MayMay 0.13610.1361 0.13610.1361 0.0809 0.1885 0.1885 0.4365 0.0218 0.0218 JuneJune 0.13490.1349 0.13490.1349 0.0802 0.1866 0.1866 0.4399 0.0235 0.0235 JulyJuly 0.13500.1350 0.13500.1350 0.0802 0.1905 0.1905 0.4500 0.0094 0.0094 AugustAugust 0.13380.1338 0.1338 0.1338 0.0795 0.1926 0.1926 0.4490 0.0112 0.0112 SeptemberSeptember 0.13390.1339 0.1339 0.1339 0.0796 0.1910 0.1910 0.4534 0.0082 0.0082 OctoberOctober 0.13390.1339 0.1339 0.1339 0.0796 0.1898 0.1898 0.4519 0.0110 0.0110 NovemberNovember 0.13370.1337 0.1337 0.1337 0.0795 0.1909 0.1909 0.4553 0.0069 0.0069 DecemberDecember 0.13420.1342 0.1342 0.1342 0.0797 0.1908 0.1908 0.4544 0.0067 0.0067

FigureFigure 12. 12.Weights Weights s ofs of each each point point from from seasea surface to to 500 500 m m underwater. underwater.

4.4. Mapping from Strength to Score 4.4. Mapping from Strength to Score Initially, it is necessary to ensure whether the two methods can select the same samples, namelyInitially, thermocline. it is necessary We to use ensure Matlab_R2015b whether the as two the methodsexperimental can selectplatform the sameand finally samples, got namelythe thermocline.results. We use Matlab_R2015b as the experimental platform and finally got the results.

Sensors 2017, 17, 2225 14 of 16 Sensors 2017, 17, 2225 14 of 16

Sensors 2017, 17, 2225 14 of 16 FromFrom Figure Figure 13, it13 canabove, be foundit can be that found the mapthat the relation map relation between between strength strength and scoreand score is fitted is as: y = 0.00205fittedFrom as:x + yxFigure0.000151.=+0.0025 13 above, 0.000151 it can be. found that the map relation between strength and score is =+ fitted as: yx=+0.0025 0.000151.

FigureFigure 13. 13.Correlation Correlation betweenbetween strength strength and and score. score. Figure 13. Correlation between strength and score. The strength criteria and the corresponding score criteria select the same samples. The result The strength criteria and the corresponding score criteria select the same samples. The result satisfiedThe ourstrength assumption. criteria and the corresponding score criteria select the same samples. The result satisfiedsatisfied our assumption.our assumption. 4.5. Marine Data Curve Fitting 4.5. Marine4.5. Marine Data Data Curve Curve Fitting Fitting 4.5. MarineIn this Data step Curvewe still Fitting use MATLAB_R 2015b to fit the ocean data curve. In thisInFigure stepthis step14 we shows we still still use marine use MATLAB_R2015b MATLAB_R data curve2015b and tothe to fitfit fitting thethe ocean function data data curve, curve. curve. which are fitting perfectly. FigureFigureFigure 1514 shows shows 14 shows the marine residual marine data plot.data curve Incurve the andtable,and thethe fitting fitting fitting function function function and curve, all curve, the which residuals which are arefitting are given. fitting perfectly. perfectly. FigureFigure 15 shows 15 shows the the residual residual plot. plot. In In the the table, table, fittingfitting function and and all all the the residuals residuals are aregiven. given.

Figure 14. Marine data curve and the fitting function curve. Figure 14. Marine data curve and the fitting function curve. FigureFigure 14. 14.Marine Marine data data curvecurve and the the fitting fitting function function curve. curve.

Figure 15. Residual plot. FigureFigure 15.15. Residual plot. plot. After several fitting attempts, we found that a 10-times function can fit the data better. Table 4 showsAfter fitting several function fitting and attempts, all the weresiduals. found thatIt is a important10-times function that an canoverfitting fit the data phenomenon better. Table may 4 After several fitting attempts, we found that a 10-times function can fit the data better. Table4 shows fitting function and all the residuals. It is important that an overfitting phenomenon may shows fitting function and all the residuals. It is important that an overfitting phenomenon may appear in a higher-order fitting, so we will try various ways to reduce overfitting phenomenon in the following works. Sensors 2017, 17, 2225 15 of 16

Table 4. Fitting function and all the residuals.

Fitting Function y = p1 × z10 + p2 × z9 + p3 × z8 + p4 × z7 + p5 × z6 + p6 × z5 + p7 × z4 + p8 × z3 + p9 × z2 + p10 × z + p11 P1 −1.1173 P2 0.75808 P3 8.3512 P4 −6.2657 P5 −20.714 Coefficient P6 17.92 P7 15.847 P8 −18.203 P9 6.6455 P10 −6.0711 P11 7.3533 Residual 7.2057 z (x-mu)/sigma mu 249.5 sigma 144.48

5. Conclusions and Future Work The traditional thermocline selection rule is “the stronger, the better”, i.e., choose a layer by its definition concerning temperature, depth and strength. In this article, we proposed a new thermocline-choosing method. On the basis of the traditional method, we add more characteristics including geography information, year/month information and salinity, and get a new sample criteria-score. In order to ensure thermocline-defining consistency between traditional selection method and the new method, we set up a one-to-one mapping relation from strength to score. With mapping strength being of critical value to score criteria, we can qualitatively and quantitatively analyze and the thermocline. The applicability of this method has been proved by the regional (79.5◦ S, 79.5◦ N, 180◦ W–180◦ E) experimental results. In the future, we will use hydrological data sets to make a decision tree. The “entropy value method” mentioned in this paper is on the basis of the information gain to divide characteristics aimed at each characteristic’s weight and each sample’s comprehensive score. The key to making a decision tree is the division of characteristics. ID3 decision tree learning algorithms based on information gain, C4.5 decision tree algorithms based on information gain ratio and CART decision trees based on Gini coefficients are the three commonly used divide methods. However, the ID3 algorithm is inclined toward characteristics with more dereferences, the C4.5 algorithm prefers those having fewer dereferences, and the CART algorithm is weak in continuous value prediction. Comparing all these methods, we should try to combine their advantages and balance their drawbacks. In terms of the decision tree, there always exists an “overfitting” phenomenon in its learning procedure. To lower the risk caused by overfitting, we plan to use pruning or the random forest method. These two methods also need to be analyzed and measured. The detailed steps about how to choose an appropriate method are of great significance.

Acknowledgments: The work was supported in part by the National Natural Science Foundation of China (51409117, 51679105). Author Contributions: Yu Jiang and Chengquan Hu conceived and designed the experiments; Yu Gou and Tong Zhang performed the experiments; Tong Zhang and Kai Wang analyzed the data; Yu Jiang and Yu Gou wrote the paper. Conflicts of Interest: The authors declare no conflict of interest.

References

1. Zhang, M.; Liu, J.; Mao, K.; Li, Y.; Zhang, X.; Shi, Y. The general distribution characteristics of thermocline of China Sea. Mar. Forecast. 2006, 23, 51–58. 2. Jiang, G.; Hao, S.; Tao, D. Monthly analysis of characteristics of northern South China Sea thermocline change. Mar. Forecast. 2011, 28, 40–45. Sensors 2017, 17, 2225 16 of 16

3. Wang, K. Influence of main ocean environments on the navigation of underwater vehicles. CAAI Trans. Intell. Syst. 2015, 2, 316–323. 4. Zhou, Z. Machine Learning; Tsinghua University Press: Beijing, China, 2016; pp. 10–92. 5. Hosoda, S.; Minato, S.; Shikama, N. Seasonal temperature variation below the thermocline detected by Argo floats. Geophys. Res. Lett. 2006, 33, 338–345. [CrossRef] 6. Yin, J.; Wang, Y.; Xu, J. Daya bay thermocline formation and its influence to the related environmental elements. Mar. Sci. Bull. 2006, 25, 1–8. 7. Chao, J.; Gao, X.; Feng, L. Shear instability of basic flow in a two-layer equatorial ocean model. Acta Meteorol. Sin. 2007, 65, 1–17. (In Chinese) 8. Chen, X.; Hu, B.; Li, W. The equator unstable wave numerical simulation analysis of the impact of air-sea interaction. Chin. J. Atmos. Sci. 2009, 33, 145–154. 9. Liu, C.; Köhl, A.; Liu, Z.; Wang, F.; Stammer, D. Deep-reaching thermocline mixing in the equatorial pacific cold tongue. Nat. Commun. 2016, 7.[CrossRef][PubMed] 10. Ge, R.; Qiao, F.; Yu, F. A method of calculating the shelf sea thermocline characteristics-step function approximation method. Adv. Mar. Sci. 2003, 21, 393–400. 11. Liu, J.; Li, D.; Ji, D. The methods of detecting thermoclines in oceans by AUV: A review. J. Ocean Technol. 2014, 33, 127–136. 12. Wells, N.C.; Josey, S.A.; Hadfield, R.E. Towards closure of regional heat budgets in the North Atlantic using Argo floats and surface flux datasets. Ocean Sci. 2009, 6, 59–72. [CrossRef] 13. Hadfield, R.E.; Wells, N.C.; Josey, S.A.; Hirschi, J.J.-M. On the accuracy of North Atlantic temperature and heat storage fields from Argo. J. Geophys. Res. Oceans 2007, 112.[CrossRef] 14. Park, J.J.; Kwon, Y.; Price, J.F. Argo array observation of ocean heat content changes induced by tropical cyclones in the north Pacific. J. Geophys. Res. Atmos. 2011, 116, 210–222. [CrossRef] 15. Resnyanskii, Y.D.; Tsyrulnikov, M.D.; Strukov, B.S.; Zelenko, A.A. Statistical structure of spatial variability of the ocean thermohaline fields from argo profiling data, 2005–2007. Oceanology 2010, 50, 149–165. [CrossRef] 16. Maze, G.; Mercier, H.; Fablet, R.; Tandeo, P.; Radcenco, M.L.; Lenca, P.; Feuche, C.; Le Goff, C. Coherent heat patterns revealed by unsupervised classification of Argo temperature profiles in the North . Prog. Oceanogr. 2017, 151, 275–292. [CrossRef] 17. Zhang, X.; Zhang, X.; Li, Y. Ocean thermocline eigenvalue analysis and calculation. Mar. Forecast. 2011, 28, 69–76. 18. Zhang, X.; Zhang, Y.; Nie, B. Comparative analysis of thermocline boundary between vertical gradient method and optimal segmentation method. Mar. Sci. Bull. 2008, 27, 20–26. 19. Jiang, B.; Wu, X.; Ding, J. Comparison of the calculation methods of the thermocline depth of the South China Sea. Mar. Sci. Bull. 2016, 35, 64–73. 20. Hao, J.; Chen, Y.; Wang, F.; Lin, P. Seasonal thermocline in the China and northwestern Pacific Ocean. J. Geophys. Res. Oceans 2012, 117, 2022. [CrossRef]

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).