On Deep Learning Models for Detection of Thunderstorm Gale 909

On Deep Learning Models for Detection of Thunderstorm Gale 909 On Deep Learning Models for Detection of Thunderstorm Gale Yan Li1, Haifeng Li2, Xutao Li2, Xian Li2, Pengfei Xie2 1 School of Artificial Intelligence, Shenzhen Polytechnic, China 2 Shenzhen Key Laboratory of Internet Information Collaboration, Harbin Institute of Technology, Shenzhen, China [email protected]; [email protected]; [email protected]; [email protected]; [email protected]* Abstract can be leveraged its detection. In [3], Yu et al. developed a method to detect thunderstorm gale with a The purpose of this paper is to perform a comprehensive combination of the mid-altitude radial convergence study on the performance of different deep learning (MARC) and bow-like echo features. Dong et al. [4] models for detection of thunderstorm gale. We construct found that vertical integrated liquid (VIL) is a a benchmark dataset from the radar echo images in distinguish factor for thunderstorm detection. If the Guangdong province of China. Each radar image is VIL is larger than 30kg/m2, the likelihood that partially labeled according to the wind velocities recorded thunderstorm gale may appear is high. Once the VIL is by meteorological observation stations. We design four larger than 40kg/m2, the confidence becomes very deep learning models to address the thunderstorm gale strong. In [5], Yan et al. showed that if the VIL and detection problem, including a simple convolution neural center of a storm decrease rapidly and simultaneously, network (CNN), a recurrent neural network (S-RCNN), a a thunderstorm gale is quite likely to appear. In [6], the time context recurrent convolutional neural network (T- correlation and connection between squall line and RCNN), and a spatio-temporal recurrent convolutional thunderstorm gale were studied. In [7], Wang et al. neural network (ST-RCNN). Ten traditional machine showed two important features for thunderstorm gale learning algorithms are selected as comparison baselines. detection: (1) the thunderstorm gale often appears in Experimental results demonstrate that four deep learning isolated storms; (2) the position of appear at the bow- models can achieved better detection performance than like echo region or the tail part of a jet inflow. In [8], traditional machine learning algorithms. Zhou et al. showed that four features are important to detect thunderstorm gale, which are a high radar Keywords: Radar echo images, Thunderstorm gale detection, Machine learning, Deep learning reflectivity factor, a high VIL, a fast-moving speed, a decrease of echo top and MARC. Liao et al. [9] developed a detection method based on the shape 1 Introduction features, VIL and wind fields. In [10-11], a fuzzy logic model was developed with six heuristic features for Accurate prediction the meteorological disaster is a thunderstorm gale detection. We can see that all the very important and meaningful task. Thunderstorm existing methods for thunderstorm gale detection are gale is typical disaster that often produces very badly quite simple, heuristic and experience based, which destructive damages to agricultures, industries and cannot deliver robust prediction performance. economics. However, the thunderstorm prediction is Moreover, all the methods require users to construct very challenging, compared to mesoscale meteorological features manually. phenomenon because of its small scale, rapid birth and With the rapid development and great success of growth. In the literature, many studies and methods machine learning and deep learning techniques, it is have been conducted and developed by meteorological interesting to study whether it is possible to leverage scientists to address the problem. such techniques for thunderstorm detection. Different A typical line of studies is to leverage the priori from the existing experience-based models, machine knowledge and experiences of meteorologist for the learning and deep learning models are able to detection. For instance, Johns et al. [1] found that the automatically optimize their parameters based on bow-like echo map features play a distinct and historical observations, namely training samples. important role in the thunderstorm gale detection. Moreover, the deep learning models can learn without Based on the extensive correlation analysis between a feature engineering procedure. thunderstorm gale and echo top, Darrah el al. [2] found In this paper, we focus on the investigation of deep an important phenomenon, namely the thunderstorm learning methods to the thunderstorm gale detection gale has a lower top than that of echo top of hail, which problem. We consider the investigation based on a *Corresponding Author: Xutao Li; E-mail: [email protected] DOI: 10.3966/160792642020072104001 910 Journal of Internet Technology Volume 21 (2020) No.4 benchmark data set from Guangdong province. As for minute, we obtain 20 radar images of 700×900. A lot a comparison, we also introduce ten conventional of automatic meteorological stations are distributed in machine learning approaches as baselines. The ten Guangdong province, which can record the wind approaches include Decision Tree Regressor (DT), velocity, rainfall, humidity, temperature, and air Linear Regression (LR), Ridge regression (Ridge), pressure. Each record can be corresponded to a pixel in Lasso regression (Lasso), Random Forest Regressor radar image. Based on such correspondence, we can (RFR), K-nearest Neighbor Regressor (KNNR), label the wind velocity of 1 km×1km regions and thus Bayesian Ridge Regressor (BR), Adaboost Regressor construct the training set. Our aim is to predict the (AR), Support Vector Regressor (SVR), and Gradient wind velocity of each region based on its radar pixel Boosting Regressor (GBR). To apply the ten models, features. This is essentially a machine learning ten important manual features are also constructed. problem, i.e., regress a real value from the radar pixel Different from conventional machine learning features to denote the wind velocity. Specifically, we appraoches, which needs manual features as a divide the radar image into MN× patches, and select prequiresite, deep learning methods are able to make K height level, and T time steps. By doing so, each predictions without a feature engineering step. Hence, MNKT××× sample can be represented as a tensor XR∈ . we design four deep learning architectures to address If we let W denote the corresponding wind velocity, the thunderstrom gale detection problem, including a the thunderstorm detection then aims to find a function simple convolution neural network architecture (CNN), WfX= (). With the prediction function, we are able a recurrent neural network architecture (S-RCNN), a time context recurrent convolutional neural network(T- to calculate the wind velocity and the detection is RCNN), and a spatio-temporal recurrent convolutional conducted by thresholding the velocity with 17m/s. neural network architecture (ST-RCNN). In the models, 2.2 Feature Engineering CNN simply fuses the spatial and temporal contexts for thunderstorm gale detection; S-RCNN and T-RCNN Next, we elaborate the feature engineering procedure recursively exploit the spatial context and temporal for machine learning approaches. In this paper, ten context, respectively; ST-RCNN recursively leverages important features are constructed by referring the both the spatial and temporal contexts. conventional experiences from meteorologists. The four deep learning models are compared with Radar echo intensity：This is original pixel value in the aforementioned ten traditional machine learning radar images, which is usually ranged in [0, 80] and the methods that use manual extraction features. unit is dBZ. Experimental results show that machine learning Radar reflectivity factor: The factor is proportional approaches can effectively identify thunderstorm gale to the spectral diameter of raindrop. The connection of from radar images, and the deep learning methods the feature to radar echo intensity is: perform better. We note that the paper is an extension dbz to our conference paper in [12]. The key difference is Z =10 10 that in our conference paper we mainly focus on the comparison of conventional machine learning Radar combined reflectivity: As there are K height approaches for thunderstorm gale detection, while in level radar images, we can calculate an image based on this paper we introduce and design some deep learning them. Each pixel in the image denotes the maximum models to address the problem. radar reflectivity factor value of the pixels located in the same position in the K radar images. The calculated 2 Methodology image is radar combined reflectivity. Vertical integrated liquid (VIL): The feature denotes the integration of possible precipitation in unit column 2.1 Problem Statement volume, which is computed as follows: The thunderstorm gale refers to the wind whose K −1 ZZ+ qh=××Δ∑3.44 10−6 (ii+1 ) velocity is larger than 17m/s. The thunderstorm gale Max2 i detection task is to find where a thunderstorm gale i=1 appears based on the radar images. In reality, the radar where Zi denotes the radar reflectivity factor of the i- images are switched by the local images captured by th height level, Δh is the height differences between multiple Doppler radars. For Guangdong, the switched i radar image covers 700 km × 900km, which is the i-th and (i+1)-th height level, K is the number of levels. composed of local images captured by seven radars. Radar echo top: This denotes the height that the Each radar scans every six-minute, and the scanned maximal radar reflectivity factor lies in. height is from 500m to 19500m, where the scanned The change of radar echo intensity w.r.t. time: The height interval is 1000m. In the radar images, each feature is calculated as: pixel represents a 1 km×1km region. Hence, every six- On Deep Learning Models for Detection of Thunderstorm Gale 911 Δ=dbz dbzii − dbz −1 randomly sampled subsets (from both instance and feature views) of training samples and final prediction dbz dbz where i and i−1 represents the radar echo is obtained by averaging the outputs of the multiple intensities of current and previous scans, respectively.

Load more