Vol.19 No.1 JOURNAL OF TROPICAL METEOROLOGY March 2013

Article ID: 1006-8775(2013) 01-0059-08 INTERPOLATION TECHNIQUE FOR SPARSE DATA BASED ON INFORMATION DIFFUSION PRINCIPLE—ELLIPSE MODEL

1 1 1 2 Ren (张 韧) , HUANG Zhi-song (黄志松) , LI Jia-xun (李佳讯) , LIU Wei (刘 巍)

(1. College of Meteorology and Oceanography, PLA University of Science and Technology, Nanjing 211101 China; 2. School of Information Science and Technology, Southwest Jiaotong University, 610031 China)

Abstract: Addressing the difficulties of scattered and sparse observational data in ocean science, a new interpolation technique based on information diffusion is proposed in this paper. Based on a fuzzy mapping idea, sparse data samples are diffused and mapped into corresponding fuzzy sets in the form of probability in an interpolation ellipse model. To avoid the shortcoming of normal diffusion function on the asymmetric structure, a kind of asymmetric information diffusion function is developed and a corresponding algorithm-ellipse model for diffusion of asymmetric information is established. Through interpolation experiments and contrast analysis of the sea surface temperature data with ARGO data, the rationality and validity of the ellipse model are assessed. Key words: information diffusion; interpolation algorithm; sparse data; ellipse model CLC number: P731/O241.3 Document code: A

1 INTRODUCTION the ARGO data: the average array interval of ARGO floats was about 300 km, and there is only one profile Accurate and reliable oceanic observational data about every 10 days for each ARGO float. For are very important for marine research and regional oceanic and atmospheric research, the ARGO ocean-atmosphere numerical forecast. However, the data is sparse in space and discrete in time. For partial most difficult obstacle nowadays is the lack of key sea areas, such as the South China Sea and the oceanic environmental data and the lack of data northwest Pacific, the temporal-spatial resolution of analysis and information extraction technique. ARGO data is far from enough. Considering the ARGO (Array for Real-time Geostrophic factors of floats shift and the difference of measuring Oceanography) is a global oceanic observation and periods and regional distribution, the available ARGO investigation project proposed and supported by the data are even rare[2]. United States, Great Britain, France, Japan, China, Interpolation algorithms are common techniques and other countries. The goal of the ARGO project is for estimating and approaching the missing to establish a global real-time observational network information with nearby data. The current which consists of about 3000 automation floats and interpolation algorithms contain the Newton method, can quickly and accurately collect the global Lagrange method, spline interpolation, polynomial temperature and salinity profiling data and their interpolation, finite element method, weighing characteristics. The global ARGO observational function method, variational method, spectrum network can provide more than 10,000 temperature method, successive correction method, optimal and salinity real-time observational profiles monthly. interpolation and so on[3, 4], which can basically meet The ARGO data has such advantages as vast spatial the needs of the interpolation and fitting of large-scale range, long duration and sufficient observational atmospheric and oceanic data. factors and supports whole weather detection, which However, a key premise involved is that the can efficiently make up for the shortage of the oceanic above interpolated techniques need to be provided environment data, especially the observations in the [1] with sufficient data and related information. If the deep layers . observational data is too rare, the precision and However, there are some inherent limitations in reliability of the interpolation algorithm will be

Received 2011-04-21; Revised 2012-10-23; Accepted 2013-01-15 Foundation item: Project of Natural Science Foundation of China (41276088) Biography: ZHANG Ren, Ph.D., professor, primarily undertaking research on air-sea interaction. Corresponding author: ZHANG Ren, e-mail: [email protected] 60 Journal of Tropical Meteorology Vol.19 restricted greatly. There is universal scattered and Matrix probability density function estimation sparse observational data such as ARGO data in the through the information diffusion principle is called oceanic environment, but the general interpolation diffusion estimation. The exact definition of the techniques have serious limitations in dealing with diffusion estimation is defined as follows. If μ ( x) such type of data[5]. From the analysis above, it has important scientific meaning and practical value to is defined on a Borel measurable function ll− develop new interpolation techniques to address the in (−∞+∞, ) , d > 0 is a constant, x = i , then issue of sparse observation data. Ellipse interpolation d algorithm model—a new interpolation technique for n ˆ 1 ⎡⎤ll− i dealing with sparse data based on the information fl()= ∑ μ ⎢⎥ (1) diffusion principle—was proposed in this paper. ndi i=1 ⎣⎦ d is the diffusion estimation of the matrix probability 2 THE PRINCIPLE OF INFORMATION density function f ()l where μ ()x is the diffusion DIFFUSION INTERPOLATION function and d is the window width. 2.1 The Principle of information diffusion 2.2 The idea of information diffusion The technique of information diffusion is a concept of research and a mathematic model put In this paper, the idea of information diffusion is forward for solving the imperfect information existing introduced into the fitting and interpolation of sparse in evaluating strong natural disasters, such as data and a new interpolation technique—algorithm for earthquake, storm surge, mudslides, and etc[6], which information diffusion and interpolation, which is are characterized by high degree of severity and small suitable for sparse data and small samples, is samples. Information diffusion is a fuzzy mathematic proposed. method by optimizing the fuzzy information of 2.2.1 FUZZY MAPPING RELATIONSHIP BETWEEN INPUT samples in order to make up for the missing AND OUTPUT information and can effectively deal with the imperfect sample information by transforming single For an input-output system, Ω is denoted as the samples into fuzzy-set samples in the form of matrix, x as the input variable, y as the output [7] probability . Now, the information diffusion variable, X as the input set, and Y as the output technique is only used for the risk assessment of the set, viz. x ∈ X , y ∈Y , Ω=X ×Y . small-sample events[8-10], and its idea is suitable for solving the imperfect data such as sparse data Let f (,xy ) be the probability density function interpolation. of matrix Ω , then the density of condition probability of y is described as follows with Suppose that WWW= { 12L Wn} is the x = u . knowledge sample series, L the underlying domain, fYX| (|)yu= fuy (,) fuvdv (,) . (2) and the observational value of Wi is li . Let ∫ vY∈ x =−φ ()lli , then if W is imperfect, there is a Based on the fuzzy-set idea, the Ω input-output system is defined to be an output fuzzy set B under function μ ()x which can make the li point % a given input, the membership function of fuzzy set information (value-1) diffuse into l with μ x . ( ) B% is corresponding to the probability density The diffused information distribution pattern function of output, and the normalized results of the nn probability density function are a membership Ql()==∑∑μμφ ( x )() ( l − li ) can better function, i.e., the membership function of fuzzy set jj==11 B% is described as follows, describe the whole structure of W , which is called the information diffusion principle[6]. fuy(, ) fuvdv (,) ∫ fuy(, ) μ ()y ==vY∈ . (3) B% ⎧⎫max{}f (u ,y ) max⎨⎬fuy ( , ) fuvdv ( , ) yY∈ yY∈ ∫ ⎩⎭vY∈

Therefore, in the input-output system, for any given input x ( x ∈ X ), its whole potential output

60

No.1 ZHANG Ren (张 韧), HUANG Zhi-song (黄志松) et al. 61 can be denoted by the fuzzy set B% , which is the n Let qxy= μ , , marked as q , fuzzy mapping relationship between input and output. uvjk∑ uv jk() i i jk Generally, the probability density function i=1 st f (,xy ) of the matrix Ω is hardly acquired at hand tq= ∑∑ jk . By information diffusion, the but currently estimated by mass statistics of samples. jk==11 Under the condition of rare data information, it will estimation of probability density function at point still be given an approximated estimation of the (,)uv in the Ω input-output system can be distribution of whole probability density by the jk information diffusion based on a few sample data denoted as obtained. q fuˆ(,) v = jk . (4) jk t 2.2.2 ESTIMATION OF INFORMATION DIFFUSION FUNCTION S is a set of small sample series for a 2.2.3 MAPPING INTERPOLATION WITH INFORMATION DIFFUSION constructed Ω input/output system, denoted as Substitute the information diffusion estimation Sxyxyxy= {(11 , ),( 2 , 2 ),L (nn , )} . fˆ(,)uv of the probability density function into the For the lack of samples, the probability density jk function is often hardly constructed by current input-output mapping relationship formula (3). When common statistical analysis methods. the input is u , the output is as follows: In the principle of information diffusion, small fuyˆ(, ) samples series S can be regarded as the information B% ==μ ()yy y.(5) ∫∫B% ˆ points scattering in the input/output phase space yY∈∈ yYmaxfuy ( , ) yY∈ {} X ×Y . By point-set mapping, each sample data can be diffused as a fuzzy set with points of multiple (Note that the “ ∫ ” here is not the integration samples. Because of the unclear, blur and flexible symbol and “/” is not for operation of division, but a surrounding borderline, the information collectivity basic fuzzy-set symbol) provided by each sample is a fuzzy information set. Thus, based on the information diffusion principle, Addressing the unclear and blur borderline sparse sample data and limited sample information surrounding the sample point (,x y ), a controlling available, a fuzzy mapping relationship is established ii between the input and the output to determine point is introduced in the input space X and output interpolation results through de-fuzzy operation from space Y respectively, u j ( js=1, 2,L ) and the output. ( ), which are normalized discrete vk kt=1, 2,L 2.3 Algorithm of 2-D information diffusion information points in the input/output space. The sets interpolation for the input and output controlling points are separately marked as follows, There are many interpolation methods but current interpolation algorithms can hardly achieve the , . Uuuu= { 12,,L s} Vvvv= { 12,,L t } expected effect when the data sample is sparse or rare. Thus, the space of controlling points, UV× , is For 2-dimensional data field interpolation, if the longitude and latitude information are regarded as the structured in gridded distribution in the input-output input and the variable values as the output, the space. Through a point of information infusion, two-dimensional data field can be achieved by the information is reasonably and efficiently diffused into technique of information diffusion interpolation. the whole space of controlling points by a suitable The main idea of the information diffusion is as form (of information diffusion formula) to effectively follows. For an imperfect data sample, it can obtain capture and exploit the imperfect sample information. more information of the sample by using a suitable Denote A =Ω×UV × , define a mapping diffusion function μ()x . For the information μ :[0,1]A → for an argument field A to domain diffusion interpolation of sparse data, its goal is to [0, 1], so that seek an objective, accurate and efficient diffusion (,xy ),, uv∈→ Aμ x , y ∈ [0,1] ()uvjk () i i function and to achieve a reasonable diffusion and optimized approach for imperfect sparse data sample. where (,xy )∈Ω , uU∈ , vV∈ , j =1, 2, s , L In mathematic principle, the goal of the above kt=1, 2, , in=1, 2, , and μ x , y is information diffusion interpolation technique is to L L uvjk() i i called the information diffusion function. solve the small sample problem of the imperfect data, which is the chief difference from other interpolation methods.

61

62 Journal of Tropical Meteorology Vol.19

Detailed operational steps are described as , , X = [,XX12 ,,L Xs ]YYYY= [,12 ,L ,]t which follows. are also known as controlling points of the input Let sparse sample series factor. Set a controlling point for the input sample Sxyvxyvxyv= [(111 , , ),( 2 , 22 , ),L ,(nnn , , )] and VVVV= [,12 ,L ,r ] and then the interpolated field interpolate gridded coordinate: can be expressed as ⎛⎞fXYvˆ(( , ), ) FRBR==()% (μ ()) vvR =⎜⎟ij v (6) ij, ∫∫B% ⎜⎟ˆ vV∈∈ vVmaxfXYv ((ij , ), ) ⎝⎠vV∈ {} q str on an “average space model” and “two-point handy ˆ ij, k where fXYV((ij , ), k ) = , tq= ∑∑∑ ij, k , principle”, a simple calculation formula of the t ijk===111 diffusion coefficient can be deduced as follows. n ⎧0.8146(ba− ), n= 5 qxyv= μ (),, ( is=1, 2, , , ij,, k∑ Xij Y V k l l l L ⎪ l=1 ⎪0.5690(ba− ), n= 6 j =1, 2,L , t ), R denotes the de-fuzzy operation, ⎪0.4560(ba− ), n= 7 and μ ()x ,,yv is the diffusion function. ⎪ XYij, V k l l l hban= ⎨0.3860(−= ), 8 (8) ⎪0.3362(ba− ), n= 9 3 INFORMATION DIFFUSION ⎪ INTERPOLATION MODEL ⎪0.2986(ba− ), n= 10 ⎪ ⎩2.6851(ba− ) / ( n−≥ 1), n 11 3.1 Symmetrical information diffusion function— where bx= max{i } , ax= min{i } , n is the normal interpolation model 1≤≤in 1≤≤in The key of information diffusion is to construct sample amount, and the diffusion coefficient is related an exact and reasonable diffusion function. Huang[6] with the amount and the maximum /minimum of the deduced a simple but practical diffusion function by sample. simulating the molecule diffusion. In 2-D information diffusion interpolation of ()ux− 2 sparse data, a 3-D (2-D input and 1-D output) − ji 1 2 diffusion function is needed and the 3-D normal μ ()xe= 2h (7) uij 2π h information diffusion function is as follows. which is called the normal information diffusion function, where xi is the sample point, u j the controlling point, and h diffusion coefficient. Based 1 ()X −−xVv22()Yy− 2 () μ xyv, ,=−−− exp[iljl kl ]. (9) XYij, V k() l l l 3 222 (2)π hhhxyv 222hhhxyv

The normal information diffusion function shows 1 22 that the information diffuses and weakens μ =−+exp[ (x′′y )]. (11) 2π hh symmetrically with space around the data point. Take xy a 2-D normal information diffusion for example. Its The exponential part –( x ''222+=yr) shows function is that the information of sample points diffuses 1 ΔΔx22y symmetrically along all directions after the removal of μ =−−exp( ) (10) 222π hh h22 h the influence of the dimension and unit in the 2-D xy x y data field, and the 3-D diffusion function is also the where hx and hy are diffusion coefficients. same. Therefore, the normal information diffusion is Δx Δy homogeneous and symmetrical. Make x′ = , y′ = . It means that As the actual data samples (such as the 2hx 2hy atmospheric and oceanic observational data) often the removal of the influence of the dimension and unit have complicated structures and asymmetric will then transform the 2-D normal information characteristics, the normal diffusion model is only diffusion function as follows. suitable for the ideal condition and the asymmetric information diffusion model of approaching to the factual condition must be taken into account to

62

No.1 ZHANG Ren (张 韧), HUANG Zhi-song (黄志松) et al. 63 objectively describe and express the generalized, abnormal and asymmetric real data sample structure. y ykxb=+0 3.2 Asymmetric information diffusion—ellipse algorithm model To overcome the shortcomings of the normal O x information diffusion model, a new ellipse algorithm model based on the asymmetric information diffusion is proposed. The data point’s information often b diffuses and extends not in a round symmetrical form a but in an ellipse asymmetric form. That is, in some Figure 1. Sketches for normal (solid line, round) and directions, the information diffuses quickly, which is asymmetric (dashed line, ellipse) information diffusion. defined as the ellipse long axis, while in other directions it diffuses slowly (defined as the ellipse 3.2.1 2-D “ELLIPSE” DIFFUSION FUNCTION short axis). Thus, a “round” symmetrical normal In the 2-D data structure shown in Figure 1, if the information diffusion model can be developed into an direction of fast diffusion is corresponding with the “ellipse” asymmetric information diffusion model, ellipse long axis (line a) and the direction of slow shown in Figure 1. diffusion is corresponding with the ellipse short axis (line b), the information diffusion function can be transformed as follows:

111xy22 xy μ =−exp{2 [ ( ++−kk ) ( ) ]} (12) 21πλhhxy k + 22hhxy 22 hh xy where k is the slope rate of the ellipse long axis, distribution of the sample and the sample information called rotation coefficient, λ the rate square of the diffuses more quickly along the direction of the ellipse long axis, called a flex coefficient, and the ellipse long axis. Thus the probability of the sample function is called the ellipse asymmetric information point around the long axis is the largest and the sum diffusing function. of the square distance from each sample point (,xiiy ) to line ykxb= + 0 can be regarded as 3.2.2 k AND λ PARAMETERS minimum, viz. nn()kx+− b y 2 In the 2-D ellipse asymmetric information Qd==2 ii0 =min (13) ∑∑i 2 . diffusing function, two important parameters are ii==11k +1 introduced, i.e., rotation coefficient k and flex Thus, coefficient λ . k is the slope rate of the ellipse long axis, which is related with the xOy plane ⎧ ∂Q n 2(kx+− b y ) ==ii0 0 ⎪∂+bk∑ 2 1 ⎪ 0 i=1 ⎪ . (14) ⎨ n ⎡⎤2(kx+− b y )( k22 +− 1) 2 k ( kx +− b y ) ⎪∂Q ∑ ⎣⎦ii00 ii ⎪ i=1 ==22 0 ⎩⎪ ∂+kk(1) And it can be simplified into of the sample points to the short axis (line b) and to ⎧ 1 nn the long axis (line a), as shown in Figure 1. ⎪bykx0 =−()∑∑ii ⎨ n ii==11 (15) 3.2.3 MULTI-DIMENSIONAL ELLIPSE DIFFUSION FUNCTION ⎪ 222ˆˆ ˆ ˆ ˆˆ ⎩kxykxyxy⋅+⋅−−=∑∑ii() i i ∑ ii 0 The non-dimensional form of the 1 1 multi-dimensional (m+1–D) normal information ˆ , ˆ diffusing function is presented as follows. where xii=−xx∑ i yii=−yy∑ i, and n n m 1 ⎡⎤⎛⎞22. (16) μ =−+exp xyi′′ b0 and k can be solved. m+1 m ⎢⎥⎜⎟∑ 2π 2 hh ⎣⎦⎝⎠i=1 () yx∏ i The flex coefficient λ denotes the fat/thin i=1 degree of the ellipse, which can be expressed as the By introducing the flex coefficient, the“round” rate of the average (or maximum) distance from each normal symmetrical information diffusing function

63

64 Journal of Tropical Meteorology Vol.19 can be developed as a “ellipse” asymmetric m 2 x′ 2 information diffusing function. −−∑ i y′ is introduced to make the direction of m 2 i=1 λi 1 ⎡⎤⎛⎞x′ 2 μ =−+exp ⎢⎥⎜⎟i y′ .(17) fast/slow diffusion corresponding with the long/short m+1 m ∑ λ 2π 2 hh ⎣⎦⎝⎠i=1 i axis of the projection plane. Then, the coordinate of () yx∏ i i=1 xi′(1im= L ) is transformed in the following steps: The rotation factor (rotation coefficient) in the asymmetric information diffusing function ⎧xxii′′=+cosθ i y ′ sinθ i ⎨ . (18) ⎩yx′′=−iisinθ + y ′ cosθ i Then, 11mm j−1 μθθθθ=−−exp{ [xx′′ cos ( sin sin cos ) + m+1 m ∑∑λ ii jji∏ k 2π 2 hh iji==+11i ki=+1 () yx∏ i i=1 mimm −1 ′′22′ (19) yxysinθθij∏∏∏ cos ]−− [∑ ( iij sin θθ cos ) + cos θ i ] } ji=+111i=1 j = i = where using the Kriging interpolation, the normal x y interpolation model and the asymmetric "ellipse" x ′ = i , y′ = , diffusion model proposed in this work, respectively. i 2h 2h xi y Objective: To examine the accuracy and 1 k reliability of different methods using the 1% cosθθ== ,sin i , λ is the flex ii22i “observation” points to interpolate the original sea kkii++11 surface temperature data. coefficient, k the rotation coefficient. Thus the Figure 2 shows the results of the correlation i coefficient and the mean square error being compared multi-dimensional ( m +1 –D) asymmetric information between the interpolated SST field and the actual field; diffusing function was obtained. the information diffusion interpolation is more If there is a relationship between the input item accurate than the Kriging interpolation method, xi′ and x′j ()ij≠ , the formula above should be especially the asymmetric "ellipse" diffusion model transformed as follows: that has the maximum correlation coefficient and xx′′=+cosθ x ′ sinθ minimal mean square error, meaning that the effect of ⎪⎧ ii ijj ij the asymmetric "ellipse" diffusion model is the best. ⎨ . (20) ′′ ′ Experiment Two: ⎩⎪xxjiijjij=−sinθ + x cosθ Data: The area and time are the same as that in Experiment One but with the interpolated sample 4 ALGORITHM EXPERIMENT points increased by 10% by extracting 230 samples (10%) randomly from all data points as the To examine the practical effectiveness and “observed” data (the rest is treated as missing data), reliability of the information diffusion interpolation conducting interpolation experiments and comparative algorithm, the sea surface temperature (SST) analysis using the Kriging model, the proposed reanalysis data provided by the U.S. National Centers normal interpolation model and the asymmetric for Environmental Prediction (NCEP) / National “ellipse” diffusion model respectively, and examining Center for Atmospheric Research (NCAR) is used to the accuracy and reliability of different methods using make an interpolation test and comparative analysis the 10% “observation” points to interpolate the between different algorithms. original data SST. The year of the reanalysis data is 2009 and the Figure 3 shows the results of the correlation temporal resolution is monthly. Data coverage is the coefficient and the mean square error being compared marine area from 100 to 250°E and from 0 to 60°N; between the interpolated SST field and the actual field; the spatial resolution is 2°×2°, and there are a total of the effectiveness of different interpolation methods is 75 (zonal)×30 (meridional)=2,250 grids and nearly close and the difference is small (the maximum 2,000 data grids are left after deducting the land. difference of the correlation coefficient is 0.002 and Experiment One: the mean square error is 0.2, about an order of Data: 23 samples (about 1%) are randomly magnitude smaller compared with that of Experiment extracted from about 2,000 data points as “observed” One). In contrast, the Kriging interpolation method is data (the rest is treated as missing data); interpolation more effective than the information diffusion experiments and comparative analysis are conducted interpolation, meaning that the Kriging interpolation,

64

No.1 ZHANG Ren (张 韧), HUANG Zhi-song (黄志松) et al. 65 as a classical and mature interpolation algorithm, is reliable and effective.

Figure 2. Comparison of different interpolation algorithms (1% sample): Correlation coefficient (left panel); mean square error (right panel).

Figure 3. Comparison of different interpolation algorithms (10% sample): Correlation coefficient (left panel); mean square error (right panel).

Comparisons of the results of Experiment One observational data belong to the data field distributed and Two are shown as follows. The interpolation sparsely, which means that it is suitable to interpolate methods based on the small-sample information by the information diffusion interpolation method. diffusion proposed in this paper (particularly the Data: ARGO floats at 10-m depth on January 1, asymmetric ‘ellipse’ diffusion model) have shown 2009; range: 100 to 250°E, 0 to 60°N. obvious advantages in dealing with the interpolation Figure 4 shows the ARGO buoy data (The small of very sparse samples. With the increase of sample dots in the figure are for the buoy location, about 80 data, their advantages gradually diminish (some of them) and the 1°×1° grid of the temperature field degree of superiority still exists in tests with 2% to respectively using the symmetrical information 5% of the samples, figure not shown). In the diffusion and the asymmetric ‘ellipse’ model interpolation experiments above, 5% can be interpolation. There are more than 4,000 data grid considered as a critical point. For the sparse data field points in the region (excluding the land), so the result that is less than this criterion, the information is equivalent to interpolation of 2% of ARGO sparse diffusion interpolation model is more preferential than samples. In contrast, the ellipse model has a richer other techniques. For other types of sparse data set, description of the details of the results. the criteria should be a little different and evaluated As there is no operational reanalysis grid data set according to actual condition, but their conjunct of ARGO for contrast tests, the corresponding principle and core are suitable for analyzing and correlation analysis and error analysis were not done. processing the interpolation calculation of imperfect information situations such as sparse data. Algorithm application: The information diffusion interpolation is applied to the standardized processing of the actual field of ARGO scattered elements. As ARGO floats observe once every 10 days on average and much less ARGO floats data can be received within a day, ARGO

65

66 Journal of Tropical Meteorology Vol.19

Figure 4. Grid point interpolation field of ARGO observational data of sea surface temperature: Symmetrical information diffusion (left panel); asymmetric ellipse model interpolation model (right panel).

5 SUMMARY [1] XU Jian-ping. ARGO for Global Ocean Observation [M]. Beijing: Ocean Press, 2002: 16-18. Addressing the difficulties of the scattered and [2] SU Ji-lan. Understand Correctly to ARGO Plan [J]. Ocean Technol., 2001, 20(2): 1-2. sparse observational data in marine science, a new [3] WU Hong-bao, WU Lei. Methods for Diagnosing and interpolation technique based on the information Forecasting Climate Variability [M]. Beijing: Meteorological diffusion idea is proposed in this paper. Based on the Press, 2005: 186-188. idea of fuzzy sets and by constructing the spread [4] FENG Guo-lin, DONG Wen-jie. Nonlinear Space-Time function approximate to the goal data’s distribution Distribution Theory and Methods of Observational Data [M]. structure, the limited data samples are diffused and Beijing: China Meteorological Press, 2006: 115-116. mapped into the corresponding fuzzy sets in the form [5] ZHANG Ren, WAN Qi-lin, LIANG Jian-yin, et al. Scattered data optimization and imperfect information recovery of probability in the information diffusion in geoscience [J]. J. Data Acquis. Process., 2006, 21(2): interpolation model. To avoid the shortcomings of 209-216. asymmetrical data structure in the normal diffusion [6] HUANG Chong-fu. Risk Assessment of Natural Disaster function, a type of asymmetric information diffusion Theory and Practice [M]. Beijing: Science Press, 2006: 63-66. function is developed and a corresponding [7] HUANG C F. Information Diffusion Technique and Small asymmetric information diffusion algorithm-ellipse Sample Problem [J]. Inform. Technol. Decision Making, 2002, model is established. Contrastive experiment analysis 1(2): 229-249. [8] -quan, LI Ning. Quantitative Methods and shows that the chief advantages of the information Applications of Risk Assessment and Management on Main diffusion interpolation technique outdo other methods Meteorological Disasters [M]. Beijing: Beijing Normal and that it is suitable for dealing with the sparse University Press, 2008: 213-215. observational data (usually with less than 5% [9] ZHANG Ren, XU Zhi-Sheng, HONG Mei. samples). It provides reference to the analysis and Decision-making of marine environment risk for combined processing of small-sized samples and sparse data that operations based on imperfect data samples [J]. Milit. Operat. Res. Sys. Eng., 2009, 23(1): 48-52. actually exist in the natural sciences. [10] ZHANG Ren, XU Zh-sheng, HUANG Zhi-song. Information diffusion and its application to evaluating the Acknowledgement: The authors would like to extend their influence of atmospheric oceanic environment on shipborne gratitude to the Chinese center of ARGO Data for providing missile [J]. Fire Contr. Comm. Contr., 2010, 35(2): 41-44. the research data.

REFERENCES:

Citation: ZHANG Ren, HUANG Zhi-song, LI Jia-xun et al. Interpolation technique for sparse data based on information diffusion principle—ellipse model. J. Trop. Meteor., 2013, 19(1): 59-66.

66