<<

INTERNATIONAL JOURNAL OF MATHEMATICS AND SCIENTIFIC COMPUTING (ISSN: 2231-5330), VOL. 6, NO. 1, 2016 13 On the Efficient Identification of an Inflection Point Demetris T. Christopoulos

Abstract—Our task is to find time efficient and statistically Our main task is to evaluate a total of six methods, in order consistent estimators for revealing the true inflection point of a to find the most efficient one and use it to estimate the mode planar curve when we have only a probably noisy set of points for when we have massive data sets. Paper structure: 1. introduc- it, thus we are introducing extremum surface (ESE) and distance estimator (EDE) methods. The analysis is based on the geometric tion, 2. presentation of ESE & EDE methods, 3. generalisation properties of the inflection point for a smooth function. Iterative to their iterative versions, 4. evaluation and application limits, versions of the methods are also given and tested. Numerical 5. an example of a real data set, 6. conclusion – proofs are experiments are performed for the class of sigmoid curves and given as Appendix. comparison with other available procedures is carried out. It is proven that both methods are quite fast in computational execution. Under a rather common noise type EDE can give II.EXTREMUM SURFACE AND DISTANCE ESTIMATOR a 96% confidence interval, while it always provides estimations METHODS for data with more than a million cases at a negligible execution A. Required concepts time. An alternative way of mode computation for a distribution by using its CDF is given as a real massive data example. Let a function f :[a, b] → R, f ∈ C(n), n ≥ 2 which is Index Terms—Inflection point identification, Efficient convex for x ∈ [a, p] and concave for x ∈ [p, b], p is the unique computation, Mode estimation, Iterarive methods. inflection point of f in [a, b] and let an arbitrary x ∈ [a, b]. Definition 2.1: Total, left and right chord are the lines MSC 2010 Codes – 62F12, 65D17, 65D99 connecting points {(a, f(a)), (b, f(b))}, {(a, f(a)), (x, f(x))} and {(x, f(x)), (b, f(b))} with Cartesian g(x), l(x) I.INTRODUCTION and r(x) respectively. Distances from chords are the functions F,Fl,Fr :[a, b] → R with

The sigmoid or S-shaped model is common in many dis- F (x) = f(x)−g(x),Fl(x) = f(x)−l(x),Fr(x) = f(x)−r(x) ciplines like artificial neural networks [10], utility theory [7] (1) & technological substitution [12], [14] in Economics, growth By using elementary Analytical Geometry we can prove that theory [2] & allosterism [1] in Biology, population dynam- b f (a) − a f (b) f (b) − f (a) ics [20], [18] in Ecology, autocatalysis [19] in Analytical F (x) = f (x) − − x (2) Chemistry, dose-response [13] in Medicine and many others. b − a b − a f (a) − f (x)) A key concept for every sigmoid is the inflection point and Fl(t) = f (t) − f (a) + (t − a) , t ∈ [a, x] (3) its accurate and fast computation while the available methods x − a can be classified to (i) adoption of a suitable process - test f (b) − f (x) Fr(t) = f (t) − f (x) + (t − x) , t ∈ [x, b] (4) function - then use of regression or maximum likelihood b − x estimation techniques, (ii) smoothing data under the restriction that second order divided differences change sign once [5], Definition 2.2: The s-left (sl(a, x)) and s-right (sr(b, x)) (iii) use of to define a proper measure are the algebraic surfaces of discrete and then searching for the least measure x b s (a, x) = F (t)dt , s (b, x) = F (t)dt (5) point [16], [9], Gaussian smoothing techniques [15] and shape l a l r x r R R analysis procedures [11]. The x-left (xl) and x-right (xr) are x–values such that At this paper we are presenting two geometric methods xl = argmin {sl (a, x)} , xr = argmax {sr (b, x)} in order to find the inflection point of a curve, when we x∈[a,b+δ1] x∈[a−δ2,b] know only a set of points {(xi, yi) , i = 0, 1, . . . , n} for it, (6) the Extremum Surface Estimator (ESE method) which works with δ1, δ2 > 0 taken as small as necessary for xl, xr to be with planar surfaces and the Extremum Distance Estimator unique unconstrained extremes in the corresponding intervals. (EDE method) which uses only planar distances. We do not A graphical illustration of the above defined left and right- perform neither regression nor splines representation analysis. terms is presented at Fig. 1 and 2 where we observe that In addition, no concepts like discrete or digital curvature are when x = xl, x = xr then we achieve the algebraic minimum defined or used. Instead we focus on finding each time two sl(a, x) and maximum sr(b, x), respectively. The motivation proper points where the true inflection point lies between and for the left- and right- naming was due to the origin of the then we take as an estimator their middle point, just like chords: left/right starts from the left/right edge of the graph. bisection method works in root finding. Definition 2.3: Normal algebraic distance of the curve point (x, f(x)) from the total chord Def. 2.1 is function N :[a, b] → Demetris T. Christopoulos is with the Department of Economics, R with National and Kapodistrian University of Athens, Greece. (E-mail: (f(b)−f(a))x−(b−a)f(x)+b f(a)−a f(b) N(x) = (7) [email protected]) − √(f(b)−f(a))2+(b−a)2 INTERNATIONAL JOURNAL OF MATHEMATICS AND SCIENTIFIC COMPUTING (ISSN: 2231-5330), VOL. 6, NO. 1, 2016 14

mean, for example the uniform U(−r, r) or normal N(0, σ2). If error is distributed without a zero mean, then results are ambiguous. Definition 2.4: For the noisy data Eq. 9 we define the discrete distances from total, left and right chord as the values

Yi = yi−g(xi),Yli = yi−l(xi),Yri = yi−r(xi), i = 0, . . . , n (10) Definition 2.5: Function f, around its inflection point p, is (a) x

f (p + x) + f (p − x) − 2 f (p) = 0, ∀x ∈ R (11)

and is called locally (ǫ, δ) asymptotically symmetric when:

|f(p + x) + f(p − x) − 2f(p)| < ǫ, ∀x ∈ (p−δ, p+δ) (12)

Definition 2.6: The function f :[a, b] → R, with respect to its inflection point p, has 1) Data symmetry if p − b = a − p (c) x = xl (d) x >xl 2) Data left asymmetry if p − b < a − p Fig. 1: Illustration of the algebraic surfaces sl(a, x) < 0 and 3) Data right asymmetry if p − b > a − p x . l The f :[a, b] → R is characterised as totally symmetric, if it is symmetric around p and has also data symmetry. Definition 2.7: For every subsequent xi < xj the elemen- tary trapezoidal estimation is

xj f(xi) + f(xj ) f(x)dx ≈ Ti,j (f, xi, xj ) = (xj − xi) Zxi 2 (13) And for every standard partition the total trapezoidal estima- tion is

b n−1 (a) x >p (b) x = p f(x)dx ≈ Tn+1(f, a, b) = Ti,i+1(f, xi, xi+1) (14) Z a Xi=0

B. The Extremum Surface Method We can prove that Lemma 2.1: The x-left (xl), x-right (xr) are abscissae such that left, right chord respectively are to the graph G(f). Corollary 2.1: For Eq. 6 it holds

′ f(x)−f(a) (c) x = xr (d) x 0 and ′ f(b)−f(x) xr = arg x f (x) = − xr. b x x∈[a−δ2,b] n o

with δ1, δ2 > 0 taken as small as necessary for xl, xr to be For a convex/concave curve the above definition gives N(x) < unique unconstrained solutions in the corresponding intervals. 0 when x < p and N(x) > 0 if x > p. For a not necessarily This tangency condition is illustrated at Fig. 1(c) and 2(c). strictly sorted grid of n + 1 not equally spaced points {xi} of Corollary 2.2: Let a function f :[a, b] → R, f ∈ Eq. 8 –standard partition– we add errors to the values of a C(n), n ≥ 2 which is convex for x ∈ [a, p] and concave for known function f and create the noisy data set {(xi, yi)} by x ∈ [p, b]. Then we have one of the following possibilities:

the process of Eq. 9. 1) If xl, xr ∈ [a, b] then a ≤ xr < xl ≤ b 2) If xl ∈/ [a, b] then xl > b a = x0 ≤ x1 ≤ ... ≤ xn = b (8) 2 3) If xr ∈/ [a, b] then xr < a yi = f(xi) + ǫi, ǫi ∼ iid(0, σ ) (9) We define the next theoretical estimator of the inflection point: Our analysis is applicable for every distribution with zero INTERNATIONAL JOURNAL OF MATHEMATICS AND SCIENTIFIC COMPUTING (ISSN: 2231-5330), VOL. 6, NO. 1, 2016 15

Definition 2.8: The theoretical extremum surface estimator (TESE) is

xl+xr 2 , xl, xr ∈ [a, b] b+xr xS =  2 , xl > b  (16)  xl+a  2 , xr < a Lemma 2.2: If the mesh λ(n) of the standard partition is 2 such that lim nλ(n) = 0 then Tn+1(y, a, b) is a consistent n→∞ estimator of the Tn+1(f, a, b). Now we are able to compute using our trapezoidal rule of Def. 2.7 data estimations for sl(x0, xj ) and sr(xn, xj ): Definition 2.9: The data estimators for the algebraic sur- faces of Def. 5 are

sl,j (x , xj ) = Tj (Yl, x , xj ) +1 0 +1 0 (17) Fig. 3: Illustration of xF 1, xF 2 and minF (x), maxF (x) sr,n−j+1(xn, xj ) = Tn−j+1(Yr, xj , xn)

Let us define data estimators for xl, xr. Definition 2.10: The χl, χr are values such that: Lemma 2.4: For the points of Eq. 21 it holds

′ χl = xjl , jl = argmin {sl,j+1 (x0, xj )} (18) f(b) − f(a) j∈[1,n] xF 1,2 = arg x f (x) = (23) x∈[a−δ1,b+δ2]  b − a  χr = xjr , jr = argmax {sr,n−j+1 (xn, xj )} (19) ∈ − j [0,n 1] with δ1, δ2 > 0 taken as small as necessary for xF 1, xF 2 to be We define now the noisy data estimator of the inflection point: unique unconstrained extrema in the corresponding intervals. Corollary 2.3: Let a function f :[a, b] → R, f ∈ (n) Definition 2.11: The data extremum surface estimator C , n ≥ 2 which is convex for x ∈ [a, p] and concave for (ESE) is x ∈ [p, b]. Then we have one of the following possibilities: χl + χr 1) If xF 1, xF 2 ∈ [a, b] then a ≤ xF 1 < xF 2 ≤ b χS = (20) 2 2) If xF 1 ∈/ [a, b] then xF 1 < a Lemma 2.3: The ESE is a consistent estimator of TESE 3) If xF 2 ∈/ [a, b] then xF 2 > b with all relevant integrals calculated via trapezoidal rule. We define the next theoretical estimator of the inflection point: We have to make a remark about the concept of convex or concave area, as is defined in [16] and as defined in this work. Definition 2.13: The theoretical extremum distance from There the concept of area that is computed is a summation of total chord estimator (TEDE) is such that the distances from a chord and curve, between two critical x + x x = F 1 F 2 (24) points, while our approach computes an actual geometric area D 2 by using the trapezoidal rule. Let now define data estimators of xF 1, xF 2 and xD. Definition 2.14: The data estimations of points defined at C. The Extremum Distance Method Eq. 21 are

Definition 2.12: The xF-left (xF 1), xF-right (xF 2) and xN- χF 1 = xj1 , j1 = argmin{Yj} (25) ∈ left (xN1), xN-right (xN2) are values such that: j [0,n]

χF 2 = xj2 , j2 = argmax{Yj} (26) xF 1 = argmin {F (x)}, xF 2 = argmax {F (x)} (21) j∈[0,n] x∈[a−δ1,b] x∈[a,b+δ2] Definition 2.15: The extremum distance from total chord xN1 = argmin {N(x)}, xN2 = argmax {N(x)} (22) x∈[a−δ1,b] x∈[a,b+δ2] estimator (EDE) is with δ , δ > 0 taken as small as necessary for x , x χF 1 + χF 2 1 2 F 1 F 2 χD = iff χF 2 ≥ χF 1 (27) to be unique unconstrained extremums in the corresponding 2 intervals. Lemma 2.5: The EDE is an unbiased estimator of TEDE. The defined points and the corresponding line segments can be At this stage we have to mention that there exists a similar found at the Fig. 3 where it is also obviously shown that when work with distances from the chord, see [9], where a summa- we achieve a stationary value for F (x) of Def. 2.1, then we tion of the relevant distances from the chord is taken in order achieve also the relevant stationary value for normal distance to define the concept of a discrete curvature for a planar curve. N(x) of Def. 2.3, since the vector defined from N(x) is just An analogous approach is that of [16], where it is also used a the orthogonal projection of the vector defined from F (x) at proper summation of the distances between the chord and the the normal vector to the total chord. Now we shall prove the curve points. Here we do not define and we do not compute next useful Lemma. any kind of curvature, but we just choose only the two extreme INTERNATIONAL JOURNAL OF MATHEMATICS AND SCIENTIFIC COMPUTING (ISSN: 2231-5330), VOL. 6, NO. 1, 2016 16

(0) (0) distances needed for Def. 2.12. Another interesting property of If and only if j2 > j1 , then we use again EDE method for EDE is that we can give a 96% Chebyshev confidence interval (0) (0) st data (xi, yi), i = j1 , . . . , j2 and find the 1 estimation, because we can find an estimation of the error variance using n o again iff (2) (1) next χF 2 > χF 2 Lemma 2.6: Let a data set created from a known function χ(1) + χ(1) i (1) (1) (1) (1) (1) F 1 F 2 f by using a strictly zig–zag process yi = f(xi) + (−1) ǫi [j1 , j2 ], χF 1 = xj(1) , χF 2 = xj(1) , χD = 1 2 2 for adding iid error terms ǫi ∼ U(0, r). Then it holds that (34) n 2 while we use the same stopping criterion as in BESE. In both \ 2 yi − yi−1 s2 = V ar(y − f(x)) ≈ (28) BESE & BEDE methods we begin from the initial interval n  2  th Xi=1 [a0, b0] = [a, b] = [x0, xn] and find the 0 estimation (0) (0) χS , χD ∈ [a0, b0], then we continue with a second interval It is easy to prove also the next (0) (0) Lemma 2.7: The variance of EDE method which is applied [a1, b1] ⊂ [a0, b0]– which is either [a1, b1] = [χr , χl ] (0) (0) st for a data set created using the requirements of Lemma 2.6 is (BESE) or [a1, b1] = [χF 1, χF 2] (BEDE)– and compute the 1 (1) (1) n 2 estimation χS , χD ∈ [a1, b1] and so on until the termination 1 1 y − y − s2 = s2 = i i 1 (29) criteria are applicable. So, in essence, the only difference from D 2 n  2  Xi=1 bisection method is that now edges of interval [ai+1, bi+1], which is the next step i + 1, are not taken by the requirement Corollary 2.4: The 96% Chebyshev confidence interval for (i) (i) (0) (0) EDE estimator of data under requirements of Lemma 2.6 is f(ai+1)f(bi+1) < 0 but are the [χr , χl ] or [χF 1, χF 2] outputs of the previous step i, see Fig. 4 for BESE iterations of − x n n −e5e 100 2 2 a Gompertz noisy data, created using  v 1 yi yi−1 v 1 yi yi−1  f(x) = 1000 e χD 5 u − , χD + 5 u − − u n X  2  u n X  2  starting from [a0, b0] = [347, 960] and adding error term  t i=1 t i=1  (30) ǫi ∼ U(−25, +25) at an equal spaced x–grid with n = 4086. For the same function but for [a0, b0] = [0, 1000] and an

III.ITERATIVE APPLICATION OF ESE&EDE METHODS

a0 a1a2 b b b Another important option is the possibility of iterations 800 2 1 0 similar to those of the well known bisection method in root i ε finding. Recall that for a continuous function if f(α) f(β) < 0 600 + i 1 100 x −

then exists such that , while its first e ξ ∈ (α, β) f(ξ) = 0 5 − e (0) a+b 400 estimation is χ = 2 . Our ESE method always gives an interval that contains the true inflection point p or a point close 1000 e to the edge a or b, if data is just convex (or just concave) and 200

inflection point does not exist. The EDE method also gives an 0

interval in most cases, although it is more sensitive to errors 350 400 450 500 550 600 650 700 xi of x0, xn, so it does not always give a point close to a or b, if simple convexity or concavity exist. Fig. 4: Bisection ESE (BESE) iterations for noisy data ESE iterative method or Bisection-ESE (BESE) We apply to the probably noisy data {(xi, yi), i = 0, . . . , n} 7 our ESE method and obtain the 0th estimation equidistant grid with n = 10 we apply BEDE without error term, just to show its fast convergence to the true (0) (0) (0) (0) (0) (0) (0) χr +χl inflection point, see Table I and Fig. 5, where we show [jr , jl ], χr = xj(0) , χl = xj(0) , χS = 2 r l labels only for the first four iterations. It is obvious that, (31) (0) (0) If and only if jl > jr , then we apply again ESE for data (0) (0) st (xi, yi), i = jr , . . . , j and obtain the 1 estimation l a0 a1 a2a3 b3 b2 b1 b0 n o 800 (1) (1) (1) (1) (1) (1) (1) χr +χl i (1) (1) 600 [jr , jl ], χr = x , χl = x , χS = 1 jr j 2 100 x l − e 5

(32) − e k k ( ) ( ) 400 We continue until jl < jr or until the number of data 1000 e pairs becomes five, since it is meaningless for the method to proceed with less than five points. 200

EDE iterative method or Bisection-EDE (BEDE) 0

400 500 600 700 We use for initial data {(xi, yi), i = 0, . . . , n} the EDE x th i method and find the 0 estimation iff χF 2 > χF 1 Fig. 5: Bisection EDE (BEDE) iterations for accurate data (0) (0) (0) (0) (0) (0) (0) χF 1 + χF 2 [j1 , j2 ], χF 1 = xj(0) , χF 2 = xj(0) , χD = 1 2 2 (33) although Gompertz sigmoid curve is not either symmetric INTERNATIONAL JOURNAL OF MATHEMATICS AND SCIENTIFIC COMPUTING (ISSN: 2231-5330), VOL. 6, NO. 1, 2016 17

TABLE I: First five BEDE iterations form a total of 21

(i) (i) (i) = + 1 i χF 1 χF 2 χD N n 0 372.2823 719.8335 546.0579 10000001 550 1 423.1383 603.1799 513.1591 3475513 2 453.8448 554.5334 504.1891 1800417 3 472.6048 530.1469 501.3759 1006887 4 483.9011 517.0116 500.4564 575422 300

5 490.6044 509.6994 500.1519 331106 Time in msec

80 around inflection point, thus our EDE method will not give

an initial accurate value, the BEDE convergences to the true ESE BESE EDE NLS BEDE L2CXCV value after just three iterations. The number of points that will be used at next step follow approximately the binary Fig. 6: Run times of all methods for 1000 Fisher-Pry sigmoid N0 noisy data sets with N=1381 points exponential decay formula Nk = 0.927 2k , k = 0, 1,...,, so it actually halved the number of points, after some initial steps. The needed time is always negligible, for example the totally elapsed time for all above computations was 1.58 s. Just for a comparison, if we try to use NLS non linear regression which uses SSlogis(x,Asym,xmid,scal){stats} NLS method of [17] to find the inflection point of our initial 107 +1 L2CXCV points we shall fail to obtain an answer, even if we use BEDE the SSgompertz(x, Asym, b2, b3){stats} method, since we EDE observe after 123.87 s that we cannot proceed further due to memory restrictions of R. So, even if we know the exact BESE underlying model, we are not able to find its parameters when ESE 480 490 500 510 520 we have a very large number of observations. Inflection

IV. EVALUATION AND APPLICATION LIMITS Fig. 7: Inflection point estimations for data of Fig. 6 We shall use 6 methods in order to find each time the inflection point, our newly introduced ESE,BESE, EDE,BEDE geometrical methods, the L2CXCV method presented at [5] and the NLS non linear regression which uses SSlogis{stats} method of R. All methods except last one 1500 have been implemented using double precision FORTRAN. Execution is done inside R environment by creating proper 900

dll’s using typical Intel Core i5 CPU with 4 GB RAM and a Time in msec 32 bit OS. We design our numerical experiments by using two suitable sigmoid functions of known inflection point p = 500, 150 an interval [a, b] which is every time the [1%, 99%] of their total capacity and we add a uniform error via ǫ ∼ U(−r, r) ESE EDE NLS i BESE BEDE L2CXCV the process Eq. 8 & 9. Our set of test functions has mem- bers the Fisher-Pry sigmoid with total symmetry, taken from Fig. 8: Run times of all methods for 1000 Gompertz sigmoid noisy data sets with N=1840 points [6], with use in technological substitution models f1 (x) = x 500+ 500 tanh 100 − 5 for x ∈ [a, b] = [270, 730] and the non symmetric Gompertz  sigmoid, after [8], which is often − x −e5e 100 used in population dynamics f2 (x) = 1000 e for x ∈ [a, b] = [347, 960]. The run times in milliseconds and the results for 1000 experiments are presented at Fig. 6, 8 and 7, NLS 9, respectively for the two curves. We have also computed the L2CXCV 96% Chebyshev confidence interval for our EDE Fisher-Pry BEDE estimations, see Fig. 10, where we observe that the intervals’ EDE average width is approximately 4% of inflection point value. Clearly the most time efficient method is EDE and its BESE iterative version BEDE, while the L2CXCV method needs ESE 470 480 490 500 510 520 530 540 more time and has also many outliers. As for the accuracy, Inflection we have to remind that our totally symmetric Fisher-Pry sigmoid has TESE & TEDE, see Eq. 2.8 & 2.13 equal to 500, Fig. 9: Inflection point estimations for data of Fig. 8 thus our methods give the theoretically predicted results. The INTERNATIONAL JOURNAL OF MATHEMATICS AND SCIENTIFIC COMPUTING (ISSN: 2231-5330), VOL. 6, NO. 1, 2016 18

same is true for the non symmetric Gompertz sigmoid, since TABLE III: NH annual average absolute temperature 1800– (TESE,TEDE) = (492.91, 528.77) and from either Fig. 9 2013 description statistics

or by computation we find their medians to be 493.21, 528.62 Min Q1 Median Mean Q3 Max respectively, very close to theoretical values. So BESE, BEDE ◦ C -29.050 4.950 9.192 9.667 14.520 32.100 methods give medians close to exact value p = 500, while L2CXCV is perfectly symmetric around it. Only the NLS method is totally away, since we use as default test function Our first approach will be based on Empirical CDF and application of BEDE method, see Table IV, where after four the SSLogis. We next study the time needed for all our newly ◦ presented methods, in comparison to the previously existed, iterations and just 0.08 sec we find M ≈ 9.279 C. If we use NLS/Logistic we find after 43.73 sec that a 95% as a function of the number N of (xi, yi) pairs. For this task confidence interval is (8.9894◦ C, 8.9928◦ C). Other methods we create using Eq. 8 & 9 with a = 270, b = 730, ǫi ∼ U(−10, +10) the Fisher-Pry sigmoid with total symmetry, are practically unavailable for the original data set. see Table II, where we observe that EDE,BEDE times are In order to apply them we start from the 13263 unique negligible, while ESE,BESE,L2CXCV are not. Observe temperatures, compute first order differences for them and keep only values that have an absolute difference less than than ≈ 10−13 (the FORTRAN ‘epsilon of the machine’). This is required for L2CXCV method because it works only with strict sorted abscissae. Now as an estimation for the unknown 510 CDF we are creating a cumulative frequency table with those 5968 temperature values as bins and use all methods presented 500500 at this paper to produce Table V and Fig. 11, where we have also estimated PDF using function bkde {KernSmooth} [17],

490 [22] with Gaussian Kernel. For a better visual comparison with CDF we have scaled PDF as shown in Fig. 11. Now

200 400 600 800 1000 the 95% confidence interval for NLS/Logistic is estimated to ◦ ◦ Case number be (9.2343 C, 9.2681 C), quite different and closer to other Fig. 10: 96% Chebyshev confidence interval for EDE at Fisher- methods. Pry sigmoid TABLE IV: BEDE estimations using ECDF of 388335 tem- peratures

(i) (i) (i) i χF 1 χF 2 χD N TABLE II: Execution times (in secs) with respect to N 0 -0.691667 19.483333 9.395833 388335 1 4.450000 12.316667 8.383333 310480 N EDE BEDE NLS ESE BESE L2CXCV 2 6.750000 10.683333 8.716667 173517 10001 0.00 0.00 0.44 3.69 4.43 11.15 3 7.841667 10.141667 8.991667 101947 30001 0.00 0.00 1.42 32.83 37.31 195.49 4 8.741667 9.816667 9.279167 63023 50001 0.00 0.00 2.36 101.99 120.77 596.87 70001 0.00 0.03 1.94 183.96 178.59 925.64 100001 0.00 0.01 2.80 342.20 311.20 1756.07 7 TABLE V: mode of annual absolute temperature for years 10 + 1 0.53 1.07 – – – – 1800–2013 in Northern Hemisphere ESE BESE EDE BEDE L2CXCV NLS here that for noisy data with 10 million xy–pairs we needed ◦ mode ( C) 10.550 9.371 9.400 9.371 9.283 9.251 just 0.53/1.07 sec using ESE, BEDE, while other methods were Time (sec) 1.460 1.640 0.000 0.000 12.850 0.410 practically non applicable.

V. A BIG DATA SET FROM REAL LIFE: FINDING THE MODE VI.CNCLUSIONO OF ANNUAL AVERAGE ATMOSPHERE TEMPERATURE FOR Previously available methods can be applied for relatively 214 YEARS small data sets, since their complexity do not allow them to The inflection point of a Cumulative DF (CDF) corre- always give an output in a short time. For example L2CXCV sponds to the maximum of Probability DF (PDF), since its requires strictly sorted abscissae and practically works until second is the first one of latter. An advantage of n ≈ 50000. The model dependent NLS many times is not able using CDF’s for big data comes from smoothing performed even to give a first output if the underlying f is quite different upon their estimation, since cumulation process absorbs local than the adopted model for it. Provided that an inflection point disturbances. We are using data that have been collected is present inside our noisy data set, then Extremum Surface and processed in [4]. Our task is to find the main mode of (ESE) and Distance (EDE) estimators can be used and give us average values for annual absolute temperature in Northern a value close to their theoretical expected TESE (Def. 2.8) or Hemisphere from N = 388335 rows of 1800–2013 yearly TEDE (Def. 2.8). If data follow a zig-zag pattern, then a 96% meteorological station observations. Their five point summary Chebyshev confidence interval (Eq. 30) is available for EDE can be found at Table III. method with relative small width. If the underlying function is INTERNATIONAL ierlal uptfrdt ihmr hnamlinrows million a The than second. more a with just data for in output reliable give computations Numerical that proceed. show accurately to able be and 12) aaaymty(e.26 hnw a s n fteiterative the of of one have use we can we if versions then or 2.6) 11) (Def. (Def. asymmetry point data inflection around symmetric not V and III Tables functions for density as cumulative data simple, mode, of and Estimations 11: Fig. neto on a lasb on,poie htexists. that provided and found, performed be easily always are can data point inflection real to Applications available. in implementations o using for p 10 x PDF and CDF PDF/max(PDF) and CDF where

0.27 0.37 0.47 0.57 0.67 0.0 0.2 0.4 0.6 0.8 1.0 3. 2. 1. 500050941. 002. 30.0 25.0 20.0 15.0 9.4 5.0 0.0 -5.0 -10.0 -20.0 -30.0 .07.00 5.00 BESE ESE f EDE/BEDE JOURNAL sa es oal smttclysmerc(Def. symmetric asymptotically locally least at is 10 xPDF CDF , b nyfrinterval for Only (b) & a l vial eprtr data temperature available All (a) BESE BEDE FORTRAN & OF akg inflection Package R r h ottm fcetadcan and efficient time most the are nodrt oet neighbourhood a to move to order in EDE MATHEMATICS .71.01.015.00 13.00 11.00 9.37 , , °C °C BEDE Maple [ 5 ◦ C, ehd with methods and 15 AND ◦ MATLAB C SCIENTIFIC ] saalbe[3] available is R r also are while , COMPUTING au efidta tholds it that find we value ofo h iert fepce au ehv also have we value expected of linearity that the from so − N − ucin oteminimum 2.1 the Lemma so of function, Proof recalling then is partition consistent. is estimator the the So of equal norm V not is or V partition mesh standard the if λ then case, second spaced the For holds it V n estimator of V variance the now V  ossetetmtr ftetrue thus the the for data, estimator of actual consistent the estimators a of consistent gives data estimation noisy trapezoidal the for rule then spaced, equal is siao ftaeodlcalculated trapezoidal of both interval estimator that the such If calculated. is trapezoidal integrals relevant that the and lmnaytaeodletmto is estimation trapezoidal elementary ✷ and Lemma, of value rightmost trapezoidal elementary the of variance estimation the computing by T χ ✷ case, the possible of every for Thus, x ( x x i ( i n r x w sn aerl,w find we rule, same using ow, +1 +1 2 i 1 ( n ( ( ( ( em 2.3 Lemma em 2.1 Lemma em 2.2 Lemma +1 + i x T T T T 2 2 +1 sahee when achieved is max = ) ( (ISSN: − 1 ( − − f 2 n i,i n i,i x 2 − s T ( a) E − x x +1 +1 ,a b a, f, l TESE x ′ scnaenw Smlri the is (Similar now. concave is i,i +1 +1 − i i x ( i f f y ( ,x a, i ) +1 ( ( i T s lim i a ′ 2 ( ( (  =0 ,a b a, y, b a, y, Leibniz 2231-5330), ( r ,x y, x y, x n σ ) x 2 ( + +1 ( V i ) ,...,n 2 ) = ) ,x y, 2 f V n ) ,x b, − ie yitgascluae i rpzia rule. trapezoidal via calculated integrals by given hsoretmtri nisd econtinue We unbiased. is estimator our Thus . →∞ i i ′ ( f ( ′ ( x , x , W ehv rvni em . httrapezoidal that 2.2 Lemma in proven have We T + ,a b a, y, x ( )) = )) ( y yuigE.3&5 mohesconditions smoothness 5, & 3 Eq. using By i x − ) i x i,i x , + i aetocss fsadr partition standard If cases. two have e )+ i i o vr subsequent every For 1 + ) +1 +1 sa nraigfnto,s h maximum the so function, increasing an is ≤ 1 2 +1 R ( V uefritga ifrnito efind we differentiation integral for rule x − 1 i x f x a +1 i VOL. x → = ) A +1 ( )) = )) l x ,b [a, i ( )= )) ( σ a) x , +1  i 2 T ,x y, 2 PPENDIX 2 )= )) − y χ n − ≤ r s i n x x = r +1 + If ]. − 6, l i ′′ λ i i f 1 +1 ∈ f x x , ( = ( NO. ( 0 V λ x ′ − ( χ ,x a, n ( ,a b a, y, ( x ( i x i x a l n 2 → ) n ytkn h expected the taking by and ,b [a, i  ) +1 n − a +1 ) i n 2 x ) i hni ses oso that show to easy is it Then . P 1, + sahee when achieved is 2 =0 h etotvleof value leftmost the , n P : l − ESE ( i r − P ( 4 ) n )= )) − =0 σ b 1 − f 2016 b x −→ 2 1 − 2 →∞ = ) ] − )=lim = )) 2 x n 1 i ROOFS E n E a < ′ > a) ( i a) 2 ( then T f 2 ) a x x x 2 2 ( ( ( ( = sacnitn estimator consistent a is i,i 2 T s dtettlvrac is variance total the nd a) x T = ) l T σ S σ 0 σ x l 0 x , n − +1 i,i i,i since , 2 − 2 ( T r +1 2  a) n . ,x a, = hn ealn same recalling then, rmorhypothesis. our from pof sn q 4). Eq. using –proof, f b r ESE i,i +1 +1 + ( →∞ x − ( n 2 ( ,x y, = T f x x i +1 ,a b a, y, a e’scompute s Let’ +1 epciey with respectively, ( ( ( ) x i i,i ) ,x y, x ) ,x y, ( 2 l x ( x  + − ) +1 s 2 a i sadecreasing a is ( − − ,x y, saconsistent a is ( i b x , x l x dw obtain we nd + x b ′′ x < − f a 2 − ( r i i i 2 1 ( l ( ) n ( x , x , t 4  a) n i − . ,x f, χ a) ,x a, a) +1 i − 2 a > p > x x , χ l 2 If 2 i i n find and i +1 V +1 σ l ) σ = = ) a χ , i i i = ) 2  2 x +1 +1 x , 2 ) ( )= )) σ )= )) , l y d l b r ,b [a, 2 = i ,b [a, = ) i ′ the , t +1 b > thus ( ) the x are = = = + 0 ✷ ) ) ]. ] . , . 19 INTERNATIONAL JOURNAL OF MATHEMATICS AND SCIENTIFIC COMPUTING (ISSN: 2231-5330), VOL. 6, NO. 1, 2016 20

′ Lemma 2.4 From Eq. 2 we see that F (x) = [15] F. Mokhtarian and A. Mackworth. Scale-based description and recogni- ′ f(b)−f(a) ′ f(b)−f(a) tion of planar curves and two-dimensional shapes. IEEE Transactions f (x) − − = 0 → f (x) = − while ′′ ′′b a b a on Pattern Analysis and Machine Intelligence, 8:34–43, 1986. F (x) = f (x), provided that smoothness conditions allows [16] M. Mokji and S. A. Bakar. Starfruit shape defect estimation based on us to perform differentiations. Thus both necessary and concave and convex area of a closed planar curve. Jurnal Teknologi, ✷ 48:75–89, 2008. sufficient conditions for extrema are satisfied. [17] R Core Team. R: A Language and Environment for Statistical Comput- ing. R Foundation for Statistical Computing, Vienna, Austria, 2015. [18] P. Turchin. Complex Population Dynamics: A Theoretical/Empirical Lemma 2.5 For all Yj , j = 0, 1, . . . , n it holds Synthesis. Princeton University Press, 2003. E (Yj ) = F (xj ), so if we take the noisy data instead [19] S. Upadhyay. Chemical Kinetics and Reaction Dynamics. Springer, 2006. of accurate, we have E min {Yj } = min {F (xj )} and j∈[0,n]  j∈[0,n] [20] P. Verhulst. Notice sur la loi que la population poursuit dans son accroissement. Correspondance mathematique et physique, 10:113–121, E max {Yj } = max {F (xj )} ✷ 1838. j∈[0,n]  j∈[0,n] [21] V. Volterra. Lesons sur la theorie mathematique de la lutte pour la vie. Gauthier-Villars Paris, 1931. Reissued 1990. Lemma 2.6 Due to the zig–zag process we can accept that [22] M. Wand and M. Jones. Kernel Smoothing. Chapman and Hall, 1995. yi+yi−1 2 is approximately equal to both f(xi), f(xi−1) and yi+yi−1 yi+yi−1 proceed to yi − yi−1 = yi − 2 + 2 − yi−1 ≈ i i+1 (yi − f(xi)) − (yi−1 − f(xi−1)) = (−1) ǫi + (−1) ǫi−1, yi−yi−1 1 i i+1 so 2 ≈ 2 (−1) ǫi + (−1) ǫi−1 and 2 yi−yi−1 1 2 1 2 1  2 ≈ 4 ǫi + 4 ǫi−1 − 2 ǫiǫi−1. Now due to the iid property of our error terms it holds n − 2 2 yi yi−1 1 2 2 ✷ n 2 ≈ 2 2 s − 0 = s iP=1    1 Lemma 2.7 From Def. 2.15, χD = 2 min(yi − 1 g(xi)) + 2 max(yi − g(xi)) and by using Eq. 28 we find 2 1 2 1 2 1 2 ✷ V ar(χD) = sD = 2 V ar(yi) + 2 V ar(yi) = 2 s .  

REFERENCES

[1] W. Bardsley and R. Childs. Sigmoid curves, nonlinear double-reciprocal plots and allosterism. Biochemical Journal, 149:313–328, 1975. [2] L. Bertalanffy. Principles and theory of growth. In Fundamental Aspects of Normal and Malignant Growth. New York: Elsevier, 1960. [3] D. T. Christopoulos. inflection: Finds the inflection point of a curve., 2013. Package version 1.1. [4] D. T. Christopoulos. Extraction of the global absolute temperature for northern hemisphere using a set of 6190 meteorological stations from 1800 to 2013. Journal of Atmospheric and Solar-Terrestrial Physics, 128:70 – 83, 2015. [5] I. C. Demetriou. L2CXCV: A fortran 77 package for least squares convex/concave data smoothing. Computer Physics Communications, 174(8):643 – 668, 2006. [6] J. C. Fisher and R. H. Pry. A simple substitution model of technological change. Technological Forecasting and Social Change, 3:5–88, 1971. [7] M. Friedman and L. Savage. The utility analysis of choices involving risk. Journal of Political Economy, 4:279–304, 1948. [8] B. Gompertz. On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. Philosophical Transactions of the Royal Society of London, 115:513–585, 1825. [9] J. H. Han and T. Poston. Chord-to-point distance accumulation and pla- nar curvature: a new approach to discrete curvature. Pattern Recognition Letters, 22:1133–1144, 2001. [10] J. J. Hopfield. Neurons with graded response have collective compu- tational properties like those of two-state neurons. Proceedings of the National Academy of Scientists, 81:3088–3092, 1984. [11] S. Loncaric. A survey of shape analysis techniques. Pattern Recognition, 31:983–1001, 1998. [12] C. Marchetti and N. Nakicenovic. The dynamics of energy systems and the logistic substitution model report. Technical report, International Institute for Applied Systems Analysis, RR-79-13, Laxenburg, Austria, 1979. [13] J. Meddings, R. Scott, and G. Fick. Analysis and comparison of sigmoidal curves: application to dose-response data. Am J Physiol Gastrointest Liver Physiol, 257:982–989, 1989. [14] T. Modis. Predictions. Simon and Schuster, New York, 1992.