______PROCEEDING OF THE 26TH CONFERENCE OF FRUCT ASSOCIATION

Evaluating Distance Approximation for Implicit Fitting

MarinaGoncharova,AlexeiUteshev ArthurLazdin SaintPetersburgStateUniversity ITMOUniversity St.Petersburg,Russia St.Petersburg,Russia m.goncharova,[email protected] [email protected]

Abstract—The curve or surface fitting problem to the given Here ∇G stands for the gradient column vector and · is data set is treated as the one of comparison of the distance the Euclidean norm. function with its suggested approximation. The comparison is performed in terms of statistical characteristics for the two sets A new point-to-algebraic-manifold distance approximation of random variables. The approach is exemplified via the sets can be expressed by the formula generated by single ellipse or a pair of ellipses. d2(pi,G)= (3) T I. INTRODUCTION |G(pi)| ∇G (pi) ·H(G(pi)) ·∇G(pi) · 1+ G(pi) . The curve or surface fitting problem is one of the nonlinear ∇G(pi) 2 ∇G(pi) 4 optimization challenges with multiple applications in cluster- T Here stands for transposition, H(G) is the Hessian of G(X). ization, computer graphics and computer vision, image and video processing [1], [2], [3], [4]. Given data set The validity of this approximation has been verified in n [12] for the case of 2D and 3D quadrics [12] and for some P = {pi},pi ∈ R ,i= 1,N, n∈ {2, 3}, planar algebraic in [13]; in the latter paper, this visual the problem is stated as of selection the manifold best fitting comparison of the level curves dj (X, G)=const with the to P in the meaning of minimization of an appropriate metric. offset (equidistant) curves to (1). This manifold is frequently selected in an implicit form In the present paper we deal with the formula (3) validation G(X)=0,X∈ Rn,n∈ {2, 3} (1) for arbitrary algebraic curves via generation of data set with predefined estimations of the distance values. This approach where G stands for a polynomial, whereas metric is taken as should be treated as a generalization of analysis of the point- asum of squares of the (geometric) distances from the points to-ellipse approximations carried out in [5], [6], [7]. from P to (1): N 2 II. PLANAR CURVES AND VISUALISATION d (pi,G) → min . i=1 We start a visual comparison of the formulas (2) and (3) For practical reasons, the solution to this problem is sought in with the verified in [12] case of the quadric the set of polynomials of the lowest possible degree. However, G(X):=XT AX +2BT X − 1=0, even in the case deg G =2, i.e. quadric polynomials, solution is met several obstacles one of which is related to the problem T n×n n where A = A ∈ R ,n∈ {2, 3}, and {X, B}⊂R of distance evaluation from a point to a quadric. For the aimsof are the column vectors. Formula (2) is represented as parametric synthesis, the distance is to be represented explicitly as a function of parameters involved into the problem, i.e. 1 |G(pi)| d = · . the coordinates of a point and coefficients of G(X). Such 1 T (4) 2 (Api + B) (Api + B) a formula is hardly expected for the general case of G(X) and, therefore, in modeling practice, the true distance function and formula (3) is represented as is replaced by its easier evaluated approximations. Dozens of A T A A such approximations have been suggested in current literature 1 ( pi + B) ( pi + B) d2 = d1 1+ T G(pi) . (5) for the case of planar quadrics; for the qualitative comparison 4 [(Api + B) (Api + B)]2 of someofthem we refer to [5], [6], [7]. As an example we take the ellipse Extensions of these formulas to the 3D case or to the higher order planar curves is not always possible. An example of x2 G (x, y):= + y2 − 1=0 such a universal formula is given by the so-called Sampson’s 1 16 distance [8], [9], [10], [11]: with the aspect ratio = 4 to better visualize distortions, which |G(pi)| both formulas (4) and (5) produce. Here on Fig. 1 and Fig. 2 d1(pi,G)= . (2) ∇G(pi) and further in this section we show level curves dj (X, G)=

ISSN 2305-7254 ______PROCEEDING OF THE 26TH CONFERENCE OF FRUCT ASSOCIATION

Despite the fact that the distortions seem huge, the usage of Sampson’s distance in the fitting process on noised random point cloud based on curve G2(x, y)=0shows an appropriate result [3]. For the last example in this section we choose a well-known curve, so-called Folium of Descartes 3 3 G3(x, y):=x + y − 3xy =0. Fig. 1. Level curves d1 = const for ellipse G1(x, y)=0(indicated bold)

Fig. 2. Level curves d2 = const for ellipse G1(x, y)=0(indicated bold) const from 0.25 to 3.00 in increments of 0.25. Analysis of the appearance of distortions in certain areas was in [12]. Fig. 5. Level curves d1 = const for Folium of Descartes G3(x, y)=0 Fig. 3 and Fig. 4 illustrate how formulas (2) and (3) (indicated bold) perform in the case of the complex high-order reference curve [2] 4 x G (x, y):= y5 + x3 − x2 + +1 =0. 2 27 2

Fig. 6. Level curves d2 = const for Folium of Descartes G3(x, y)=0 (indicated bold)

The level curves for both formulas (2) and (3) have neither visible significant distortions nor excessive bias. In the following sections there will be more quantitative Fig. 3. Level curves d1 = const for high-order curve G2(x, y)=0 (indicated bold) analysis.

III. QUANTITATIVE ASSESSMENTS The best ellipse fitting problem has been intensively studied in the literature. Since we are able to treat the general type of , we concern ourselves with the case of a pairs of ellipses which is of importance to the multiple ellipse fitting problem [14], [15], [16]. We consider both cases, namely when the ellipses are separated or, on the contrary, have a nonempty intersection. The test data set is composed in the following way. We utilize the analytical expressions for the offset curves and compose a grid from these curves on varying the distance from 0.1 to 3.0 in increments of 0.1. We then find the intersections Fig. 4. Level curves d2 = const for high-order curve G2(x, y)=0 of these curves with lines passing through the center of one (indicated bold) of the ellipses. In this way, we generate about 3500 of points

------103 ------______PROCEEDING OF THE 26TH CONFERENCE OF FRUCT ASSOCIATION

with nearly equally distributed distance values in the interval IV. RESULTS [0.1, 3.0]. We treated two pairs of random ellipses, one pair is sepa- The quality of distance approximations by the formulas (2) rated, another one has nonempty intersection. The assessment and (3) will be estimated with the aid of characteristics similar measures discussed in the previous section was computed for to those utilized in [6]. every pair. In addition, in each case a level curves dj = const visualization similar to the one discussed above is provided.

A. Linearity A. Ellipses with nonempty intersection To find a linear correlation between the true distance value To produce the quantitative assessments we take the pair and its approximates, we compute the Pearson correlation of ellipses coefficient. The compute it, we first find the mean value μi 2 2 for distance approximation for the points of the set lying at G4(x, y):=(x +9(y − 20) − 3600)× the distance zi from the curve. (x2 + xy + y2 − 20x − 60y − 1000) = 0, scaled in comparison to those who were visualized on Fig. 7 (zi − zi)(μi − μi) L = i . (6) and Fig. 8. 2 2 i(zi − zi) i(μi − μi) Distance approximation (3) takes the form

+ M ·|G4| Together with L we also compute its counterparts L and d2 := √ L− for the equidistant curves lying outside or inside the treated 2 kN2 curve (assuming distances inside and outside are signed, or can here be made so). The closer to ±1 is the value (6), the closer to M := 814x12 + 6644x11y − 137200x11+ linear is the dependency of true value on the distance to it 76796x10y2 − 3201920x10y + 24152400x10+ approximation. 374388x9y3 − 23294560x9y2 + 410609200x9y− B. Curvature bias ... To investigate the deviation ranges of the approximations 5 4 from the fixed Euclidean distance, we calculate the variance +1643684348160000000y + 352168819200000000y − 2 σi for the points of each offset (equidistant) curve lying at the 202128721920000000000y3 − 26034048000000000000y2+ distance zi from the initial curve. Similar to those utilized in 2 [6] we combine local measures σi to give a global measure 10581580800000000000000y + 33592320000000000000000, 6 5 5 4 2 4 2 N := 17x +64x y − 1320x + 623x y − 26040x y+ C = σi . (7) i 178400x4 + 1344x3y3 − 83040x3y2 + 1392000x3y−

Together with C we also compute its counterparts C+ and C− ... for the equidistant curves lying outside or inside the treated curve. +1377y6 − 204120y5 + 9525600y4 − 117936000y3− The closer to 0 is the value (7), the better fitting quality 1257120000y2 + 18144000000y + 129600000000, can provide considered distance approximation. k := ((x2 +9(y − 20)2 − 3600)(2x + y − 20)+ 2x(x2 + xy + y2 − 20x − 60y − 1000))2+ C. Asymmetry 2 2 + ((x +9(y − 20) − 3600)(x +2y − 60)+ We compute the mean approximate signed distance μi and − 2 2 2 μi along corresponding offset contours inside and outside the (x + xy + y − 20x − 60y − 1000)(18y − 360)) . treated curve. Then we calculate asymmetry at each contour pair as the normalized difference in their mean approximate distances L L+ L− | + − −| TABLE I. NORMALIZED ASSESSMENT RESULTS , AND FOR μi μi ELLIPSES WITH NONEMPTY INTERSECTION ai = + − . μi + μi L L+ L− d1 1.0000000 1.0000000 1.0000000 Again, similar we produce a global measure d 2 0.9997189 0.9999210 0.9997744 A = ai i Looking at the Fig. 9 and the Table I we can note that both approximations (2) and (3) exhibit linear relationship with which would be better to be close to zero to produce the better Euclidean distance at least close to curve bounds. Formula (2) fitting by the considered distance approximation . outperforms formula (3) not only inside initial curve as we

------104 ------______PROCEEDING OF THE 26TH CONFERENCE OF FRUCT ASSOCIATION

Fig. 9. Mean distances μi of approximated distances d1 (black) and d2 (grey) along offset curves plotted against the corresponding Euclidean distances

Fig. 7. Level curves d1 = const for ellipses G4(x, y)=0(indicated bold)

 Fig. 10. Asymmetry A of approximated distances d1 (black) and d2 (grey) along offset curves plotted against the corresponding Euclidean distances

B. Separated ellipses Again, we take the pair of ellipses 2 2 G5(x, y):=(x +16y − 1600)× 2 2 − − Fig. 8. Level curves d2 = const for ellipses G4(x, y)=0(indicated bold) (x + xy + y 100x 130y + 3600) = 0 scaled in comparison to those who were visualized on Fig. 12 and Fig. 13. could suppose due to the visualization analysis, but the outside Here we give the implicit equation of equidistant curves in too. Global measure also is better for d1. the form As we might assume after viewing some visualizations Eqv(x, y, z)=0, that emphasize the unstable behavior of formula (3) inside the where curve, the best performed formula in the asymmetry category 8 6 2 6 6 4 4 should not be easily determined, but Fig. 10 shows that new Eqv := (x +34x y +28x z − 6200x + 321x y − point-to-algebraic-manifold distance approximation performed 486x4y2z − 150600x4y2 + 166x4z2 − 138200x4z+ well. 14410000x4 + 544x2y6 − 1506x2y4z − 777600x2y4+ Fig. 11 shows the variance of the distance approximations ... for each level contour. Variance increases fast with Euclidean distance from the curve, especially for the internal points. −107520000000y2 + 225z4 − 765000z3+

2 In addition, looking at the Table II, we can note, that 722250000z − 122400000000z + 5760000000000)× outside the initial curve formula (3) shows considerably better (9x8 +18x7y − 2640x7 +45x6y2 − 7740x6y − 27x6z+ results in curvature and asymmetry assessments. 403400x6 +54x5y3 − 14940x5y2 − 72x5yz+

  + − 5 5 5 4 4 TABLE II. NORMALIZED ASSESSMENT RESULTS A, C, C AND C 1369600x y + 7620x z − 42540000x +72x y − FOR ELLIPSES WITH NONEMPTY INTERSECTION ... A C C+ C− d1 1.0000000 1.0000000 1.0000000 1.0000000 +251965000000yz − 269836560000000y+ d2 0.1401102 0.5829589 0.2283491 0.8366350 9z4 − 185900z3 + 1331320000z2−

------105 ------______PROCEEDING OF THE 26TH CONFERENCE OF FRUCT ASSOCIATION

 Fig. 11. Variance C of approximated distances d1 (black) and d2 (grey) along offset curves plotted against the corresponding Euclidean distances

Fig. 13. Level curves d2 = const for ellipses G5(x, y)=0(indicated bold)

Fig. 12. Level curves d1 = const for ellipses G5(x, y)=0(indicated Fig. 14. Mean distances μi of approximated distances d1 (black) and bold) d2 (grey) along offset curves plotted against the corresponding Euclidean distances

3668093000000z + 2360923200000000). case. This may indicate that increasing number of treated ellipses will not lead to appearance of huge distortions as in All the quantitative measures were computed and repre- the case of curve G2(x, y)=0. sented at Fig. 14, Fig. 15, Fig. 16, Table III and Table IV. The The Sampson’s distance approximation linearity assess- description of the obtained results in general repeats the above ment twice was better, but the difference was not huge. We for the case of intersecting ellipses. We just emphasize slight can consider both formulas (2) and (3) as linear with respect difference among the variance values inside and outside the to the Euclidean distance. initial curve.

 + − In general, new point-to-algebraic-manifold distance ap- TABLE III. NORMALIZED ASSESSMENT RESULTS L, L AND L FOR SEPARATED ELLIPSES proximation have good properties close to ellipses and may be used in the curve fitting problems including multiple ellipse L L+ L− fitting. d1 1.0000000 1.0000000 1.0000000 d2 0.9993255 0.9994992 0.9993132 V. C ONCLUSION A C C+ C− TABLE IV. NORMALIZED ASSESSMENT RESULTS , , AND We discuss here the problem of evaluating the validity FOR SEPARATED ELLIPSES of a new point-to-algebraic-manifold distance approximation A C C+ C− formula. We represent the qualitative and quantitative com- d1 1.0000000 1.0000000 1.0000000 1.0000000 parison of the new point-to-algebraic-manifold distance ap- d2 0.1895881 0.3343521 0.4503979 0.3347797 proximation formula (3) with the well-known and widely applicable Sampson’s distance formula (2) . We concentrate to the potential benefits of application of the formula (3) in the C. Overview of results implicit algebraic curves fitting problem and in the multiple- Based on the visualization results, we outline the fact that ellipse fitting. Comparison is performed not only for a single in the both cases of multiple ellipses the features of level curves ellipse, which is usually done in other studies, but also for more dj = const do not change in comparison with a single ellipse complex curves - a pair of intersecting and non-intersecting

------106 ------______PROCEEDING OF THE 26TH CONFERENCE OF FRUCT ASSOCIATION

ACKNOWLEDGMENT This research was supported by the RFBR according to the projects No 17-29-04288 (Alexei Uteshev) and No 18-31- 00413 (Marina Goncharova).

REFERENCES [1] T.Hastie and W. Stuetzle, ”Principal Curves”, J. of the American Statistical Association, vol. 84, no. 406, 1989, pp. 502-516.

[2] M. Aigner and B. Juttler, ”Robust computation of foot points on  Fig. 15. Asymmetry A of approximated distances d1 (black) and d2 (grey) implicitly defined curves”, Mathematical Methods for Curves and along offset curves plotted against the corresponding Euclidean distances Surfaces 2004, 2005, pp. 1-10. [3] M. Hu, Y. Zhou and X.Li, ”Robust and accurate computation of geo- metric distance for Lipschits continuous implicit curves”, Vis. Comput., vol. 33, 2017, pp. 937-947. [4] C. Lu, S. Xia, M. Shao and Y.Fu, ”Arc-support segments revisited: an efficient high-quality ellipse detection”, IEEE Trans. on Image Process., vol. 29, 2020, pp. 768-781. [5] P.L. Rosin, ”Analysing error of fit functions for ellipses”, Pattern Recognition Lett., vol. 17, 1996, pp. 1461-1470. [6] P.L. Rosin, ”Assessing error of fit functions for ellipses”, Graphical Models Image Process., vol. 58, 1996, pp. 494-502. [7] P.L. Rosin, ”Evaluating Harker and O’Leary’s distance approximation for ellipse fitting”, Pattern Recognition Lett., vol. 28, 2007, pp. 1804- 1807. [8] P. Sampson, ”Fitting conic sections to very scattered data: an iterative  Fig. 16. Variance C of approximated distances d1 (black) and d2 (grey) refinement of the Bookstein algorithm”, Comput. Gr. Image Process., along offset curves plotted against the corresponding Euclidean distances vol. 18, 1982, pp. 97-108. [9] G. Taubin, ”Estimation of planar curves, surfaces, and nonplanar space curves defined by implicit equations with applications to edge and range ellipses and several higher-order algebraic curves with different image segmentation”, IEEE Trans. Pattern Anal. Mach. Intell., vol. 13, no. 11, 2002, pp. 1115-1138. properties. [10] G. Taubin, ”Distance approximations for rasterizing implicit curves”, A qualitative comparison of approximate distance formulas ACM Trans. Graph., vol. 13, no. 1, 1994, pp. 3-42. (2) and (3) is performed by constructing level curves di = [11] G. Taubin, F. Cukierman, S. Sullivan, J. Ponce, and D. Kreigman, const ”Parameterized families of polynomials for bounded algebraic curve for all the treated curves. The results are shown in Fig. and surface fitting”, IEEE Trans. Pattern Anal. Mach. Intell., vol. 16, 1 – 8, 12, 13. For distant from the initial curves level contours 1994, pp. 287-303. we detected that formula (2) have larger deviations outside, [12] A. Uteshev and M. Goncharova, ”Point-to-ellipse and point-to-ellipsoid and formula (3) inside the initial curves. distance equation analysis”, J.Comput. Appl. Math., vol. 328, 2018, pp. 232-251. A quantitative comparison of formulas was made for pairs [13] A. Uteshev and M. Goncharova, ”Approximation of the distance from a of ellipses according to the important practical measures - point to an algebraic manifold”, in Proc. of the 8th International Conf. linearity and curvature bias (both calculated inside and outside on Pattern Recognition Applications and Methods – Vol. 1: ICPRAM, the treated curve, and overall), asymmetry. This is carried 2019, pp. 715-720. out via the generation of the data set of test points with the [14] Q. Zhang, C.H. Jin, H.T. Xu, L.Y. Zhang, X.B. Ren at al., ”Multiple- known distance values. The calculated values of measures are ellipse fitting method to precisely measure the positions of atomic columns in a transmission electron microscope image”, Micron, vol. shown in Tables I – IV. Fig. 9 – 11, 14 – 16 demonstrates 113, 2018, pp. 99-104. the dependence of the studied measures on the corresponding [15] C. Panagiotakis and A. Argyros, ”Region-based Fitting of Overlapping Euclidean distance. The results of the comparison demonstrate Ellipses and its application to cells segmentation”, Image and Vision that formula (3) exceeds formula (2) in a number of criteria. Computing, vol. 93, 2020, 103810. [16] Y. Wu, H. Wang, F. Tang and Z. Wang, ”Efficient conic fitting with an For further research remains the inversion of this approach, analytical Polar-N-Direction geometric distance”, Pattern Recognition, i.e. the best manifold fitting problem for the given data set. vol. 90, 2019, pp. 415-423.

------107 ------