Average-Case Analysis of Algorithms for Convex Hulls and Voronoi Diagrams

Rex Allen Dwyer

March 1988

CMU-CS-88-132

Submitted to Carnegie-Mellon University in partial fulfillment of the requirements for the degree of Doctor of Philosophy.

©1988 Rex A. Dwyer. Chapter 4 is reprinted from Journal of Applied Probability 25(4), ©1988 Applied Probability Trust. Chapter 7 is reprinted from Proceedings of the Second Annual Symposium on Computational Geometry, ©1986 Association for Computing Machinery, and Algorithmica 2(2), ©1987 Springer-Verlag.

This research was supported by the National Science Foundation under Grants DCR-8352081, DCR-8416190, ECS-8418392, and CCR-8658139.

Abstract

This thesis addresses the design and analysis of fast-on-average algorithms for two classic problems of computational geometry: the construction of convex hulls and Voronoi diagrams of finite point sets in Euclidean d-space. The main contributions of the thesis are:

• A new algorithm for enumerating the vertices of a convex hull that requires between O(n) and O(n^2) time on average for a set of n independent and identically distributed (i.i.d.) points. The exact running time depends on the input distribution. This algorithm is a useful preprocessing step for algorithms for the facet-enumeration and facial-lattice versions of the convex-hull problem.

• A new method for bounding the expected number of vertices of the convex hull of random points, new results on the asymptotic behavior of the expected number of vertices and facets of the convex hull of n i.i.d. points drawn from any of a wide variety of input distributions in d dimensions, and application of these results to the analysis of existing convex-hull algorithms. The distributions considered are: spherically symmetric distributions with algebraic, exponential, and truncated tails; certain uniform product distributions; and uniform distributions in d-polytopes. For all but the product distributions, it is shown that two well-known convex-hull algorithms require o(n^2) time on average; for some of the distributions, linear time suffices.

• A new method for analyzing the expected combinatorial complexity and other properties of a random Voronoi diagram in d dimensions, and new results on the expected complexity of the Voronoi diagram of n points chosen from the uniform distribution in the unit d-ball.

• A new linear-expected-time algorithm for constructing the Voronoi diagram of n points from the uniform distribution in the unit d-ball.

• A practical improvement to the classic divide-and-conquer algorithm for the two-dimensional Voronoi diagram that decreases its average running time from O(n log n) to O(n log log n) for n points from the uniform distribution in the unit square and similar distributions.

Acknowledgments

Forsan et hæc olim meminisse juvabit. Virgil, Æneid, I., 203.

There is no better place to begin than with thanks to my parents, my brother Jim, and my sister-in-law Helen. All four have given the same loving attention to this endeavor as to all my others, and, as in all the others, I have benefited from my brother's trailblazing.

Although their contributions to my graduate-school career and thesis are indirect, I would like to mention some mathematicians and computer scientists who made important impressions on me. The late Rebecca Nelson, my fifth-grade teacher, taught me that mathematics is enjoyable. Mrs. Nelson became an outstanding professor of mathematics education and friend of mathematically gifted children. Her untimely death of cancer last year at the age of 43 grieved me greatly. Louis J. Cote and the late Gerhard Wollan edited the Indiana School Mathematics Journal during my high-school years. The problems they posed and the readers' solutions they published gave me the pleasure of seeing a few of my own words in print for the first time. Dan Friedman's enthusiastic teaching in my first computing course at Indiana University showed me that there is much more to computer science than base-two arithmetic. My associations with him and with David Wise gave my time in Indiana's Computer Science Department much of its significance.

The technical guidance and friendship I have received from my advisor Danny Sleator have been indispensable. I admire the breadth and depth of his knowledge not only of computer science and mathematics but of the sciences in general. Without his encouragement and his challenges to my beliefs about my own abilities, I might well have given up this project long ago.

Bill Eddy performed the functions of a co-advisor and certainly would have been one officially if his campus address were not "Statistics Department". His almost collegial attitude toward me in the past two years has been as gratifying as it is incongruous. My technical discussions with him have been extremely valuable. He read drafts exceptionally promptly, and many parts of this thesis are better because of his suggestions. I would particularly like to thank him for pushing me a little harder in the direction that eventually led to the results of Chapter 6.

Dana Scott's most tangible contribution to the thesis is its organization on the large scale. An hour or so in his office changed an overly hierarchical outline with two unmanageable chapters into a more flexible one with eight. I will also remember his moral support during difficult times.

In addition to giving encouragement over the past year-and-a-half, Doug Tygar stressed the importance of a well-written introduction and gave useful advice on the presentation I made at my thesis defense.

Ken Clarkson of AT&T Bell Laboratories served as outside examiner on my thesis committee. His particular expertise in probabilistic aspects of computational geometry makes his interest in my work particularly gratifying. His suggestions are reflected throughout the thesis. His help with the proof of Lemma 6.7 was especially important.

To department chairman Nico Habermann I owe thanks not only for managing human and material resources to provide an excellent working environment for everyone, but for the personal interest he took in resolving difficulties affecting my thesis progress when they arose.

Ravi Kannan supervised my research for three years until changing interests and physical separation made this impractical. His comments on drafts of Chapter 7 were quite valuable, and our discussions of the problems addressed in Chapter 4 girded me for my final attack on them.

Jon Webb's code for the Guibas-Stolfi algorithm formed the basis of the experiments leading eventually to the results of Chapter 7. Steve Shreve of the Mathematics Department helped me prove Lemma 7.7. Ignace Kolodner of Mathematics gave tips on evaluating the integrals of Chapter 6. Luc Devroye of McGill University corrected an error in the proof of Theorem 4.5. An anonymous referee's comments gave insights leading to a shorter proof of Theorem 4.2 with smaller constant factors.

In professor Steve Brookes and fellow students Claire Bono, Mark Derthick, Guy Jacobson, Craig Knoblock, Kevin Lang, fellow Hoosier Cathy McGeoch, Lyle McGeoch, Francesmary Modugno, Harry Printz, Brad White, and Ed Zayas I found fascinating and supportive friends, and their interest in my progress and well-being will not be forgotten. Among friends outside the Computer Science Department, recovered graduate students Frances Dannenberg, Wen-Ling Hsu, and Philip Long have been particularly encouraging.

Contents

1 Introduction and Summary
1.1 Examples of Convex-Hull Calculations
1.2 An Example of Voronoi Diagram Calculations
1.3 Thesis Summary

2 The Convex Hull Problem
2.1 Elements of the Combinatorial Theory of Polytopes
2.2 The Convex-Hull Problem
2.3 Algorithms for the Convex-Hull Problems

3 A Probabilistic Approach to the Convex-Hull Problem
3.1 Elements of Probability Theory
3.2 A Fast-on-Average Vertex-Enumeration Algorithm
3.3 A Method for Bounding EFn
3.4 A Method for Bounding EVn Above
3.5 A Method for Bounding EVn Below

4 Convex Hulls of Samples from Polytopes
4.1 Upper Bounds on Vertices
4.2 Lower Bounds on Vertices
4.3 Upper Bounds on Facets

5 Convex Hulls of Samples from Spherically Symmetric Distributions
5.1 A General Framework for Spherical Distributions
5.2 Distributions with Algebraic Tails
5.3 Distributions with Exponential Tails
5.4 Distributions with Truncated Tails
5.5 Uniform Distributions in Products of Balls

6 Voronoi Diagrams of Samples from a Hypersphere
6.1 A General Method for Bounding the Expected Complexity of Voronoi Diagrams
6.2 Bounds for the Uniform Distribution in a d-Ball
6.3 A Fast Algorithm for the Unit d-Ball

7 Voronoi Diagrams in the Plane
7.1 Preliminaries
7.2 A Faster Algorithm and Its Worst-Case Running Time
7.3 Analysis of Expected Time
7.4 Extension to the Lp Metrics
7.5 Experimental Results

8 Conclusions and Conjectures

A Notation, Constants, and Identities
A.1 Asymptotic Notation
A.2 Symbols
A.3 The Gamma and Beta Functions
A.4 Slowly Varying Functions
A.5 Geometric Constants and Functions

Chapter 1

Introduction and Summary

This thesis addresses the design and analysis of fast-on-average algorithms for two of the classic problems of computational geometry: the construction of the convex hull and the construction of the Voronoi diagram of finite point sets in Euclidean d-space. Computational geometry as a subject is concerned with the application of computers to the solution of geometric problems. The historical antecedents of computational geometry have been described in some detail elsewhere [73]; it may be regarded as the confluence of complexity theory and algorithm analysis with algebraic and combinatorial geometry. Although M. Shamos was far from the first to apply a computer to a geometric application, his thesis [84] was a landmark in defining the field as a branch of analysis of algorithms. His more recent book [73], written with F. Preparata, gives a good flavor of both the problems and techniques of computational geometry. (A broader, more concise survey has been made by Preparata & Lee [61].) Briefly, computational geometers concern themselves with geometric objects that have finite descriptions (points, lines, line segments, polygons, circles, halfspaces, polytopes, splines, etc.) and with tasks involving these objects such as determining intersections, relative orientation, or proximity.

Like other algorithm analysts, computational geometers are interested in the amount of time and (to a lesser extent) memory space their algorithms consume, expressed as a function of input size. Unlike others, they generally adopt a real-RAM model of computation, assuming that each word of computer memory can hold an arbitrary real number, and that arithmetic operations on reals (and often other functions like truncation, square root, cosine, etc.) can be performed in unit time. This model is realistic in that geometric algorithms are typically implemented by programs that use floating-point numbers and that floating-point numbers can be operated on in constant time.
It is unrealistic in that floating-point numbers only approximate real numbers with limited accuracy. Issues of numerical precision, overflow, and underflow, however important they may be in practice, are usually thought to be side issues and left for numerical analysts. For better or for worse, it is precisely this attitude that is adopted in the present work.

Again like other algorithm analysts, computational geometers have generally preferred worst-case over average-case analysis. There are several reasons for this. First of all, as its name implies, worst-case analysis specifies hard outer limits on how poor an algorithm's performance can be on any input of a given size. Other things being equal, "certainly" is better than "probably" or "typically". If the certain worst-case performance bound is optimal or even just very good (as it often is, especially for two-dimensional problems), average-case analysis is superfluous.

Secondly, average-case analysis is more difficult than worst-case analysis, and a worst-case analysis of all or part of an algorithm must usually be at hand in order to carry out an average-case analysis. As a rule, average-case analysis is carried out only for simple sorts of input distributions, most often uniform distributions. In this connection we may also point to a sort of cultural bias among computer scientists toward discrete mathematics and away from the sort of continuous mathematics that often facilitates average-case analysis even when it is not absolutely necessary. (For examples of results hard-won with purely combinatorial analysis and later rederived more easily and more precisely with continuous methods, see the work of Bentley et al. [4] and Devroye [24] on maximal vectors or the work of Bollobás & Simon [9] and Frieze [46] on priority queues.)

Thirdly, average-case analysis requires the assumption of an underlying distribution on the set of possible inputs. It is nearly always debatable whether any particular distribution for which average-case analysis is tractable (or merely feasible) is the "correct" one in the sense of modeling "real" inputs accurately. Of course it is also likely that different applications supply different definitions of "real" and "correct". In many cases it is difficult to come up with any non-trivial distribution for which analysis seems feasible.

Nevertheless, the two problems addressed in this thesis stand among those for which probabilistic analysis is both tractable and desirable. It is tractable in part because the input consists of points rather than more complex objects such as polygons. It is desirable because the best-case and worst-case performance of known algorithms differ so greatly. For concreteness let us briefly consider the gift-wrapping algorithm for the convex hull of n points in $\mathbb{R}^d$. Expressed only as a function of input size, its running time is $O(n)$ in the best case and $O(n^{1+\lfloor d/2 \rfloor})$ in the worst case, an important difference even when d = 2. In fact, this algorithm is output-sensitive: the size of its output, h, depends not only on n but also on the particular input points, and its running time depends on h as well as n. Its running time is $O(nh)$, and h can vary from $O(1)$ in the best case to $O(n^{\lfloor d/2 \rfloor})$ in the worst. These time estimates are nearly worthless for any practical purpose. However, an estimate of Eh, the expected (average) value of h, for a distribution that models typical inputs in our application would yield a very precise estimate of expected running time. Using the methods of geometric probability theory, it is possible to do this for many distributions; it is precisely this program that is undertaken in Chapters 4 and 5. There we will see that Eh can be as small as $\Theta(1)$, but it is not known to be larger than $O(n)$ for any natural distribution. We can reasonably anticipate $O(n^2)$ performance on typical inputs.
A similar investigation on a smaller scale is carried out for Voronoi diagrams in Chapter 6.

Given an estimate of expected output size that is significantly better than the worst-case bound, we may also attempt to modify known algorithms to improve their average running time. Bentley & Shamos' randomizing divide-and-conquer technique [5] as applied to vertex enumeration in Chapter 3 succeeds in reducing average time without exploiting specific knowledge about the input distribution. Other techniques, such as the bucketing techniques employed in Chapters 6 and 7, are optimized for a specific distribution. Obviously improvements of the first sort are preferable, but the latter are also useful.

The next two sections sketch out the principal probabilistic methods applied in later chapters in an easier-to-understand two-dimensional setting. The final section of this chapter summarizes the contributions of the thesis chapter by chapter.
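The output-sensitive O(nh) behavior of gift-wrapping mentioned above can be seen in a minimal two-dimensional sketch (the planar variant is often called the Jarvis march). The Python code below is an illustration only, not the d-dimensional algorithm analyzed in this thesis; the names `cross`, `dist2`, and `gift_wrap` are hypothetical, and ties among collinear points are broken by taking the farthest point, which is an assumption of this sketch.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Twice the signed area of triangle (o, a, b); positive if counter-clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between a and b."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def gift_wrap(points: List[Point]) -> List[Point]:
    """Return the hull vertices in counter-clockwise order.

    Each vertex is found by one O(n) scan, and h vertices are found,
    so the total time is O(nh).
    """
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    start = pts[0]  # the leftmost (then lowest) point is certainly a vertex
    hull = []
    current = start
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            c = cross(current, candidate, p)
            # p lies to the right of current->candidate, or is collinear and farther
            if c < 0 or (c == 0 and dist2(current, p) > dist2(current, candidate)):
                candidate = p
        current = candidate
        if current == start:
            break
    return hull
```

For example, `gift_wrap([(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)])` performs four scans, one per hull vertex, and discards the interior point.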

1.1 Examples of Convex-Hull Calculations

The convex hull of a set of points is the smallest convex set containing all the points. A two-dimensional point set and its convex hull are pictured in Figure 1.1. In the planar case we may imagine that the points are nails driven into a board. To find the convex hull, we stretch a rubber band out far enough to surround all the nails and let go. The rubber band comes to rest on some of the nails. The area surrounded by the rubber band is the convex hull, the nails touched by the rubber band are its vertices, and the straight segments of rubber band running between pairs of nails are its facets. In d dimensions we must imagine a (d - 1)-dimensional sheet of rubber topologically equivalent to the surface of a (d - 1)-dimensional sphere. The facets are defined by (at least) d vertices and are (d - 1)-dimensional polytopes. Dozens of papers have been published on algorithms for planar convex hulls; Preparata & Shamos [73] give an overview.

Figure 1.1: A set of points and its convex hull.

The two quantities of greatest interest for analyzing algorithms are $EV_n$, the expected number of vertices, and $EF_n$, the expected number of facets of the convex hull of a set of n independent and identically distributed (i.i.d.) points. In two dimensions, $V_n = F_n$, but in higher dimensions, $F_n$ can be as large as $V_n^{\lfloor d/2 \rfloor}$. It is instructive to examine different methods for bounding $EV_n$ and $EF_n$ in a two-dimensional context before proceeding to the greater complexity of higher dimensions. For the sake of concreteness, let us consider n i.u.d. (independent and uniformly distributed) points in the unit circle, one of the distributions investigated in Rényi & Sulanke's seminal paper [75] on this topic.

First, we bound $EF_n$ using a method developed in the plane by Rényi & Sulanke [75,76] and extended by Efron [39] and Raynaud [74] to higher dimensions. Let $P_{12}$ be the probability that the first two points, $X_1$ and $X_2$, define a convex hull facet. This probability is the same for any other pair of points, so

\[ EF_n = \binom{n}{2} P_{12}. \]

Fix $X_1$ and $X_2$ at $x_1$ and $x_2$. The probability that $x_1$ and $x_2$ form a facet is just the probability that the other n - 2 points lie on the same side of the line $x_1x_2$, or $\Gamma^{n-2} + (1-\Gamma)^{n-2}$, where $\Gamma$ and $1-\Gamma$ are the probability contents of the two halfplanes formed by the line, chosen so that $\Gamma \le 1/2$. Writing $g(\cdot)$ for the density function of $X_1$ and $X_2$, we see that

\[ P_{12} = \int_{\mathbb{R}^2}\int_{\mathbb{R}^2} \left(\Gamma^{n-2} + (1-\Gamma)^{n-2}\right) g(x_1)\,g(x_2)\,dx_1\,dx_2. \]

Figure 1.2: Bounding $EF_n$, the expected number of facets of the convex hull.

Now we make a change of variables and express the two points in the following terms: Let p be the projection of the origin onto the line $x_1x_2$, expressed in polar coordinates $(r, \theta)$ with $0 \le r < \infty$ and $0 \le \theta < 2\pi$, and let $t_1$ and $t_2$ be the signed distances from p to $x_1$ and from p to $x_2$. (See Figure 1.2.) The Jacobian of this transformation is $|t_1 - t_2|$, the length of the line segment $x_1x_2$. We now have

\[ P_{12} = \int_0^{2\pi}\!\int_0^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \left(\Gamma^{n-2} + (1-\Gamma)^{n-2}\right) |t_1 - t_2|\; g(t_1 \mid r,\theta)\, g(t_2 \mid r,\theta)\,dt_1\,dt_2\,dr\,d\theta, \]

where $g(t_1 \mid r,\theta)$ means the density of $t_1$ for a fixed r and $\theta$. There are actually no dependencies on $\theta$ because the distribution is spherically symmetric. The term $\Gamma^{n-2} = O(2^{-n})$ is insignificant, and $(1-\Gamma)^{n-2} = O(e^{-n\Gamma})$. We are left with

\[ P_{12} = O(1) \int_0^{\infty} \left[ \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} |t_1 - t_2|\; g(t_1 \mid r)\, g(t_2 \mid r)\,dt_1\,dt_2 \right] e^{-n\Gamma}\,dr. \]

The expression

\[ \frac{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |t_1 - t_2|\; g(t_1 \mid r)\, g(t_2 \mid r)\,dt_1\,dt_2}{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(t_1 \mid r)\, g(t_2 \mid r)\,dt_1\,dt_2} = \frac{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |t_1 - t_2|\; g(t_1 \mid r)\, g(t_2 \mid r)\,dt_1\,dt_2}{\left(\int_{-\infty}^{\infty} g(t_1 \mid r)\,dt_1\right)^2} \]

is just a conditional expectation: the expected length of $x_1x_2$ for a fixed r. So we now have

\[ P_{12} = O(1) \int_0^{\infty} E(|t_1 - t_2| \mid r) \left( \int_{-\infty}^{\infty} g(t_1 \mid r)\,dt_1 \right)^{2} e^{-n\Gamma}\,dr. \]

For the uniform distribution in the unit circle, $E(|t_1 - t_2| \mid r)$ and $\int_{-\infty}^{\infty} g(t_1 \mid r)\,dt_1$ are both proportional to the length of the chord formed by the line $x_1x_2$ in the unit circle; its length is $2\sqrt{1-r^2} = O(\sqrt{1-r})$. The probability content $\Gamma$ is proportional to the area cut off the circle by the line; this lies between $(1-r)\sqrt{1-r^2}$ and $2(1-r)\sqrt{1-r^2}$ and is $\Theta((1-r)^{3/2})$. Substituting $t = n(1-r)^{3/2}$ gives

\[ P_{12} = O(1) \int_0^{n} \frac{t}{n}\, e^{-t}\, n^{-2/3}\, t^{-1/3}\,dt = O(n^{-5/3}) \int_0^{n} t^{2/3} e^{-t}\,dt = O(n^{-5/3}), \]

since the integral in t is a Gamma integral. It follows that $EF_n = O(n^{1/3})$.

Now let us turn to upper bounds on $EV_n$, the expected number of vertices, for the same distribution. Fix $X_1$ at $x_1$, and draw the line through the origin and $x_1$ and another perpendicular line through $x_1$. (See Figure 1.3.) If $x_1$ is a vertex of the convex hull, at least one of the four regions defined by these lines must be free of points. This time let $\Gamma$ be the probability content of the smallest of the regions. The probability that this region is empty is $(1-\Gamma)^{n-1}$. The probability that at least one of the four regions is empty is at most $4(1-\Gamma)^{n-1}$. The probability that $X_1$ is a vertex is therefore

\[ P_1 \le \int_{\mathbb{R}^2} 4(1-\Gamma)^{n-1} g(x_1)\,dx_1. \]

Figure 1.3: Bounding $EV_n$, the expected number of vertices of the convex hull.

We estimate $4(1-\Gamma)^{n-1} = O(e^{-n\Gamma})$ and change to polar coordinates:

\[ P_1 = O(1) \int_0^{2\pi}\!\int_0^{\infty} e^{-n\Gamma} g(r,\theta)\, r\,dr\,d\theta = O(1) \int_0^{2\pi}\!\int_0^{1} e^{-n\Gamma}\, \frac{1}{\pi}\, r\,dr\,d\theta = O(1) \int_0^{1} e^{-n\Gamma}\, r\,dr. \]

As before, $\Gamma = \Theta((1-r)^{3/2})$. Substituting $t = n\Gamma$ gives

\[ P_1 = O(n^{-2/3}) \int_0^{\infty} e^{-t}\, t^{-1/3}\,dt = O(n^{-2/3}). \]

Since the probability is the same for each of the points, $EV_n = O(n^{1/3})$.

Finally, let us consider lower bounds on $EV_n$. If a particular fixed line passing through point $X_1$ defines an empty halfplane, then that point is a vertex with probability 1. Let $\Gamma$ be the probability content of the fixed halfplane. Then the probability that a given point is a vertex is bounded below by the probability that the corresponding halfplane is empty, which is $(1-\Gamma)^{n-1}$. Then

\[ P_1 \ge \int_{\mathbb{R}^2} (1-\Gamma)^{n-1} g(x_1)\,dx_1. \]

We estimate $(1-\Gamma)^{n-1} \ge \exp(-(n-1)\Gamma/(1-\Gamma))$ and change to polar coordinates:

\[ P_1 = \Omega(1) \int_0^{2\pi}\!\int_0^{\infty} \exp\!\left(\frac{-(n-1)\Gamma}{1-\Gamma}\right) g(r,\theta)\, r\,dr\,d\theta = \Omega(1) \int_0^{1} \exp\!\left(\frac{-(n-1)\Gamma}{1-\Gamma}\right) r\,dr. \]

We substitute $w = 1 - r$ and choose the line to be perpendicular to the line through $x_1$ and the origin. As before, $\Gamma = \Theta((1-r)^{3/2}) = \Theta(w^{3/2})$. Substituting $t = n\Gamma$ gives

\[
\begin{aligned}
P_1 &= \Omega(1) \int_0^{1} \exp\!\left(\frac{-(n-1)\Gamma}{1-\Gamma}\right) (1-w)\,dw \\
    &= \Omega(1) \int_0^{n^{-2/3}} \exp\!\left(\frac{-(n-1)\Gamma}{1-n^{-1}}\right) (1-n^{-2/3})\,dw \\
    &= \Omega(1) \int_0^{n^{-2/3}} \exp(-n\Gamma)\,dw \\
    &= \Omega(1) \int_0^{1} e^{-t}\, n^{-2/3}\, t^{-1/3}\,dt \\
    &= \Omega(n^{-2/3}) \int_0^{1} e^{-t}\, t^{-1/3}\,dt \\
    &= \Omega(n^{-2/3})\, e^{-1} \int_0^{1} t^{-1/3}\,dt = \Omega(n^{-2/3}).
\end{aligned}
\]

Since the probability is the same for each of the points, $EV_n = \Omega(n^{1/3})$.
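The $n^{1/3}$ growth derived above can be checked empirically. The following Python sketch is an illustration only and not part of the original analysis: it draws n i.u.d. points in the unit circle by rejection sampling, counts hull vertices with Andrew's monotone-chain method (a technique substituted here for brevity), and reports the ratio of the sample mean to $n^{1/3}$. The function names, trial counts, and seed are hypothetical choices.

```python
import random


def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_size(points):
    """Number of convex-hull vertices, by Andrew's monotone chain."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return len(pts)

    def half(seq):
        chain = []
        for p in seq:
            # pop while the last turn is not strictly counter-clockwise
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half(pts)
    upper = half(reversed(pts))
    return len(lower) + len(upper) - 2  # the two endpoints are shared


def uniform_in_disk(rng):
    """One point uniform in the unit disk, by rejection from the square."""
    while True:
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            return (x, y)


def mean_hull_size(n, trials, seed=0):
    """Sample mean of V_n over the given number of independent trials."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += hull_size([uniform_in_disk(rng) for _ in range(n)])
    return total / trials


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        v = mean_hull_size(n, trials=20)
        print(n, round(v, 1), round(v / n ** (1.0 / 3.0), 2))
```

If the bounds above hold, the last printed column should stay roughly constant as n grows, although small samples will show some drift.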

1.2 An Example of Voronoi Diagram Calculations

Now let us turn to the Voronoi diagram and its dual, called the Delaunay triangulation in two dimensions. The Voronoi diagram of a set of points (called sites) is a partition of $\mathbb{R}^d$ that assigns a surrounding polytope of "nearby" points to each of the sites. More rigorously, the Voronoi diagram of a set $X_n = \{x_1, x_2, \ldots, x_n\}$ of n sites in $\mathbb{R}^d$ is the set of n convex regions

\[ \mathcal{V}_i = \{\, x \mid \forall j:\ \mathrm{dist}(x, x_i) \le \mathrm{dist}(x, x_j) \,\}. \]

Figure 1.4: A Voronoi diagram and its dual.

In two dimensions, the number $S_n$ of triangles in the dual satisfies $S_n = 2n - 2 - V_n$. For i.u.d. points in the unit circle, we have just shown that $EV_n = O(n^{1/3})$, so it is immediate that $ES_n \sim 2n$; a separate probabilistic analysis of $ES_n$ is hardly necessary. Nonetheless it will be instructive to carry one out to lay a framework for higher dimensions.

The first three points $x_1, x_2, x_3$ define a triangle with probability one. Let us first reckon the probability that they also define a triangle in the dual of the Voronoi diagram. This is just the probability that the other n - 3 points lie outside the circle passing through the three points. Writing $g(\cdot)$ for the density function of the $x_i$ and $\Gamma$ for the probability content of the interior of the circle, we see that this probability is

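\[ P_n = \int_{\mathbb{R}^2}\!\int_{\mathbb{R}^2}\!\int_{\mathbb{R}^2} (1-\Gamma)^{n-3}\, g(x_1)\,g(x_2)\,g(x_3)\,dx_1\,dx_2\,dx_3, \]

and that the expected number of triangles is therefore

\[ ES_n = \binom{n}{3} P_n = \binom{n}{3} \int_{\mathbb{R}^2}\!\int_{\mathbb{R}^2}\!\int_{\mathbb{R}^2} (1-\Gamma)^{n-3}\, g(x_1)\,g(x_2)\,g(x_3)\,dx_1\,dx_2\,dx_3. \]

We next carry out a transformation of coordinates resembling that made above to evaluate $EF_n$. The three points $x_1, x_2, x_3$ can be expressed in terms of p, the center of the circle they define, r, the radius of the circle, and three angles $\phi_1$, $\phi_2$, and $\phi_3$ giving the inclination of the lines $px_i$ for i = 1, 2, 3. (Figure 1.5.) The Jacobian of this transformation is shown in §6.1 to be

\[ 2r\,\mathrm{area}(\triangle x_1x_2x_3) \]

and

\[ P_n = 2 \int_{\mathbb{R}^2}\!\int_0^{\infty} r^{-2} (1-\Gamma)^{n-3} \left[ \int_0^{2\pi}\!\int_0^{2\pi}\!\int_0^{2\pi} h(x_1)\,h(x_2)\,h(x_3)\,\mathrm{area}(\triangle x_1x_2x_3)\,d\phi_1\,d\phi_2\,d\phi_3 \right] dr\,dp \tag{1.1} \]

with $h(x_i) = r\,g(x_i)$.

Figure 1.5: Bounding $ES_n$, the expected number of simplices of the Voronoi dual.

If we define $\hat{g}(r,p) = \int_0^{2\pi} h(x_1)\,d\phi_1$, then, since $x_1$, $x_2$, and $x_3$ are i.i.d.,

\[ \int_0^{2\pi}\!\int_0^{2\pi}\!\int_0^{2\pi} h(x_1)\,h(x_2)\,h(x_3)\,d\phi_1\,d\phi_2\,d\phi_3 = (\hat{g}(r,p))^3, \]

and by an argument similar to that in §1.1, the bracketed quantity of (1.1) can be shown to be

\[ (\hat{g}(r,p))^3\, E\big(\mathrm{area}(\triangle x_1x_2x_3) \;\big|\; \|x_i - p\| = r \text{ for } i = 1,2,3\big) = (\hat{g}(r,p))^3\, e_{\mathrm{area}}(r,p), \]

and so

\[ P_n = 2 \int_{\mathbb{R}^2}\!\int_0^{\infty} r^{-2} (1-\Gamma)^{n-3}\, (\hat{g}(r,p))^3\, e_{\mathrm{area}}(r,p)\,dr\,dp. \]

It is clear that $\hat{g}$, $\Gamma$, and $e_{\mathrm{area}}$ depend only on r and $\|p\|$ if g is spherically symmetric. To exploit this symmetry, we express p in polar coordinates $(q, \theta)$. The Jacobian of this transformation is q, so

\[ P_n \sim 4\pi \int_0^{\infty}\!\int_0^{\infty} q\, r^{-2}\, \hat{g}^3\, e_{\mathrm{area}}(r,q)\, \exp(-n\Gamma)\,dr\,dq, \]

and, since $\binom{n}{3} \sim n^3/6$,

\[ ES_n \sim \frac{2\pi n^3}{3} \int_0^{\infty}\!\int_0^{\infty} q\, r^{-2}\, \hat{g}^3\, e_{\mathrm{area}}(r,q)\, \exp(-n\Gamma)\,dr\,dq. \]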

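For the uniform distribution in the unit circle, evaluation of this expression requires the domain of integration to be divided into eight cases according to the geometry of the intersection of the unit circle with the circle defined by q and r. For the present, we consider only the simplest case: $0 \le r \le 1 - q$, in which the circle lies entirely within the unit circle. In this case

$g = \pi^{-1}$ for points in the unit circle;

$\hat{g} = 2\pi r/\pi = 2r$;

$e_{\mathrm{area}} = 3r^2/2\pi$ (see (A.6));

$\Gamma = \pi r^2/\pi = r^2$,

so

\[
\begin{aligned}
ES_n &\sim \frac{2\pi n^3}{3} \int_0^{1}\!\int_0^{1-q} q\, r^{-2}\, \hat{g}^3\, e_{\mathrm{area}}(r,q)\, \exp(-n\Gamma)\,dr\,dq \\
     &= 8n^3 \int_0^{1}\!\int_0^{1-q} q\, r^{3} \exp(-nr^2)\,dr\,dq \\
     &\sim 4n^3 \int_0^{1}\!\int_0^{\infty} q \left(\frac{t}{n}\right)^{3/2} (nt)^{-1/2}\, e^{-t}\,dt\,dq = 2n;
\end{aligned}
\]

in fact, this case dominates the other seven and $ES_n \sim 2n$.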
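The two combinatorial facts used in this section, the empty-circle criterion for triangles of the dual and the relation between the number of triangles and the number of hull vertices, can be checked by brute force on small inputs. The Python sketch below is an illustration under our own assumptions (points in general position, floating-point predicates without exact arithmetic); it is deliberately quartic in n and is not one of the algorithms of this thesis. All function names are hypothetical.

```python
import itertools


def orient(a, b, c):
    """Twice the signed area of triangle (a, b, c); positive if counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circle(a, b, c, d):
    """True if d lies strictly inside the circle through a, b, c."""
    rows = []
    for p in (a, b, c):
        dx, dy = p[0] - d[0], p[1] - d[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (a0, a1, a2), (b0, b1, b2), (c0, c1, c2) = rows
    det = (a0 * (b1 * c2 - b2 * c1)
           - a1 * (b0 * c2 - b2 * c0)
           + a2 * (b0 * c1 - b1 * c0))
    # the sign of det flips with the orientation of (a, b, c)
    return det * orient(a, b, c) > 0


def delaunay_triangles(points):
    """All triples whose circumscribing circle contains no other point."""
    triangles = []
    for a, b, c in itertools.combinations(points, 3):
        if orient(a, b, c) == 0:
            continue  # collinear triple defines no circle
        others = (d for d in points if d not in (a, b, c))
        if not any(in_circle(a, b, c, d) for d in others):
            triangles.append((a, b, c))
    return triangles


def hull_edges(points):
    """All pairs with every other point strictly on one side of their line."""
    edges = []
    for a, b in itertools.combinations(points, 2):
        signs = [orient(a, b, p) for p in points if p != a and p != b]
        if all(s > 0 for s in signs) or all(s < 0 for s in signs):
            edges.append((a, b))
    return edges
```

For the four points (0, 0), (4, 0), (0, 4), (1, 1), the hull has three edges and the dual has 2·4 - 2 - 3 = 3 triangles; the large outer triangle is rejected because (1, 1) lies inside its circle.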

1.3 Thesis Summary

Chapter 2 restates required results from the combinatorial theory of polytopes, presents the three versions of the convex-hull problem, and reviews known algorithms for solving these three problems.

The size of the output of an instance of the convex-hull problem of size n in d dimensions can be as small as $O(1)$ and as large as $O(n^{\lfloor d/2 \rfloor})$. An algorithm whose running time depends only on input size is necessarily inefficient for problem instances with small outputs. The running times of the two most practical algorithms, gift-wrapping [90] and shelling [81], are parametrized in terms of output size as well.

Chapter 3 argues for the need for estimates of the average output size for the convex hull problem. After necessary results from elementary probability theory are summarized, a fast-on-average vertex-enumeration algorithm is presented that applies the randomized divide-and-conquer technique of Bentley & Shamos [5]. Its expected running time is linear in n for distributions for which $EV_n$, the expected number of vertices, is $o(\sqrt{n})$, and always subquadratic. This algorithm is a useful preprocessing step for algorithms for the other two versions of the convex-hull problem.

The remainder of Chapter 3 is devoted to the exposition of general methods for bounding $EV_n$ and $EF_n$, the expected number of facets, for a given input distribution in any number of dimensions. The expected number of facets can be determined exactly using a generalization of the method of §1.1, a method presented by Rényi & Sulanke [75,76] for two dimensions and by Efron [39] for three. A similar method has been applied by Raynaud [74] and described by Miles [70] and Santaló [79]; however, Raynaud overlooks certain simplifications, and Miles' and Santaló's expositions resort to concepts such as Pfaffians, exterior product calculus, and Stiefel and Grassmann manifolds. Our presentation attempts to be elementary and complete and to avoid unnecessary generality. We also present new methods for deriving bounds on the expected number of vertices. The upper and lower bounds derived by these methods generally differ by a factor that is exponential in d but independent of n.

The results of Chapters 4 and 5 are summarized in Table 1.1 along with older results on convex-hull expectations. Chapter 4 investigates $EV_n$ and $EF_n$ for i.u.d. points in a polytope. We show that Devroye's $O(\log^{d-1} n)$ bounds on $EV_n$ for the d-cube are tight and can be extended to any d-polytope. We derive a similar bound on $EF_n$ for simple polytopes, the first asymptotically tight bound on $EF_n$ for a distribution that is not spherically symmetric. Both gift-wrapping and shelling algorithms require only linear time on average if the vertex-enumeration algorithm of §3.2 is used for preprocessing.

Chapter 5 is devoted to extending Carnal's very general results for circularly symmetric distributions in the plane [18] to higher dimensions. In §5.2 and §5.3 we show that $EV_n$ and $EF_n$ are both subpolynomial for two types of distribution with infinite support. Both the gift-wrapping and the shelling algorithms can construct the convex hull in $O(n)$ time on average if the preprocessing step of §3.2 is used. In §5.4 we show that $EF_n = O(EV_n) = o(n)$ for a large class of spherical distributions with bounded support. Either algorithm can construct the convex hull in $o(n^2)$ time on average; the exact order of the running time depends on the parameters of the particular distribution. As an additional demonstration of the power of our method for bounding $EV_n$, we consider i.u.d. points in the Cartesian product of balls of various dimensions in §5.5. It is apparently difficult to apply direct methods to bound $EF_n$; however, we can still show that $EV_n$ differs from $EV_n$ for i.u.d. points from the component ball of largest dimension by at most a logarithmic factor.

Table 1.1: Summary of old and new convex-hull results. (Entries shown as … are not legible in this copy.)

distribution | § | $EV_n$ | $EF_n$ | running time
uniform in: | | | |
hypercube | [24] | $O(\log^{d-1} n)$ | $O((\log n)^{(d-1)\lfloor d/2 \rfloor})$ | $O(n)$
product of balls | §5.5 | $O(n^{(d_1-1)/(d_1+1)} \log^{m-1} n)$ | … | …
ball | [74] | $O(n^{(d-1)/(d+1)})$ | $O(n^{(d-1)/(d+1)})$ | …
spherically symmetric: | §5 | | |
algebraic tail | §5.2 | $O(1)$ | $O(1)$ | $O(n)$
slowly varying tail | §5.3 | … | … | $O(n)$
exponential tail | §5.3 | $O((\log n)^{(d-1)/2})$ | $O((\log n)^{(d-1)/2})$ | $O(n)$
normal | [74] | $O((\log n)^{(d-1)/2})$ | $O((\log n)^{(d-1)/2})$ | …
algebraic in ball | §5.4 | … | … | …
uniform | [74] | $O(n^{(d-1)/(d+1)})$ | $O(n^{(d-1)/(d+1)})$ | …

(Refer to the indicated sections of the text for descriptions of the parameters $d_1$, $m$, $\epsilon$, and $k$. Briefly, $d_1 \le d$, $m \le \lfloor d/d_1 \rfloor$, $k > 0$, and $\epsilon(n) = o(n^{\delta})$ for any $\delta > 0$.)

Chapters 6 and 7 describe fast-on-average algorithms for Voronoi diagrams. Chapter 6 generalizes to higher dimensions the method of §1.2 for calculating exact asymptotic bounds on the expected complexity of the Voronoi diagram of n i.i.d. points. A new Voronoi-diagram algorithm for n i.u.d. points from the unit d-ball is described. Then the new technique is applied to show that the algorithm requires only $O(n)$ time on average. This algorithm is faster than any previously known. A d-dimensional Voronoi diagram can also be constructed by exploiting a well-known transformation to a (d + 1)-dimensional convex-hull problem. Our analysis also shows that both gift-wrapping and shelling construct the Voronoi diagram in $O(n^2)$ time on average for this distribution.
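The transformation from a d-dimensional Voronoi diagram to a (d + 1)-dimensional convex hull mentioned above is commonly realized by lifting each point onto a paraboloid; whether this is the exact form used later in the thesis is an assumption of this illustration. In the plane, a point $(x, y)$ is lifted to $(x, y, x^2 + y^2)$, and a fourth point lies inside the circle through three others exactly when its lifted image lies below the plane through their lifted images. The Python sketch below compares that test against a direct circumcircle computation; all names are hypothetical and floating-point arithmetic is used without exact predicates.

```python
def lift(p):
    """Lift a planar point onto the paraboloid z = x^2 + y^2."""
    return (p[0], p[1], p[0] * p[0] + p[1] * p[1])


def below_lifted_plane(a, b, c, d):
    """True if lift(d) is strictly below the plane through lift(a), lift(b), lift(c)."""
    A, B, C, D = lift(a), lift(b), lift(c), lift(d)
    u = [B[i] - A[i] for i in range(3)]
    v = [C[i] - A[i] for i in range(3)]
    normal = [u[1] * v[2] - u[2] * v[1],
              u[2] * v[0] - u[0] * v[2],
              u[0] * v[1] - u[1] * v[0]]
    if normal[2] == 0:
        raise ValueError("a, b, c are collinear")
    if normal[2] < 0:
        normal = [-x for x in normal]  # make the normal point upward
    w = [D[i] - A[i] for i in range(3)]
    return sum(normal[i] * w[i] for i in range(3)) < 0


def inside_circumcircle(a, b, c, d):
    """True if d is strictly inside the circle through a, b, c (direct computation)."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    den = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if den == 0:
        raise ValueError("a, b, c are collinear")
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / den
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / den
    radius2 = (ax - ux) ** 2 + (ay - uy) ** 2
    return (d[0] - ux) ** 2 + (d[1] - uy) ** 2 < radius2
```

The equivalence follows because the circle $(x-p)^2 + (y-q)^2 = r^2$ lifts into the plane $z = 2px + 2qy + (r^2 - p^2 - q^2)$, and points inside the circle have $x^2 + y^2$ smaller than the right-hand side.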

Chapter 7 presents an easily implemented modification to the divide-and-conquer algorithm for computing the Voronoi diagram and Delaunay triangulation of n points in the plane. The change reduces its $O(n \log n)$ expected running time to $O(n \log\log n)$ for a large class of distributions that includes the uniform distribution in the unit square. While linear-expected-time algorithms exist, they either sacrifice $\Theta(n \log n)$ worst-case performance or simplicity of implementation. Experimental evidence presented demonstrates that the modified algorithm performs very well for $n \le 2^{16}$, the range of the experiments. We conjecture that the average number of edges it creates (a good measure of its efficiency) is no more than twice optimal for n less than seven trillion. The improvement extends to the computation of the Voronoi diagram in the $L_p$ metric for $1 < p \le \infty$.

Finally, in Chapter 8 we offer several conjectures based on intuitions gained from this research. The most significant among these are that $EF_n = O(EV_n) = o(n)$ and $ES_n = O(n)$ for any distribution that has a density with respect to Lebesgue measure in $\mathbb{R}^d$.

Appendix A describes notational conventions and required identities involving well-known functions.

Chapter 2

The Convex Hull Problem

The convex-hull problem is central to computational geometry and important in many applications [73]. Its three-dimensional version is important in computer graphics, computer-aided design, and geometric modeling. Higher-dimensional problems occur in operations research and in statistical applications such as robust estimation (outlier elimination), isotonic regression, and clustering. This chapter presents a review of necessary elements of the combinatorial theory of polytopes based on Brøndsted's Introduction [12]. Then the three versions of the convex-hull problem (vertex enumeration, facet enumeration, and facial-lattice construction) are presented along with existing algorithms for their solution.

2.1 Elements of the Combinatorial Theory of Polytopes

Let $\mathbb{R}$ be the set of real numbers, and let $X_n = \{x_1, x_2, \ldots, x_n\}$ be a set of points (vectors) in $\mathbb{R}^d$.

Definition 2.1 A point $y \in \mathbb{R}^d$ is a convex combination of the points in $X_n$ if and only if there exist non-negative real numbers $\lambda_1, \lambda_2, \ldots, \lambda_n$ summing to 1 such that

$$y = \sum_{1 \le i \le n} \lambda_i x_i.$$

Definition 2.2 The set of all convex combinations of points in $X_n$ is the convex hull of $X_n$, denoted by conv $X_n$.

It is well known that the following characterization of convex hulls is equivalent.

Proposition 2.3 The convex hull of $X_n$ is the intersection of all closed halfspaces that contain $X_n$.

The following terms will also be useful.

Definition 2.4 A point $y \in \mathbb{R}^d$ is an affine combination of the points in $X_n$ if and only if there exist (possibly negative) numbers $\lambda_1, \lambda_2, \ldots, \lambda_n$ summing to 1 such that

$$y = \sum_{1 \le i \le n} \lambda_i x_i.$$

Definition 2.5 The set of all affine combinations of points in $X_n$ is the affine hull of $X_n$, denoted by aff $X_n$.

For example, the convex hull of two points $x_1$ and $x_2$ is the relatively closed line segment $x_1 x_2$; their affine hull is the entire line that passes through them. The convex hull of three non-collinear points $x_1$, $x_2$, $x_3$ is the interior and boundary of the triangle $\triangle x_1 x_2 x_3$; their affine hull is the plane containing them. In two dimensions, one may imagine the points of $X_n$ to be nails in a board. If a rubber band is stretched out to surround all the nails and then released, it will come to rest on the boundary of the convex hull. In the sequel we will often abuse terminology by using "convex hull" and "conv" to refer to the boundary alone.
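As a small illustration of Definitions 2.1 and 2.4, the following Python sketch solves for the coefficients that express a planar point $y$ as an affine combination of three non-collinear points and reports whether that combination is also convex. The function names are hypothetical and exact rational arithmetic is an assumption made here to sidestep rounding; neither comes from the thesis.

```python
from fractions import Fraction


def affine_coefficients(y, x1, x2, x3):
    """Return (l1, l2, l3) with l1 + l2 + l3 = 1 and y = l1*x1 + l2*x2 + l3*x3.

    Assumes x1, x2, x3 are non-collinear points of the plane.
    """
    def area2(a, b, c):
        # Twice the signed area of the triangle abc.
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

    total = Fraction(area2(x1, x2, x3))
    if total == 0:
        raise ValueError("x1, x2, x3 are collinear")
    # Each coefficient is a ratio of signed areas (barycentric coordinates).
    return (area2(y, x2, x3) / total,
            area2(x1, y, x3) / total,
            area2(x1, x2, y) / total)


def in_convex_hull(y, x1, x2, x3):
    """True if y is a convex combination of x1, x2, x3 (all coefficients >= 0)."""
    return all(l >= 0 for l in affine_coefficients(y, x1, x2, x3))
```

Every point of the plane has such coefficients, since the affine hull of three non-collinear points is the whole plane; only points of the closed triangle have coefficients that are all non-negative.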

Although one may properly speak of the convex hull of any subset of $\mathbb{R}^d$, we will concern ourselves only with convex hulls of finite sets. The convex hull of such a set is a polytope; conversely, every polytope is the convex hull of a finite set.

Definition 2.6 A subset $\mathcal{F}$ of a polytope $P$ is a face if the relatively open line segment joining any pair of points in $P$ lies either all in $\mathcal{F}$ or all in $P - \mathcal{F}$.

The empty set is the only $(-1)$-dimensional face; the polytope itself is a $d$-dimensional face. Aside from these two "improper" faces, all faces lie on the boundary of $P$. Faces of dimension $0$, $1$, $d-3$, $d-2$, and $d-1$ are called vertices, edges, peaks, ridges (or subfacets), and facets respectively. We will write vert $X_n$ for the set of vertices of the convex hull of $X_n$. The facets of a two-dimensional polytope (polygon) are the closed line segments that form its boundary; the facets of a three-dimensional polytope are closed polygonal regions. The faces of a polytope under the subset relation form a complete lattice known as the facial lattice. Polytopes with isomorphic facial lattices are said to be (combinatorially) equivalent.

A transformation $T : \mathbb{R}^d \to \mathbb{R}^d$ is said to be an affine transformation if it can be expressed as a nonsingular linear transformation followed by a translation. Affine transformations preserve the combinatorial properties of polytopes. As a consequence, we have the following useful proposition.

Proposition 2.7 Let $T$ be an affine transformation on $\mathbb{R}^d$, let $X_n = \{x_1, x_2, \ldots, x_n\}$, and let $X_n' = \{T(x_1), T(x_2), \ldots, T(x_n)\}$. Then conv $X_n$ and conv $X_n'$ are equivalent.

The simplest sort of polytope is the simplex. A $d$-simplex is the convex hull of $d + 1$ points not all lying on one hyperplane. All $d$-simplices are equivalent, and each has $\binom{d+1}{k+1}$ $k$-faces that are themselves $k$-simplices. A triangle is a 2-simplex; a tetrahedron is a 3-simplex.

If every facet of a $d$-polytope $P$ is a $(d-1)$-simplex, $P$ is said to be simplicial. Every 2-polytope is simplicial; a 3-polytope is simplicial if every facet is a triangle. Simplicial polytopes are of especial interest in the average-case analysis of algorithms. If the $n$ points of $X_n$ are chosen independently from any distribution with a density function with respect to Lebesgue measure in $\mathbb{R}^d$, then conv $X_n$ is simplicial with probability one, since the probability that any $d + 1$ points fall on the same hyperplane is zero.

Every $d$-polytope $P$ has a (combinatorial) dual. The facial lattice of the dual polytope is isomorphic to the lattice of the faces of $P$ under the superset relation; there is a one-to-one correspondence between $k$-faces of $P$ and $(d-k-1)$-faces of its dual. In fact, it can be shown that the so-called polar set of $P$,

$$\{\, y \in \mathbb{R}^d \mid \forall x \in P : \langle x, y \rangle \le 1 \,\},$$

is a dual of $P$ if $P$ contains the origin. The dual of a simplicial polytope is a simple polytope.

While every facet of a simplicial polytope contains $d$ vertices, $\binom{d}{2}$ edges, etc., every vertex of a simple polytope is contained in $d$ facets, $\binom{d}{2}$ subfacets, etc. Every 2-polytope is both simple and simplicial, but for $d > 2$, the $d$-simplex is the only $d$-polytope that is both simple and simplicial.

Let $f_k(P)$ be the number of $k$-faces of a $d$-polytope $P$. Then the $d$-tuple

$$f(P) = (f_0(P), f_1(P), \ldots, f_{d-1}(P))$$

is the f-vector of $P$. The f-vector is a useful but incomplete description of the combinatorial structure of $P$; incomplete, since non-equivalent polytopes may have the same f-vector. The main success of the combinatorial theory of polytopes has been a partial characterization of the set of $d$-tuples that are also f-vectors of some $d$-polytope. Necessary conditions include the Euler-Poincaré Relation

$$\sum_{-1 \le i \le d} (-1)^i f_i = 0, \tag{2.1}$$

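the Upper-Bound Theorem for simplicial polytopes

$$f_j = \begin{cases} O\!\left(f_0^{\,j+1}/(j+1)!\right) & \text{if } j \le \lfloor d/2 \rfloor - 1, \\ O\!\left(\lfloor d/2 \rfloor!\, f_0^{\lfloor d/2 \rfloor}\right) & \text{otherwise,} \end{cases} \tag{2.3}$$

and the Lower-Bound Theorem for simplicial polytopes

$$f_j = \Omega(f_0) \quad \text{for } 0 < j < d-1; \qquad f_{d-1} \ge (f_0 - d)(d-1) + 2. \tag{2.4}$$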

(Asymptotics given are as $f_0 \to \infty$; Brøndsted gives precise upper and lower bounds.) The upper bounds of (2.3) are achieved by the cyclic polytopes. Such polytopes can be generated by choosing $n$ distinct points on the moment curve, which is parametrized by the equation

$$x(t) = (t, t^2, t^3, \ldots, t^d).$$
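The facets of a cyclic polytope can be enumerated combinatorially, which gives a quick check on the growth rate in (2.3). The sketch below relies on Gale's evenness condition, a standard fact about cyclic polytopes that is not stated in this chapter: with the vertices numbered in order along the moment curve, a $d$-subset spans a facet exactly when any two vertices outside the subset are separated by an even number of vertices inside it. The function name is hypothetical.

```python
from itertools import combinations


def cyclic_facets(n, d):
    """Index sets of the facets of the cyclic d-polytope with n vertices.

    Uses Gale's evenness condition (an outside fact, not derived here).
    Vertices are numbered 0..n-1 in order along the moment curve.
    """
    facets = []
    for subset in combinations(range(n), d):
        chosen = set(subset)
        outside = [i for i in range(n) if i not in chosen]
        # Between consecutive outside indices a < b every index is chosen,
        # so the number of chosen indices separating them is b - a - 1.
        # Checking consecutive pairs suffices: sums of even numbers are even.
        if all((b - a - 1) % 2 == 0 for a, b in zip(outside, outside[1:])):
            facets.append(subset)
    return facets
```

For $d = 4$ the number of facets is $n(n-3)/2$, which is quadratic in $n$ and so agrees with the $O(f_0^{\lfloor d/2 \rfloor})$ behavior of (2.3).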

The lower bounds of (2.4) are attained by the stacked polytopes, which can be obtained by starting with a $d$-simplex, then repeatedly replacing each facet with ever-shorter outward-oriented $d$-simplices in a fractal-like fashion. For simplicial polytopes, the bounds

$$f_j \le \binom{d}{j+1} f_{d-1} \quad \text{for } 0 \le j \le d-1 \tag{2.5}$$

and

$$f_0 \le f_{d-1} \tag{2.6}$$

also hold. Since every subfacet is contained by exactly two facets, and every facet contains exactly $d$ subfacets,

$$f_{d-2} = \frac{d}{2} f_{d-1}.$$

2.2 The Convex-Hull Problem

There are actually three related convex-hull problems:

• The vertex-enumeration problem: Given $X_n$, what are the vertices of conv $X_n$?

• The facet-enumeration problem: Given $X_n$, which $d$-subsets of $X_n$ define facets of conv $X_n$?

• The facial-lattice problem: Given $X_n$, which subsets of $X_n$ define faces of conv $X_n$, and what is the lattice of the containment relation among them?

Clearly, solving any of these problems solves the ones before it in the list. We will concentrate on the facet-enumeration problem. We consider vertex enumeration mainly as a preprocessing step for facet enumeration. Since the facial lattice of a $(d-1)$-simplex is of constant size for fixed $d$, the complexity of the facial-lattice problem is essentially the same as that of the facet-enumeration problem if we assume, as we always will, that the input points are in general position and the convex hull is simplicial.

2.3 Algorithms for the Convex-Hull Problems

Literally scores of papers have been written on algorithms for convex hulls in two dimensions. Optimal $O(n \log n)$ worst-case running time is achieved by many algorithms; Kirkpatrick & Seidel's output-sensitive algorithm requires $O(n \log f_0)$ time. Preparata & Shamos [73] survey the most important results. In three dimensions, $O(n \log n)$ performance is achieved by the divide-and-conquer algorithm of Preparata & Hong [72]. As a rule, we will restrict our attention to four and more dimensions, the lower-dimensional cases being essentially solved. In this section we describe the algorithms applicable in higher dimensions.

2.3.1 Vertex Enumeration by Linear Programming

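The vertex-enumeration problem can be solved by determining for each point $x_i$ in turn whether it is a vertex. This is done by searching for a halfspace containing all of $X_n$ that has $x_i$ on its boundary. If $x_i$ is a vertex, the system of linear inequalities

$$\langle a, x_i \rangle = b, \qquad \langle a, x_j \rangle \le b \quad \text{for } 1 \le j \le n$$

has a solution $(a, b)$ with $a \ne 0$.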

The corresponding linear program has $d + 1$ variables and $n$ constraints; thus, for fixed $d$, it can be solved in $O(n)$ time using Megiddo's [67], Clarkson's [20], or Dyer's [33] deterministic algorithm, or Dyer & Frieze's [34] probabilistic algorithm. Overall, $O(n^2)$ time is required, since $n$ such programs must be solved.

The dimension-dependent factor of the running time of each of the linear-programming algorithms cited grows quickly with $d$: it is doubly exponential in $d$ for Megiddo's algorithm, exponential in $d^2$ for Clarkson's and Dyer's, and about $d^{3d}$ for Dyer & Frieze's. Thus it is faster to solve two linear programs of only $d$ variables,

$$\langle a, x_i \rangle = 1, \qquad \langle a, x_j \rangle \le 1 \quad \text{for } 1 \le j \le n$$

and

$$\langle a, x_i \rangle = -1, \qquad \langle a, x_j \rangle \le -1 \quad \text{for } 1 \le j \le n.$$

2.3.2 A Naive Algorithm for Facet Enumeration

A naive algorithm for the facet-enumeration problem simply tests each of the $\binom{n}{d} = O(n^d)$ $d$-subsets of $X_n$ to see if it forms a facet. This can be tested by computing the signed volume of each of the $n - d$ simplices formed by the points of the candidate facet and another point of $X_n$. If the signed volumes are all positive or all negative, the subset does indeed form a facet. Since each test takes $O(n)$ time, $O(n^{d+1})$ time is required overall in the worst case and on average.
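The naive test is short enough to write out in full. The following Python sketch is one possible rendering, assuming points in general position given as tuples of integers or rationals; the helper names and the use of exact Gaussian elimination for the determinant are choices made here for illustration, not details taken from the thesis.

```python
from fractions import Fraction
from itertools import combinations


def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result            # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def signed_volume(simplex):
    """d! times the signed volume of a simplex given by d + 1 points in R^d."""
    base = simplex[0]
    return det([[a - b for a, b in zip(p, base)] for p in simplex[1:]])


def naive_facets(points):
    """All d-subsets that span facets of the hull; assumes general position."""
    d = len(points[0])
    facets = []
    for subset in combinations(points, d):
        # Record on which side of the candidate hyperplane each other point lies.
        sides = {signed_volume(list(subset) + [p]) > 0
                 for p in points if p not in subset}
        if len(sides) == 1:             # every other point on the same side
            facets.append(subset)
    return facets
```

For a fixed dimension each determinant costs constant time, so the loop structure mirrors the $O(n^{d+1})$ count above: $O(n^d)$ candidate subsets, each tested against the remaining points.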

2.3.3 The Gift-wrapping Algorithm

The gift-wrapping algorithm of Chand & Kapur [19], analyzed by Bhattacharya [8] and recently reconsidered by Swart [90], requires $O(n f_{d-1} + f_{d-2} \log f_{d-2})$ time. According to the Upper-Bound Theorem, $f_{d-1}$ and $f_{d-2}$ are $O(n^{\lfloor d/2 \rfloor})$ in the worst case. At each step, this algorithm generates a facet adjacent to a ridge of some known facet by finding the point that, when adjoined to the ridge, forms the largest angle with the known facet. Each such search takes $O(n)$ time, accounting for the $O(n f_{d-1})$ term in the running time. In addition, a dictionary data structure allowing quick access to known ridges must be maintained. When a new facet is created, its ridges are searched for in this data structure. If a ridge is found, it is deleted, since both of its adjacent facets are known; if not found, it is inserted so that the other adjacent facet will be searched for later. In the ridge dictionary $2f_{d-2}$ searches, $f_{d-2}$ insertions, and $f_{d-2}$ deletions are made. Each requires only $O(\log f_{d-2})$ time if any of several efficient data structures is used, e.g., Sleator & Tarjan's [92] self-adjusting trees. Then $O(f_{d-2} \log f_{d-2})$ time is required in toto.

The algorithm and especially its analysis are considerably complicated if the $n$ points are not in general position. In this situation, the largest-angle search may yield a tie, and a recursive invocation of the algorithm in a lower dimension is required to determine the structure of the new facet.
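In the plane the ridges are single vertices, so no ridge dictionary is needed and gift wrapping reduces to repeated linear scans (often called the Jarvis march). The Python sketch below shows only this planar special case, under the assumption of at least three distinct points with no three collinear; it is an illustration of the $O(n)$ search per facet, not the higher-dimensional algorithm of Chand & Kapur.

```python
def gift_wrap_2d(points):
    """Hull vertices in counterclockwise order by planar gift wrapping.

    Assumes at least three distinct points, no three of them collinear.
    Each hull edge costs one O(n) scan, for O(n * f) time with f hull edges.
    """
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    start = min(points)          # the lexicographically smallest point is a vertex
    hull = [start]
    current = start
    while True:
        # Tentative next vertex: any point other than the current one.
        candidate = next(p for p in points if p != current)
        for p in points:
            # Replace the candidate whenever p lies to the right of the
            # tentative edge current -> candidate.
            if p != current and cross(current, candidate, p) < 0:
                candidate = p
        if candidate == start:   # wrapped all the way around
            break
        hull.append(candidate)
        current = candidate
    return hull
```

The scan is valid because, seen from a hull vertex, all other points lie within an angle smaller than a half-turn, so "lies to the right of" is a consistent ordering and one pass finds its extreme element.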

A version of the gift-wrapping algorithm for the facial-lattice problem maintains a separate dictionary of faces for each dimension $k$ for $0 \le k \le d - 1$. It is not difficult to show that this algorithm constructs the facial lattice in $O(nl + l \log l)$ time, where $l$ is the number of arcs in the facial lattice.

2.3.4 The Beneath-Beyond Algorithm

The beneath-beyond algorithms of Kallay [54] and Seidel [82] construct the convex hull by adding new points one at a time. A new point lies beyond a facet of the existing polytope if the affine hull of the facet separates the polytope from the new point; otherwise, it lies beneath the facet. If the new point lies beneath every existing facet, then it lies inside the polytope and no updating is required. If the new point lies beyond some facets, these facets must be deleted, and new facets that contain the new point must be created.

The beneath-beyond algorithm maintains a representation of the entire facial lattice of the current polytope. The facial lattice is updated for each new point as follows. First, a "candidate facial lattice" is produced by connecting the existing lattice to a copy of itself in which the new point has been added to every face. Next, every candidate facet in the lattice is classified as "beneath" or "beyond", depending on whether the new point lies beneath or beyond it. Then each candidate vertex is classified as "concave", "reflex", or "supporting", depending on whether the facets to which it belongs are all beneath, all beyond, or some beneath and some beyond. Using this information, some vertices and facets can be deleted, and then deletions can be propagated through the candidate facial lattice according to rather simple rules.

The update step requires time proportional to the size of the current facial lattice. Since the intermediate facial lattices may be much larger than that of the final polytope, the strongest statement that can be made about the running time in terms of input and output size is that it is $O(n^{1+\lfloor d/2 \rfloor})$. Seidel's version operates in the dual space and requires only $O(n^{\lfloor (d+1)/2 \rfloor})$ time.

2.3.5 The Shelling Algorithm

Seidel's new shelling algorithm [81] solves the facet-enumeration problem in $O(n^2 + f_{d-1} \log n)$ time and the facial-lattice problem in $O(n^2 + l \log n)$ time. To understand its operation, we must first know what a shelling is.

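Definition 2.8 A shelling of a polytope is an enumeration of its facets $\mathcal{F}_1, \mathcal{F}_2, \ldots, \mathcal{F}_k$ such that for $1 \le i < k$, the union of the first $i$ facets

$$\bigcup_{1 \le j \le i} \mathcal{F}_j$$

is homeomorphic to the $(d-1)$-ball.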

In other words, when enumerating the facets of a polytope in shelling order, a new facet may be added to the enumeration only if it is connected to an already enumerated facet and if its introduction will not close a gap in a way that introduces a "hole" in the surface formed by its predecessors in the enumeration.

Definition 2.9 The $i$th horizon of a shelling is

$$\bigcup_{j \le i < k} (\mathcal{F}_j \cap \mathcal{F}_k).$$

The $i$th horizon is the union of all ridges connecting an enumerated and an unenumerated facet. The horizon may be thought of as the "growing edge" of the shelling; any homeomorphism mapping the union of the already enumerated facets to the unit $(d-1)$-ball maps the horizon to the boundary of the ball. The horizon is isomorphic to a $(d-1)$-polytope; this implies that every horizon peak is contained in exactly two horizon ridges.

The shelling algorithm actually produces a restricted type of shelling that Seidel calls a straight-line shelling. The journey of his "traveling observer" provides useful intuition about the properties of these shellings. The observer's journey begins at some interior point of the polytope and proceeds outward along a line. For convenience, let us assume that the polytope contains the origin, and that the journey begins at the origin and proceeds along the $x^{(1)}$-axis; the observer looks backwards toward the origin. As she passes to the exterior through a facet of the polytope, this facet comes into view, and she lists it as $\mathcal{F}_1$ in the enumeration. As she proceeds further down the axis, more facets come into view, and she lists them in the order they appear. (A facet is "in view" as soon as every open line segment joining the observer to a point on the facet lies completely in the exterior of the polytope.) Eventually, no more facets come into view. By depressing the hyperdrive button, our observer moves instantly through $+\infty$ to $-\infty$ on the $x^{(1)}$-axis, whence all the previously unseen facets are visible. Moving toward the origin, she sees facets disappear as they are obscured by the polytope. She adds these to the list in the order that they disappear. The last facet of the enumeration disappears as the observer passes into the interior of the polytope again. We may parametrize the voyage by letting the time $t$ run from $-\infty$ to $+\infty$ and defining position by

$$x(t) = (-1/t, 0, 0, \ldots, 0).$$

If the facets of the convex hull were given, it would be an easy matter to determine the straight-line-shelling order by finding the intersection of their affine hulls with the $x^{(1)}$-axis. Of course, determining the facets of the convex hull is exactly our task. The algorithm we use to do this works in two stages. The first stage expends $O(n^2)$ time to produce a subset of the facets. An important property of this subset is that it includes $\mathcal{F}_1$. The second stage maintains a representation of the horizon of the shelling as it develops. The next facet of the shelling is always chosen from among those produced by the first stage and some other candidates that can be deduced from the current horizon.
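In the plane the facets are edges, and the remark about intercepts can be made concrete. The Python sketch below orders the edges of a given convex polygon by the time $t$ at which the traveling observer at $x(t) = (-1/t, 0)$ crosses the line through each edge. It assumes that the origin lies strictly inside the polygon, that no edge is parallel to the first axis, and that no two edges share an intercept; the function name and the edge-numbering convention are illustrative only.

```python
def shelling_order(polygon):
    """Straight-line shelling order of the edges of a convex polygon.

    `polygon` lists the vertices in order around the boundary; edge i joins
    polygon[i] to polygon[i + 1] (indices taken cyclically).  Assumes the
    origin is strictly inside, no edge is parallel to the x-axis, and all
    x-intercepts are distinct.
    """
    def visibility_time(p, q):
        s = -p[1] / (q[1] - p[1])            # parameter where the line meets y = 0
        intercept = p[0] + s * (q[0] - p[0])
        return -1.0 / intercept              # invert x(t) = -1/t

    n = len(polygon)
    times = [visibility_time(polygon[i], polygon[(i + 1) % n]) for i in range(n)]
    return sorted(range(n), key=times.__getitem__)
```

Edges with positive intercepts come first in increasing order of intercept (negative $t$); edges with negative intercepts follow, starting with the one farthest from the origin (positive $t$), which matches the observer's return from $-\infty$.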

The first stage of the algorithm determines for each of the $n$ points the time at which it will become visible, and some facet of the polytope that becomes visible at the same time. This is done by solving a variation of the linear program (2.7) for each of the $n$ points. The $n$ linear programs each take $O(n)$ time for a total of $O(n^2)$ time. Of course, some points never become visible; their linear programs are infeasible, and they can be eliminated from further consideration. The second stage actually carries out the enumeration of the facets. To do so, it maintains the following data structures:

• A priority queue of candidate facets. The key on which the entries are ordered is the time at which the candidate facet becomes visible to the observer, i.e., $-1$ divided by the $x^{(1)}$-intercept of the affine hull of the candidate facet. Initially, the priority queue contains all of the candidate facets generated by the first stage.

• A "horizon graph". This is a representation of the subset relation that holds among the peaks and ridges in the current horizon. Since every horizon peak is contained in exactly two horizon ridges, the edges of the horizon graph represent peaks and the nodes represent ridges.

• A dictionary of ridges.

    (Stage One)
    For each of the n points do
        Determine its earliest time of visibility and the corresponding facet.
        Insert the facet into the priority queue.
    (Stage Two)
    Remove the candidate facet with earliest visibility time from the priority queue.
    Report it as the first facet in the shelling.
    Insert its ridges into the ridge dictionary.
    Construct the initial horizon graph to represent its ridges and peaks.
    While the priority queue is not empty do
        Remove the candidate facet with earliest visibility time from the priority queue.
        Report it as the next facet in the shelling.
        If the new facet introduces a new vertex then
            (One ridge of the new facet is known already; it disappears from the horizon.)
            Use the ridge dictionary to locate the disappearing horizon ridge in the horizon graph.
            Remove the disappearing ridge from the horizon graph and ridge dictionary.
            Remove from the horizon graph disappearing peaks adjacent to the disappearing ridge.
            Remove from the priority queue candidate facets corresponding to disappearing peaks.
            Insert the new ridges and new peaks into the horizon graph.
            Insert the new ridges into the ridge dictionary.
            Insert into the priority queue a new candidate facet for each new peak.
        If the new facet corresponds to a horizon peak then
            Update the horizon graph.
            Update the ridge dictionary.
            Update the priority queue.

Figure 2.1: Pseudo-code for the shelling algorithm.

Suppose that the first i facets of the shelling are known. The next facet either introduces a new vertex, or it is formed solely by vertices on the horizon. In the first case, the facet will have been inserted into the priority queue by the first stage of the algorithm. In the second case, it will be formed by the union of the vertices of two horizon ridges that share a common peak. Thus each horizon peak determines a candidate facet. If a candidate facet is inserted into or deleted from the priority queue whenever a horizon peak appears or disappears, it will always be possible to obtain the next facet by removing from the priority queue the candidate with the earliest time of visibility. The algorithm can be summarized by the pseudo-code of Figure 2.1.

Each iteration of the loop outputs a facet; thus $f_{d-1}$ iterations are required. It is not difficult to show that only $O(d^2) = O(1)$ changes to the horizon graph occur at each iteration; all together these require only $O(f_{d-1})$ time. The ridge dictionary can be implemented by any of a number of data structures that allow look-up, insertion, and deletion in logarithmic time if the ridges are ordered lexicographically. (For such data structures, see Tarjan's monograph [92].) There are at most $(d/2) f_{d-1}$ ridges; so operations in the ridge dictionary require at most $O(f_{d-1} \log f_{d-1}) = O(f_{d-1} \log n)$ time for fixed $d$, since $f_{d-1} = O(n^{\lfloor d/2 \rfloor})$ by the Upper-Bound Theorem. Similar data structures can be used to implement the priority queue; priority queue operations also require $O(f_{d-1} \log n)$ time. The total running time is therefore $O(n^2 + f_{d-1} \log n)$.

If the entire facial lattice is required, this algorithm can be modified to maintain the entire facial lattice of the horizon; then its running time is $O(n^2 + l \log n)$, where $l$ is the size of the facial lattice.

Chapter 3

A Probabilistic Approach to the Convex-Hull Problem

In Chapter 2 we examined the combinatorial complexity of convex hulls and surveyed existing algorithms for solving the three versions of the convex-hull problem. We have seen that the size of the output of the facet-enumeration (and facial-lattice) problem can vary wildly as a function of input size. In the worst case, $n$ points can generate $O(n^{\lfloor d/2 \rfloor})$ facets; at the opposite extreme, it is possible that the convex hull of the $n$ points is merely a simplex with $d + 1$ facets. In the former case, the gift-wrapping algorithm requires $O(n^{1+\lfloor d/2 \rfloor})$ time, while the shelling algorithm requires only $O(n^{\lfloor d/2 \rfloor} \log n)$ time; in the latter case, gift-wrapping in $O(n)$ time is faster than shelling in $O(n^2)$ time. Unless some estimate of output size is known a priori, it is difficult to know which algorithm will be more efficient in a given situation, or indeed even whether it is practical or cost-effective to attack the problem at all.

Fortunately, the running times of most of the algorithms described in Chapter 2 are parametrized in terms of quantities that can be and to some extent have been studied from a probabilistic standpoint. Except for the naive algorithm, whose performance is always so poor that it can be dismissed immediately for almost any practical purpose, and the beneath-beyond algorithm, whose performance depends on intermediate structures that bear no obvious relationship to the final result, the performance of these algorithms depends on $n$, the input size, and $F_n$, the number of facets in the output.

In the remainder of the thesis, we will assume that $X_n = \{x_1, x_2, \ldots, x_n\}$ is a set of $n$ independent and identically distributed (i.i.d.) random points in $\mathbb{R}^d$. Our main concern will be to bound the average running time of the convex-hull algorithms, particularly facet-enumeration algorithms, when applied to $X_n$. Of course, average running time must always be based on the assumption of a particular distribution of the input points; we will investigate $EV_n$, the average number of vertices of conv $X_n$, and $EF_n$, the average number of facets, for a variety of distributions.

While several earlier authors investigated the properties of the convex hull of small numbers of random points in the plane (cf. Buchta's Übersicht [15]), the study of the asymptotic behavior of various properties of the convex hull of $X_n$ began with the work of Rényi & Sulanke [75,76]. They showed the expected number of vertices and edges of the convex hull of planar point sets to be $O(\log n)$ for uniform distributions in convex polygons, $O(n^{1/3})$ for uniform distributions in circles, ellipses, and other convex figures with smooth boundaries, and $O(\sqrt{\log n})$ for two-dimensional normal distributions. They also investigated the expected perimeter and the expected area of the convex hull. Efron [39] later extended their arguments to derive exact, general integral expressions for expected area, perimeter, surface area, volume, probability content, and number of vertices, edges, and faces for the two- and three-dimensional cases. Carnal [18] extended Rényi & Sulanke's work on asymptotics to very general sorts of circularly symmetric distributions in the plane. At about the same time, Raynaud [74] demonstrated that the expected number of facets for the uniform distribution in the unit $d$-ball and the $d$-dimensional normal distribution are $O(n^{(d-1)/(d+1)})$ and $O(\log^{(d-1)/2} n)$ respectively for fixed $d$. Recently, Buchta et al. [17] investigated the uniform distribution on the surface of the $d$-dimensional hypersphere and found that the number of facets grows linearly.

The first computer scientists to address questions of this nature were Bentley, Kung, Schkolnick & Thompson [4]. They showed by rather complicated combinatorial arguments that the expected number of maxima of the dominance relation is $O(\log^{d-1} n)$ when $n$ points are chosen from a $d$-dimensional distribution that is the product of $d$ one-dimensional distributions. (A point $x$ is said to dominate $y$ if $x^{(i)} \ge y^{(i)}$ for $1 \le i \le d$. A point $x_i$ is a maximum of $X_n$ if no $x_k$ with $k \ne i$ dominates it for $1 \le k \le n$.) A similar bound on the number of convex-hull vertices follows by an easy argument. One example of such a product distribution is the uniform distribution in the unit $d$-cube. By first reducing all other cases to this one, Devroye [24] was able to tighten the constants of the analysis of Bentley et al. while replacing their combinatorial proof with a simpler analytic one. Dwyer & Kannan [32] showed an $O(\log^{d+1} n)$ upper bound on the expected number of convex-hull vertices for independent uniformly distributed points from any $d$-polytope.

In other work Devroye [22] showed that $EV_n = o(n)$ for any distribution that has a density function with respect to Lebesgue measure in $\mathbb{R}^d$. He also described a "throw-away step" for convex-hull algorithms. This step finds a combinatorially simple subset of the convex hull that typically contains most of the input points. All points lying inside this simple polytope can be eliminated from further consideration in $O(n)$ time. He demonstrated that the throw-away step is effective in reducing the expected running time of convex-hull algorithms for point sets chosen from a class of spherically symmetric distributions with unbounded support that he named "slowly varying radial"; the normal distribution falls into this class. The algorithmic results were rather weak, however, showing only that a $T(n)$-worst-case algorithm can be improved to have $o(T(n))$ average running time.

Many researchers addressed the average-case analysis of algorithms for planar convex hulls. Their work culminated in Bentley & Shamos' algorithm [5], which finds convex hulls in $O(n)$ time on average for any distribution satisfying $EV_n = O(n^{1-\alpha})$ for some $\alpha > 0$. Their algorithm was the first to employ the randomizing divide-and-conquer technique applied in §3.2, and has optimal $O(n \log n)$ worst-case running time.

Other authors have investigated properties of random convex hulls that are not obviously applicable to the average-case analysis of convex-hull algorithms. For example, Eddy & Gale [36,37] investigated the distribution of random convex hulls. Fisher [43,44] examined their shape. Buchta & Müller [16] considered expected mean width. A more complete survey of results of this sort is given by Buchta [15].

The remainder of this chapter is devoted to reviewing briefly essential results from probability theory and establishing a general framework for determining $EF_n$ and $EV_n$ for specific distributions.
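The planar results of Rényi & Sulanke quoted above are easy to probe empirically. The Python sketch below estimates $EV_n$ by simulation for the uniform distribution in the unit square and in the unit disk. The hull routine is Andrew's monotone-chain method, a standard planar algorithm chosen here for brevity rather than one discussed in this thesis, and all function names, sample sizes, and seeds are illustrative.

```python
import random


def hull_size(points):
    """Number of hull vertices, via Andrew's monotone chain."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return len(pts)

    def half(seq):
        chain = []
        for p in seq:
            # Pop while the last two chain points and p fail to make a left turn.
            while len(chain) >= 2 and (
                (chain[-1][0] - chain[-2][0]) * (p[1] - chain[-2][1])
                - (chain[-1][1] - chain[-2][1]) * (p[0] - chain[-2][0])
            ) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower, upper = half(pts), half(reversed(pts))
    return len(lower) + len(upper) - 2      # the two endpoints are shared


def uniform_square(rng):
    return (rng.random(), rng.random())


def uniform_disk(rng):
    while True:                             # rejection sampling from the square
        x, y = 2 * rng.random() - 1, 2 * rng.random() - 1
        if x * x + y * y <= 1:
            return (x, y)


def mean_hull_size(sampler, n, trials, rng):
    """Monte Carlo estimate of EV_n for the distribution drawn by `sampler`."""
    total = sum(hull_size([sampler(rng) for _ in range(n)]) for _ in range(trials))
    return total / trials
```

With a few thousand points per trial, the estimate for the disk should come out noticeably larger than the estimate for the square, in keeping with the $O(n^{1/3})$ and $O(\log n)$ growth rates.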

3.1 Elements of Probability Theory

Familiarity with certain concepts of elementary probability theory is assumed. These include discrete and continuous random variables, distributions and density functions, mean, variance, and higher moments, stochastic independence, etc., as contained in the first eleven chapters of Feller's textbook [40].

One aspect of elementary probability that occasionally confuses non-specialists is the linearity of expectation. Recall that the expected value $EX$ (also called the mean, expectation, or average) of a real random variable $X$ with density function $f(\cdot)$ is defined by

$$EX = \int_{\mathbb{R}} x f(x)\, dx;$$

an analogous definition applies to discrete variables. If $a$ and $b$ are constants, it is immediate that $E(aX + b) = aEX + b$. Moreover, if $Y$ is another random variable, then

$$E(X + Y) = EX + EY.$$

Somewhat contrarily to intuition, this holds even if $X$ and $Y$ are correlated. As a simple example, the average total height of a group of married couples is just the average height of the husbands plus the average height of the wives; any propensity of tall women to marry tall men, for example, is irrelevant for the mean, even though it may affect the variance.

Average running times of convex-hull algorithms often depend on the expectation of some power of the number of vertices $V_n$, facets $F_n$, or faces $H_n$ of the convex hull of $n$ points. Our probabilistic techniques will not always be strong enough to bound all of $EV_n$, $EH_n$, or $EF_n$ directly. In general, it is not necessarily true that $E(X^p) = (EX)^p$. For example, if $X = 0$ with probability $1 - 1/n$ and $X = n$ with probability $1/n$, then $(EX)^p = 1$ but $E(X^p) = n^{p-1}$. Fortunately in the case of convex hulls and some other similar problems, we have the following inequality due to Devroye [23].

Lemma 3.1 $E(V_n^p) = O((EV_n)^p)$ for $p > 1$.

Corollary 3.2 $EH_n \le O(E(V_n^{\lfloor d/2 \rfloor})) = O((EV_n)^{\lfloor d/2 \rfloor})$ and

$$EF_n \le O(E(V_n^{\lfloor d/2 \rfloor})) = O((EV_n)^{\lfloor d/2 \rfloor}).$$

The corollary follows immediately from the Upper-Bound Theorem.
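The two-point example that motivates Lemma 3.1 can be checked exactly. The Python sketch below computes both $(EX)^p$ and $E(X^p)$ for the variable $X$ that equals $n$ with probability $1/n$ and $0$ otherwise; the function name is hypothetical.

```python
from fractions import Fraction


def moments(n, p):
    """Return ((EX)^p, E(X^p)) for X = n with probability 1/n, X = 0 otherwise."""
    prob = Fraction(1, n)
    mean = n * prob                  # EX = n * (1/n) = 1
    pth_moment = n ** p * prob       # E(X^p) = n^p * (1/n) = n^(p-1)
    return mean ** p, pth_moment
```

The gap between the two values grows without bound as $n$ increases, which is exactly the behavior that Lemma 3.1 rules out for $V_n$.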

3.2 A Fast-on-Average Vertex-Enumeration Algorithm

In this section we will modify the linear-programming algorithm for the vertex-enumeration problem to improve its average running time. This can be accomplished by applying the randomizing divide-and-conquer technique of Bentley & Shamos' planar convex-hull algorithm [5].

The Bentley-Shamos algorithm randomly divides the set of points into two halves and recursively applies itself to construct the convex hulls of the two halves. Then the convex hulls of the two halves are merged to form the convex hull of the whole set. By exploiting the fact that the vertices of the two hulls are sorted by angle about some interior point, it is possible to merge the two convex hulls in time proportional to their sizes. The random divisions can be accomplished in $O(n)$ time overall if the points are stored in an array and shuffled once at the outset of the algorithm. All subsets considered will be stored in contiguous locations of the array. A single division step can be carried out in constant time by computing the index of the midpoint of the interval of the array containing the set to be divided. Thus the overall worst-case running time of the Bentley-Shamos algorithm satisfies $T(n) = 2T(n/2) + O(n)$, or

$$T(n) = O(n \log n).$$

Because the division is random, the two subsets of points have exactly the same distribution as the original set. If the points are chosen from a distribution for which $EV_n$ is small, the merging step goes quickly on average. The average running time of the algorithm satisfies

$$A(n) = 2A(n/2) + O(EV_n).$$

If $EV_n = O(n^{\alpha})$ for some $\alpha < 1$, then $A(n) = O(n)$. This is true for nearly every interesting distribution.

An important observation is that a divide-and-conquer algorithm is not a prerequisite for the application of this method; in particular, this technique can be applied to the linear-programming algorithm for vertex enumeration. As in the case of the Bentley-Shamos planar algorithm, the modified linear-programming algorithm randomly divides the set of points into two halves and recursively applies itself to the halves to identify the vertices of the convex hulls of the halves. Every vertex of the hull of the whole set must also be a vertex of the hull of its half. The vertices of the entire set are found by merging the two sets of vertices of the halves by running the original linear-programming algorithm. If $EV_n$ is small, the merging step goes quickly on average, for the linear programs that must be solved are both few and small (few constraints). The average-case running time of this algorithm satisfies the recurrence

$$A(n) = 2A(n/2) + O(E((2V_{n/2})^2)).$$

By Lemma 3.1, $E((2V_{n/2})^2) = O(E(V_n^2)) = O((EV_n)^2)$. It is not difficult to see that the solution of the recurrence is

$$A(n) = \begin{cases} O(n) & \text{if } EV_n \le n^p \text{ for } p < 1/2, \\ O(n \log n) & \text{if } EV_n = O(\sqrt{n}), \\ O(n^{2p}) & \text{if } EV_n \le n^p \text{ for } p > 1/2. \end{cases}$$

A summary of this information that is useful in spite of its abuse of notation is

$$A(n) = O(n + (EV_n)^2) \quad \text{for ``} EV_n \ne \Theta(\sqrt{n}) \text{''}.$$
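The three regimes of the solution can be checked by unrolling the recurrence. The following is a sketch under simplifying assumptions that are not part of the original argument: n is a power of 2, and the merge on a subproblem of size m costs at most $c\,m^{2p}$ for a constant c.

```latex
\begin{align*}
A(n) &\le n\,A(1) + c \sum_{i=0}^{\log_2 n - 1} 2^i \left(\frac{n}{2^i}\right)^{2p}
      = n\,A(1) + c\,n^{2p} \sum_{i=0}^{\log_2 n - 1} 2^{\,i(1-2p)} . \\[4pt]
p < \tfrac12 &: \quad \sum_i 2^{\,i(1-2p)} = O\!\left(n^{1-2p}\right)
      \;\Longrightarrow\; A(n) = O(n), \\
p = \tfrac12 &: \quad \sum_i 2^{\,i(1-2p)} = \log_2 n
      \;\Longrightarrow\; A(n) = O(n \log n), \\
p > \tfrac12 &: \quad \sum_i 2^{\,i(1-2p)} = O(1)
      \;\Longrightarrow\; A(n) = O\!\left(n^{2p}\right).
\end{align*}
```

In the first case the leaves of the recursion dominate, in the last case the top-level merge dominates, and in the boundary case every level contributes equally.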

Since $EV_n = o(n)$ for any distribution with a density with respect to Lebesgue measure in $R^d$ [27], we have $A(n) = o(n^2)$. On the other hand, if $T(n) = O(n^2)$ is the running time of the original algorithm, the modified algorithm has worst-case running time W(n) satisfying

$$W(n) \le 2W(n/2) + T(n), \quad \text{or} \quad W(n) \le 2T(n) = O(n^2),$$

which is asymptotically equivalent to the original algorithm.

The performance of the shelling, gift-wrapping, and beneath-beyond algorithms can also often be improved by applying this vertex-enumeration algorithm as a preprocessing step to reduce input size. Subject again to our condition "$EV_n \ne \Theta(\sqrt{n})$", we have average running times of

$O(n\,EF_n)$ for gift-wrapping without preprocessing,

$O(n + (EV_n)^2 + (EV_n)(EF_n))$ for gift-wrapping with preprocessing,

$O(n^2 + (EF_n) \log n)$ for shelling without preprocessing, and

$O(n + (EV_n)^2 + (EF_n) \log n)$ for shelling with preprocessing.

This method lends itself to parallelization in a very natural way. If O(n) processors are available, the two subproblems can be attacked independently by different processors. Also, the merging step can be carried out quickly since a separate processor can be assigned to each of the linear programs that must be solved.
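The modified vertex-enumeration scheme of this section can be sketched as a wrapper around any quadratic-time base procedure. The Python sketch below is restricted to the plane and rests on assumptions that should be read as such: the base procedure is not the linear-programming test of the text but a simple angular-gap test (a point is a vertex exactly when all other points fit in an open half-plane through it), the points are assumed distinct and in general position, and all names are hypothetical.

```python
import math
import random

def is_extreme(p, others):
    """Planar stand-in for the per-point linear program (assumption):
    p is a hull vertex iff the directions to all other points leave an
    angular gap larger than pi."""
    angles = sorted(math.atan2(q[1] - p[1], q[0] - p[0]) for q in others if q != p)
    if len(angles) < 2:
        return True
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(angles[0] + 2 * math.pi - angles[-1])  # wrap-around gap
    return max(gaps) > math.pi

def brute_vertices(points):
    """Original algorithm: one independent test per point."""
    return [p for p in points if is_extreme(p, points)]

def dc_vertices(pts, lo, hi, cutoff=8):
    """Vertices of pts[lo:hi]: recurse on halves, then run the original
    algorithm on the surviving candidates only."""
    if hi - lo <= cutoff:
        return brute_vertices(pts[lo:hi])
    mid = (lo + hi) // 2
    candidates = dc_vertices(pts, lo, mid, cutoff) + dc_vertices(pts, mid, hi, cutoff)
    return brute_vertices(candidates)

def enumerate_vertices(points, seed=None):
    """Shuffle once so every contiguous half is a random subset."""
    pts = list(points)
    random.Random(seed).shuffle(pts)
    return dc_vertices(pts, 0, len(pts))
```

Each call to `brute_vertices` in a merge sees only the vertices of the two sub-hulls, so its cost is quadratic in the number of candidates rather than in the number of points. The per-point tests inside one merge are independent of each other, which is what makes the parallel assignment of one processor per test natural.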

3.3 A Method for Bounding EFn

In this section a method for bounding the expected number of facets is developed. This method is a d-dimensional generalization of methods developed by Rényi & Sulanke in two dimensions [75] and Efron in three [39]. Similar methods have been applied to certain spherically symmetric distributions by Raynaud [74] and Buchta et al. [17]. The change of variables used has also been applied by Miles [70] and Santaló [79]; however, the exposition given here, while somewhat tedious, avoids recourse to some complicated mathematical machinery.

The first d points $x_1, \ldots, x_d$ define a hyperplane with probability one. Let us first reckon the probability that they also define a facet of the convex hull. This is just the probability that the other n - d points lie on the same side of the hyperplane. Writing $g(\cdot)$ for the density function of the $x_i$ and $\Gamma$ and $1 - \Gamma$ for the probability content of the two halfspaces formed by the hyperplane, we see that this probability is

$$\int_{R^d} \cdots \int_{R^d} \left(\Gamma^{n-d} + (1-\Gamma)^{n-d}\right) g(x_1) \cdots g(x_d)\, dx_1 \cdots dx_d,$$

and the expected number of facets is

$$EF_n = \binom{n}{d} \int_{R^d} \cdots \int_{R^d} \left(\Gamma^{n-d} + (1-\Gamma)^{n-d}\right) g(x_1) \cdots g(x_d)\, dx_1 \cdots dx_d.$$

The usual strategy for evaluating such an integral is to express the d points in the following terms: Let p be the projection of the origin onto the hyperplane defined by the d points. The point p can be expressed in terms of generalized spherical coordinates [56, p. 17] consisting of d - 1 angles $\theta_1, \theta_2, \ldots, \theta_{d-1}$ varying from 0 to $\pi$ and r, its signed distance from the origin. Then a system of rectangular coordinates with origin at p can be established on the hyperplane, and $x_i$ for $1 \le i \le d$ is uniquely determined by the coordinates of p plus the d - 1 rectangular coordinates $t_i^{(1)}, \ldots, t_i^{(d-1)}$.

Specifically, we define r and $\theta_1, \theta_2, \ldots, \theta_{d-1}$ by

$$\begin{aligned}
r &= \pm|p| \\
p^{(1)} &= r\, c_{d-1} c_{d-2} \cdots c_3 c_2 c_1 \\
p^{(2)} &= r\, c_{d-1} c_{d-2} \cdots c_3 c_2 s_1 \\
p^{(3)} &= r\, c_{d-1} c_{d-2} \cdots c_3 s_2 \\
p^{(4)} &= r\, c_{d-1} c_{d-2} \cdots s_3 \\
&\;\;\vdots \\
p^{(d)} &= r\, s_{d-1}
\end{aligned}$$

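where $s_i = \sin\theta_i$ and $c_i = \cos\theta_i$. Then vectors $e_1, e_2, \ldots, e_{d-1}$ can be defined by

$$\begin{aligned}
e_i^{(1)} &= -s_i c_{i-1} \cdots c_3 c_2 c_1 \\
e_i^{(2)} &= -s_i c_{i-1} \cdots c_3 c_2 s_1 \\
e_i^{(3)} &= -s_i c_{i-1} \cdots c_3 s_2 \\
e_i^{(4)} &= -s_i c_{i-1} \cdots s_3 \\
&\;\;\vdots \\
e_i^{(i+1)} &= c_i \\
e_i^{(j)} &= 0 \quad \text{for } j > i+1.
\end{aligned}$$

It is tedious but straightforward to verify that the $e_i$ plus $p/r$ form an orthonormal basis for $R^d$, i.e., that $\langle p, e_i \rangle = 0$ and $\langle e_i, e_j \rangle = 0$ for $1 \le i < j \le d-1$. Then

$$x_i = p + \sum_{1 \le j \le d-1} t_i^{(j)} e_j.$$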
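The orthonormality claim is easy to spot-check numerically. The sketch below builds $p/r$ and $e_1, \ldots, e_{d-1}$ directly from the coordinate formulas of this section for an arbitrary choice of angles; the function names are hypothetical and the code is an illustration, not part of the derivation.

```python
import math

def _prod_c(c, hi, lo):
    """Product c_hi * c_(hi-1) * ... * c_lo with 1-based indices (1.0 if empty)."""
    out = 1.0
    for m in range(lo, hi + 1):
        out *= c[m - 1]
    return out

def frame(thetas):
    """Return [p/r, e_1, ..., e_(d-1)] for the angles theta_1..theta_(d-1)."""
    d = len(thetas) + 1
    c = [math.cos(t) for t in thetas]
    s = [math.sin(t) for t in thetas]

    # Unit normal p/r: first coordinate is c_(d-1)...c_1,
    # coordinate j >= 2 is c_(d-1)...c_j * s_(j-1).
    u = [_prod_c(c, d - 1, 1)]
    for j in range(2, d + 1):
        u.append(_prod_c(c, d - 1, j) * s[j - 2])

    vectors = [u]
    for i in range(1, d):
        e = [0.0] * d
        e[0] = -s[i - 1] * _prod_c(c, i - 1, 1)
        for j in range(2, i + 1):
            e[j - 1] = -s[i - 1] * _prod_c(c, i - 1, j) * s[j - 2]
        e[i] = c[i - 1]          # coordinate i+1; later coordinates stay 0
        vectors.append(e)
    return vectors

def gram(vectors):
    """Matrix of pairwise inner products."""
    return [[sum(x * y for x, y in zip(a, b)) for b in vectors] for a in vectors]
```

For any angles, `gram(frame(thetas))` should be the identity matrix up to rounding error.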

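We must now determine the Jacobian of this transformation. The following facts are easily verified.

$$\begin{aligned}
\partial x_i^{(j)} / \partial t_i^{(k)} &= e_k^{(j)}, \\
\partial x_i^{(j)} / \partial t_l^{(k)} &= 0 \quad \text{for } i \ne l, \\
\partial p^{(j)} / \partial r &= p^{(j)} / r, \\
\partial e_i^{(j)} / \partial \theta_k &= 0 \quad \text{for } k > i, \\
\partial e_k^{(j)} / \partial \theta_k &= e_k^{(j)} \cot\theta_k \quad \text{for } j \le k, \\
\partial e_k^{(k+1)} / \partial \theta_k &= -e_k^{(k+1)} \tan\theta_k, \\
\partial e_k^{(j)} / \partial \theta_k &= 0 \quad \text{for } j > k+1, \\
\partial p^{(j)} / \partial \theta_k &= r\, c_{d-1} c_{d-2} \cdots c_{k+1}\, e_k^{(j)}.
\end{aligned}$$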

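For concreteness, let us consider the Jacobian matrix for the case d = 3. We mark non-zero entries with "$\times$" or "$\otimes$", and leave zero entries blank.

$$\begin{array}{c|ccc|cc|cc|cc}
 & \theta_1 & \theta_2 & r & t_1^{(1)} & t_1^{(2)} & t_2^{(1)} & t_2^{(2)} & t_3^{(1)} & t_3^{(2)} \\ \hline
x_1^{(1)} & \times & \times & \times & \otimes & \otimes & & & & \\
x_2^{(1)} & \times & \times & \times & & & \otimes & \otimes & & \\
x_3^{(1)} & \times & \times & \times & & & & & \otimes & \otimes \\ \hline
x_1^{(2)} & \times & \times & \times & \times & \times & & & & \\
x_1^{(3)} & & \times & \times & & \times & & & & \\ \hline
x_2^{(2)} & \times & \times & \times & & & \times & \times & & \\
x_2^{(3)} & & \times & \times & & & & \times & & \\ \hline
x_3^{(2)} & \times & \times & \times & & & & & \times & \times \\
x_3^{(3)} & & \times & \times & & & & & & \times
\end{array}$$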

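Let $\rho_i^{(j)}$ be the row for $x_i^{(j)}$. If we replace $\rho_i^{(1)}$ by $\sigma_i = \sum_{1 \le j \le d} (p^{(j)}/p^{(1)})\, \rho_i^{(j)}$, the determinant is unchanged, and the entries of $\sigma_i$ follow from the facts above. In the column for $t_i^{(k)}$ the entry is $\langle e_k, p \rangle / p^{(1)} = 0$, so the entries marked "$\otimes$" vanish. In the column for r the entry is $\langle p, p \rangle / (r\, p^{(1)}) = r / p^{(1)}$. In the column for $\theta_k$ the entry is

$$\frac{1}{p^{(1)}} \left( \left\langle p, \frac{\partial p}{\partial \theta_k} \right\rangle + \sum_{1 \le l \le d-1} t_i^{(l)} \left\langle p, \frac{\partial e_l}{\partial \theta_k} \right\rangle \right) = -\frac{r\, c_{d-1} c_{d-2} \cdots c_{k+1}}{p^{(1)}}\, t_i^{(k)} = -\frac{t_i^{(k)}}{c_1 c_2 \cdots c_k},$$

since $\langle p, \partial p / \partial \theta_k \rangle = r\, c_{d-1} \cdots c_{k+1} \langle p, e_k \rangle = 0$ and, differentiating the identity $\langle p, e_l \rangle = 0$, $\langle p, \partial e_l / \partial \theta_k \rangle = -\langle \partial p / \partial \theta_k, e_l \rangle = -r\, c_{d-1} \cdots c_{k+1} \langle e_k, e_l \rangle$. The common sign of these d - 1 columns does not affect the absolute value of the determinant and is dropped below.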

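The determinant is now simply the product of the determinants of the one d × d matrix and d identical (d - 1) × (d - 1) matrices lying along the diagonal. In the three-dimensional case, the former is

$$\begin{vmatrix} t_1^{(1)}/c_1 & t_1^{(2)}/c_1 c_2 & r/p^{(1)} \\ t_2^{(1)}/c_1 & t_2^{(2)}/c_1 c_2 & r/p^{(1)} \\ t_3^{(1)}/c_1 & t_3^{(2)}/c_1 c_2 & r/p^{(1)} \end{vmatrix} = \frac{r}{p^{(1)} c_1^2 c_2} \begin{vmatrix} t_1^{(1)} & t_1^{(2)} & 1 \\ t_2^{(1)} & t_2^{(2)} & 1 \\ t_3^{(1)} & t_3^{(2)} & 1 \end{vmatrix} = \frac{2\, \mathrm{simp}(x_1, x_2, x_3)}{c_1^3 c_2^2},$$

where $\mathrm{simp}(x_1, x_2, x_3)$ is the area of $\triangle x_1 x_2 x_3$. For general d, the value is

$$\frac{(d-1)!\; \mathrm{simp}(x_1, x_2, \ldots, x_d)}{c_1^d\, c_2^{d-1} \cdots c_{d-1}^2},$$

where $\mathrm{simp}(x_1, x_2, \ldots, x_d)$ is the (d - 1)-dimensional volume of the simplex formed by the d points $x_1$ through $x_d$. Each (d - 1) × (d - 1) matrix is itself upper-triangular; its diagonal entries are $e_i^{(i+1)} = c_i$ for $1 \le i \le d-1$, the determinant is $c_1 c_2 \cdots c_{d-1}$, and therefore the Jacobian of the entire transformation is

$$\frac{(c_1 c_2 \cdots c_{d-1})^d\, (d-1)!\; \mathrm{simp}(x_1, x_2, \ldots, x_d)}{c_1^d\, c_2^{d-1} \cdots c_{d-1}^2} = (d-1)!\; c_2 c_3^2 \cdots c_{d-1}^{d-2}\; \mathrm{simp}(x_1, x_2, \ldots, x_d), \tag{3.1}$$

and we have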
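As a sanity check on (3.1), the planar case d = 2 can be worked out by hand. The following is a worked illustration rather than part of the original derivation; it writes $\theta$ for $\theta_1$, c and s for its cosine and sine, and $t_i$ for $t_i^{(1)}$.

```latex
% d = 2: p = r(c, s), e_1 = (-s, c), so
% x_i = (r c - t_i s,\; r s + t_i c), \quad i = 1, 2.
\[
J = \frac{\partial(x_1^{(1)}, x_1^{(2)}, x_2^{(1)}, x_2^{(2)})}{\partial(\theta, r, t_1, t_2)}
  = \begin{pmatrix}
      -r s - t_1 c & c & -s & 0 \\
      \phantom{-}r c - t_1 s & s & \phantom{-}c & 0 \\
      -r s - t_2 c & c & 0 & -s \\
      \phantom{-}r c - t_2 s & s & 0 & \phantom{-}c
    \end{pmatrix}.
\]
% Rotating each pair of rows by the angle \theta (a determinant-one operation)
% gives rows (-t_i, 1, 0, 0) and (r, 0, 1, 0), resp. (r, 0, 0, 1), hence
\[
|\det J|
  = \left| \det \begin{pmatrix} -t_1 & 1 \\ -t_2 & 1 \end{pmatrix} \right|
  = |t_1 - t_2|
  = \mathrm{simp}(x_1, x_2),
\]
% in agreement with (3.1), where (d-1)! = 1 and the product of cosines is empty.
```

This is the familiar planar identity $dx_1\, dx_2 = |t_1 - t_2|\; dt_1\, dt_2\, dr\, d\theta$ for pairs of points parametrized by the line through them.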

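$$EF_n = \binom{n}{d} (d-1)! \int \cdots \int \left[ \int_{R^{d-1}} \cdots \int_{R^{d-1}} \mathrm{simp}(x_1, \ldots, x_d)\, g(t_1 \mid p) \cdots g(t_d \mid p)\, dt_1 \cdots dt_d \right] c_2 c_3^2 \cdots c_{d-1}^{d-2} \left(\Gamma^{n-d} + (1-\Gamma)^{n-d}\right) dr\, d\theta_1 \cdots d\theta_{d-1},$$

where $g(t \mid p)$ denotes the density g evaluated at the point of the hyperplane with rectangular coordinates t. If we define

$$\hat{g}(p) = \int_{R^{d-1}} g(t_1 \mid p)\, dt_1,$$

then, since $x_1$ through $x_d$ are i.i.d.,

$$\int_{R^{d-1}} \cdots \int_{R^{d-1}} g(t_1 \mid p) \cdots g(t_d \mid p)\, dt_1 \cdots dt_d = (\hat{g}(p))^d. \tag{3.2}$$

The quotient of the bracketed quantity above and (3.2) is a conditional expectation, i.e., the bracketed quantity is equal to $(\hat{g}(p))^d\, \mathrm{esimp}(p)$.

If we now write $\gamma(p, n)$ for $\Gamma^{n-d} + (1-\Gamma)^{n-d}$ and estimate $\binom{n}{d} \sim n^d/d!$, we have finally

$$EF_n \sim \frac{n^d}{d} \int_{-\infty}^{\infty} \int \cdots \int c_2 c_3^2 \cdots c_{d-1}^{d-2}\, (\hat{g}(p))^d\, \mathrm{esimp}(p)\, \gamma(p, n)\; dr\, d\theta_1 \cdots d\theta_{d-1}. \tag{3.3}$$

Informally, $(\hat{g}(p))^d$ is the probability that the first d points lie on the hyperplane p defines, $\gamma(p, n)$ is the probability that the other n - d points all lie on the same side of the hyperplane, and $\mathrm{esimp}(p)$ is the expected volume of the simplex formed by the first d points if they lie on the hyperplane. If these quantities can be computed exactly or at least bounded for a particular distribution, it is then possible to evaluate the integral and derive bounds on the expected number of convex hull facets.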

3.4 A Method for Bounding EVn Above

By (2.5), upper bounds on the expected number of facets imply asymptotically similar bounds on the number of faces of every lower dimension. However, it is sometimes possible to derive bounds on the number of vertices in cases where the method of integrating for EFn directly cannot be applied easily. In other cases, this method yields smaller dimension-dependent factors.

Let us consider the point $X_1$ and the $2^d$ orthants into which d-space is partitioned by the d hyperplanes $x^{(i)} = X_1^{(i)}$ for $1 \le i \le d$. A key observation is that $X_1$ lies inside the convex hull of any set of $2^d$ points chosen one from each orthant. Thus $X_1$ can be a vertex only if at least one of these orthants is empty. If $\Gamma = \Gamma(X_1)$ is the probability content of the orthant of smallest probability content, the probability that at least one orthant is empty is at most $2^d (1 - \Gamma)^{n-1} \le 2^d \exp(-(n-1)\Gamma)$, which is also an upper bound on the probability that $X_1$ is a vertex of the convex hull.
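In the plane the orthant observation gives a cheap necessary condition for a point to be a hull vertex: at least one of the four axis-parallel quadrants anchored at the point must contain no other point. A minimal Python sketch of this filter follows; the names are hypothetical, and ties in a coordinate (which have probability zero for continuous distributions) are ignored.

```python
import random

def has_empty_quadrant(p, points):
    """True if some axis-parallel quadrant with corner p holds no other point.
    This is necessary, but not sufficient, for p to be a hull vertex."""
    seen = set()
    for q in points:
        if q == p:
            continue
        seen.add((q[0] > p[0], q[1] > p[1]))
        if len(seen) == 4:
            return False
    return True

def vertex_candidates(points):
    """Points that survive the empty-quadrant filter (quadratic time)."""
    return [p for p in points if has_empty_quadrant(p, points)]
```

Every hull vertex survives the filter, so the number of survivors is an upper bound on the number of vertices; the analysis in the text bounds the expected number of survivors.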

Devroye [24] used essentially this observation to establish his $O(\log^{d-1} n)$ bound on the expected number of vertices among n points chosen from a uniform distribution on a hypercube. We extend the usefulness of this method by observing that the d orthogonal hyperplanes need not lie normal to the coordinate axes, but may be chosen to maximize $\Gamma$. In the case of a spherically symmetric distribution, this is achieved by choosing one hyperplane normal to the line through $X_1$ and the origin. If $G(y) = \Pr\{\Gamma(X_1) \le y\}$ and g(y) is the corresponding probability density function, then

$$EV_n \le 2^d n \int_0^1 g(y) \exp(-(n-1)y)\, dy.$$

If G(y) cannot be computed exactly but only bounded above, then integration by parts can be applied to eliminate g(y) in this formula.

3.5 A Method for Bounding EVn Below

If a particular fixed hyperplane passing through point $X_1$ defines an empty halfspace, then that point is a vertex with probability 1. Let $\Gamma = \Gamma(X_1)$ be the probability content of the fixed halfspace. Then the probability that a given point is a vertex is bounded below by the probability that the corresponding halfspace is empty, which is $(1 - \Gamma)^{n-1} \ge \exp(-(n-1)\Gamma/(1-\Gamma))$. If $G(y) = \Pr\{\Gamma(X_1) \le y\}$ and g(y) is the corresponding probability density function, then

$$EV_n \ge n \int_0^1 g(y) \exp(-(n-1)y/(1-y))\, dy.$$

The best lower bounds are derived by fixing the halfspace so as to minimize its probability content. In the case of a spherically symmetric distribution, this is achieved by choosing the halfspace normal to the line through the point and the origin.
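The exponential lower bound on the probability of an empty halfspace used in this section follows from the elementary inequality $\ln(1+u) \le u$ for $u \ge 0$. A short worked derivation, for $0 \le \Gamma < 1$:

```latex
\[
\ln(1-\Gamma)
  = -\ln\frac{1}{1-\Gamma}
  = -\ln\!\left(1 + \frac{\Gamma}{1-\Gamma}\right)
  \;\ge\; -\frac{\Gamma}{1-\Gamma},
\]
\[
\text{so}\qquad
(1-\Gamma)^{n-1}
  = \exp\bigl((n-1)\ln(1-\Gamma)\bigr)
  \;\ge\; \exp\!\left(-\frac{(n-1)\,\Gamma}{1-\Gamma}\right).
\]
```

For small $\Gamma$ the two sides agree to first order, which is why the upper and lower bounds of Sections 3.4 and 3.5 can match up to constant factors.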

Chapter 4

Convex Hulls of Samples from Polytopes

In this chapter the asymptotic behaviors of $EV_n$ and $EF_n$ are investigated for sets of points chosen from a uniform distribution on the interior of a d-dimensional polytope. Section 4.1 of this paper demonstrates that $EV_n = O(\log^{d-1} n)$ for points drawn from any d-polytope. The upper bound $EF_n = O(\log^{(d-1)\lfloor d/2 \rfloor} n)$ for any polytope follows easily. In Section 4.2 it is shown that the bound on $EV_n$ is tight for a large class of polytopes including the simple polytopes. (Whether it is tight for all polytopes remains an open question.) Section 4.3 contains a proof of the bound $EF_n = O(\log^{d-1} n)$ for any simple polytope. The algorithmic implications of this chapter are summarized in the following theorem.

Theorem 4.1 For n points drawn independently from the uniform distribution on the interior of any fixed d-polytope, the vertex-enumeration problem can be solved in O(n) time on average by the algorithm of §3.2. The facet-enumeration and facial-lattice problems can be solved in time

$O(n)$ by any polynomial-time algorithm using the preprocessing step,

$O(n^2)$ by the shelling algorithm without preprocessing, and

$O(n \log^{d-1} n)$ by the gift-wrapping algorithm without preprocessing.
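The logarithmic growth of $EV_n$ claimed in this chapter is easy to observe empirically in the plane, where d = 2 and the bound reads O(log n). The following Monte Carlo sketch samples points uniformly from the unit square (a 2-polytope) and averages the number of hull vertices. The function names, trial counts, and seed are illustrative assumptions, and the hull routine is a standard monotone-chain scan rather than any algorithm from this thesis.

```python
import math
import random

def cross(o, a, b):
    """Twice the signed area of triangle o, a, b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_size(points):
    """Number of strict hull vertices (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return len(pts)
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return len(lower) + len(upper) - 2  # the two endpoints are shared

def mean_hull_size(n, trials, rng):
    """Average hull size over independent samples of n uniform points."""
    total = 0
    for _ in range(trials):
        pts = [(rng.random(), rng.random()) for _ in range(n)]
        total += hull_size(pts)
    return total / trials

if __name__ == "__main__":
    rng = random.Random(1988)
    for n in (100, 1000, 4000):
        print(n, mean_hull_size(n, 10, rng), math.log(n))
```

Printing the averages next to log n should show the hull size growing by a roughly constant amount each time n is multiplied by a constant factor, while n itself grows geometrically.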

4.1 Upper Bounds on Vertices

In this section we derive upper bounds on the expected number of convex-hull vertices among n points chosen from a uniform distribution on any fixed polytope. The regular simplex is attacked first, and the resulting bounds are applied to proving the more general result.

Figure 4.1: Cell $C_0$ of a regular 2-simplex and cell $C'_0$ of the corresponding "square" 2-simplex.

Theorem 4.2 Let $X_n = \{X_1, X_2, \ldots, X_n\}$ be a set of n points chosen independently from a uniform distribution on the interior of a d-simplex. Then $EV_n = O(\log^{d-1} n)$.

Proof. Without loss of generality, we can consider a regular d-simplex, since all simplices are affine-equivalent, and an affine transformation of $X_n$ will not alter the combinatorial structure of the convex hull. We partition the simplex into d + 1 cells $C_0, C_1, \ldots, C_d$ corresponding to the d + 1 vertices $v_0, v_1, \ldots, v_d$ of the d-simplex. A point x lies in $C_i$ if $\mathrm{dist}(x, v_i) \le \mathrm{dist}(x, v_j)$ for $0 \le j \le d$. $C_i$ is itself a polytope; its vertices are the centers of each of the faces of the simplex to which $v_i$ belongs. Thus each cell has $\sum_{0 \le k \le d} \binom{d}{k} = 2^d$ vertices.

It will be more convenient to work with a simplex with a "square corner". Let (′) denote the unique affine transformation satisfying

$$\begin{aligned}
v'_0 &= (0, 0, 0, \ldots, 0) \\
v'_1 &= (1, 0, 0, \ldots, 0) \\
v'_2 &= (0, 1, 0, \ldots, 0) \\
&\;\;\vdots \\
v'_d &= (0, 0, \ldots, 0, 1).
\end{aligned} \tag{4.1}$$

Since it is affine, this transformation preserves the combinatorial properties of the convex hull; e.g., $X_i \in \mathrm{vert}\, X_n \iff X'_i \in \mathrm{vert}\, X'_n$. It is easy to verify that the vertices of $C'_0$ that are centers of k-faces have k coordinates of 1/(k + 1) and d - k zero coordinates. (See Figure 4.1.) Now suppose that $X'_1 = \xi$ and lies in cell $C'_0$. The hyperplanes $x^{(i)} = \xi^{(i)}$ for $1 \le i \le d$ partition the simplex into $2^d$ pieces. If the points $X_1$ and $X'_1$ are to be vertices, one of these pieces must be free of points of $X'_n$, for $X'_1$ would lie inside the convex hull of any $2^d$ points of $X'_n$ chosen one from each piece. We write $\Pi(x)$ for $\prod_{1 \le i \le d} x^{(i)}$.

Claim. The smallest of the $2^d$ pieces has volume at least $(1/d!)\,\Pi(\xi)$.

Proof. By induction on d. The basis case, d = 1, is easy. The volume of one piece is $\xi^{(1)}$; the volume of the other is $1 - \xi^{(1)} \ge 1/2 \ge \xi^{(1)}$. The piece satisfying $x^{(i)} < \xi^{(i)}$ for all i has volume $\Pi(\xi)$. Every other piece satisfies $x^{(i)} > \xi^{(i)}$ for some i. The cross-section of the simplex on the hyperplane $x^{(i)} = \xi^{(i)}$ is a (d - 1)-dimensional simplex divided into $2^{d-1}$ pieces by the (d - 1) hyperplanes $x^{(j)} = \xi^{(j)}$ for $j \ne i$. By the induction hypothesis (modified by a scaling factor), the (d - 1)-volume of each of these pieces exceeds $(1/(d-1)!)(\Pi(\xi)/\xi^{(i)})$. Each of the d-dimensional pieces satisfying $x^{(i)} > \xi^{(i)}$ contains the intersection of the line $x^{(j)} = \xi^{(j)}$ for $j \ne i$ with the hyperplane $\sum_i x^{(i)} = 1$. This point is $(\xi^{(1)}, \ldots, \xi^{(i-1)}, 1 - \sum_{j \ne i} \xi^{(j)}, \xi^{(i+1)}, \ldots, \xi^{(d)})$. Thus each d-dimensional piece contains a pyramid with base volume exceeding $(1/(d-1)!)(\Pi(\xi)/\xi^{(i)})$, height $1 - \sum_{1 \le j \le d} \xi^{(j)}$, and volume at

least (1/d)(1/(d- 1)!)(H(_)/_ (i)) (1- __,l x (i) for any x E C_. This condition surely holds for the vertices of the cell: the center of a k-face has _

Claim. If point $X'_1$ lies inside cell $C'_0$, it is a vertex with probability less than $2^d \exp(-(n-1)\,\Pi(X'_1))$.

Proof. By the previous claim, all $2^d$ pieces formed at $X'_1$ have volume at least $(1/d!)\,\Pi(X'_1)$ and thus probability content at least $\Pi(X'_1)$. At least one of these pieces must be empty if $X'_1$ is a vertex. If we write n′ for n - 1, the probability that a particular piece is empty is at most $(1 - \Pi(X'_1))^{n'} < \exp(-n'\,\Pi(X'_1))$. The probability that at least one is empty is at most $2^d$ times as great, or less than $2^d \exp(-n'\,\Pi(X'_1))$.

Now

$$\begin{aligned}
\Pr\{X_1 \in \mathrm{vert}\, X_n\}
&= \Pr\{X_1 \in \mathrm{vert}\, X_n \mid X_1 \in C_0\} \\
&= \int_0^{(d+1)^{-d}} \Pr\{X'_1 \in \mathrm{vert}\, X'_n \mid \Pi(X'_1) = y\} \cdot \Pr\{\Pi(X'_1) = y \mid X'_1 \in C'_0\}\, dy \\
&< \int_0^{(d+1)^{-d}} 2^d e^{-n'y}\, \Pr\{\Pi(X'_1) = y \mid X'_1 \in C'_0\}\, dy.
\end{aligned}$$

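According to the integration-by-parts rule, $\int u\, dv = uv - \int v\, du \le uw - \int w\, du = \int u\, dw$, provided that $u \ge 0$, $du \le 0$, and $v \le w$. This holds above with $u = e^{-n'y}$, $du = -n' e^{-n'y}\, dy$, and

$$\begin{aligned}
v = \Pr\{\Pi(X'_1) \le y \mid X'_1 \in C'_0\}
&= \frac{\mathrm{vol}\{x \in C'_0 \mid \Pi(x) \le y\}}{\mathrm{vol}\, C'_0} \\
&\le \frac{\mathrm{vol}\{x \in [0,1]^d \mid \Pi(x) \le y\}}{\mathrm{vol}\, C'_0} \\
&= (d+1)!\; \mathrm{vol}\{x \in [0,1]^d \mid \Pi(x) \le y\} \\
&= w.
\end{aligned}$$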

The quantity vol{x E [0, 1]a [ II(x) _

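$$\Pr\{X_1 \in \mathrm{vert}\, X_n\} < 2^d (d+1) d \int_0^{(d+1)^{-d}} e^{-n'y} \log^{d-1}(1/y)\, dy.$$

Substituting $t = n'y$ gives

$$\begin{aligned}
\Pr\{X_1 \in \mathrm{vert}\, X_n\}
&< 2^d (d+1) d\, (n')^{-1} \int_0^{\infty} e^{-t} (\log n' - \log t)^{d-1}\, dt \\
&= 2^d (d+1) d\, (n')^{-1} \left( \Gamma(1) \log^{d-1} n' + \gamma (d-1) \log^{d-2} n' + O(\log^{d-3} n') \right),
\end{aligned}$$

where $\gamma \approx 0.58$ is Euler's constant. Since $X_1, X_2, \ldots, X_n$ are i.i.d. variates, it follows that

$$EV_n \le 2^d (d+1) d \log^{d-1} n + O(\log^{d-2} n).$$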

[]

Corollary 4.3 Let $X_n = \{X_1, X_2, \ldots, X_n\}$ be a set of n points chosen independently from a uniform distribution on the interior of any d-dimensional polytope P. Then $EV_n = O(\log^{d-1} n)$.

Proof. The polytope P can be partitioned into some finite number of simplices $P_1, P_2, \ldots, P_k$. Let $X_{n,i} = X_n \cap P_i$ for $1 \le i \le k$. By Theorem 4.2, $E(|\mathrm{vert}\, X_{n,i}|) = O(\log^{d-1} n)$. Since $\mathrm{vert}\, X_n \subseteq \bigcup_{1 \le i \le k} \mathrm{vert}\, X_{n,i}$, summing over the k simplices gives $EV_n = O(\log^{d-1} n)$. []

Corollary 4.4 Let $X_n = \{X_1, X_2, \ldots, X_n\}$ be a set of n points chosen independently from a uniform distribution on the interior of any d-polytope P. Then $EF_n = O((\log n)^{(d-1)\lfloor d/2 \rfloor})$.

Proof. The result follows immediately from Corollary 4.3, Lemma 3.1, and the Upper Bound Theorem (2.3);

$$EF_n = O((EV_n)^{\lfloor d/2 \rfloor}) = O((\log n)^{(d-1)\lfloor d/2 \rfloor}).$$

[]

4.2 Lower Bounds on Vertices

In this section we prove that the bound $EV_n = O(\log^{d-1} n)$ is tight for many polytopes. Specifically, we show that $EV_n = \Omega(\log^{d-1} n)$ provided that the d-polytope from which the points are chosen has at least one vertex that lies in exactly d facets. It is a fact that every vertex lies in at least d facets; if every vertex lies in exactly d facets, the polytope is called simple. The lower bound question remains open for polytopes with no such vertex.

Theorem 4.5 Let P be a d-dimensional polytope, and suppose that $X_n = \{X_1, X_2, \ldots, X_n\}$ is a set of n points drawn independently from the uniform distribution over the interior of P. If P has at least one vertex that lies in exactly d facets, then the expected number of vertices among the n points is $\Omega(\log^{d-1} n)$.

Proof. We may assume that v0 is a vertex lying in exactly d facets. It follows that v0 is adjacent to exactly d other vertices, which we assume are v:, ve,... ,va. Again we prefer a "square corner". Let (*) denote the unique affine transformation satisfying v_* = dv_ with (') as in (4.1). Again combinatorial properties of the convex hull are preserved. The unit hypercube [0, 1]d is contained by the simplex v_v_.., v_ and therefore also by P*. We now bound the probability that X1 and X_ = _ are vertices of Xn and Xn respectively. The hyperplane _-_l_ (1 - Cy) n >_exp(-nCy/(1 - Cy)) for 0 _

G(y) = vol{x e [0vol, lldP*[II(z) _

Applying (4.2) again, the corresponding density function is

logd-l(X/y) Cloga-l(1/y) g(Y) = (d- 1)I(volP*) = d d-1 With a to be determined, it follows that

Pr{X1 E vert Zn} _> /o1Pr{X1 is a vertex I H(X_) = y}. pr{n(x ) = y} dy _ /o1I(y)g(y)ey

JO

(daC-__I)(logn- log(a/C)) a-1 fa/CnJO exp(-nC, y/(1-a/n))dy

-> (g-G_-l-1)loga- 1n(1-O\logn]()(loga_ 1 -olCn/n'_ ] (l_exP(1..__/n))-a

>-- (l°gnd-d lan-) 1 ((1l°g-a_)(l_O \ a/nlog)(X_n ] e-,_)

Setting a -- log log n gives

Pr{X1E vertXn} >_ l°gndaa-_in1 0 (l°ga-2nl°gl°gl°gn n) Since the points of Xn are i.i.d.,

EVn >_ l°ga-lda- xn O(log a-2 n log log log n) []

Corollary 4.6 The expected number of vertices among n points chosen independently from the uniform distribution on a d-dimensional hypercube is at least

$$\frac{2^d \log^{d-1} n}{d^{d-1}} - O(\log^{d-2} n \log\log\log n).$$

Proof. It is convenient to consider the hypercube $[0, 2]^d$. We proceed as in the preceding proof, but omit the affine transformation. The expected number of vertices in $[0, 1]^d$ is as before. The total expected number of vertices is at least $2^d$ times as large, since $[0, 2]^d$ contains $2^d$ such unit hypercubes, each playing a symmetric role. []

Devroye [24] computed the upper bound $(2^d \log^{d-1} n / (d-1)!) + O(\log^{d-2} n)$ for the case of the hypercube. Asymptotically, these two bounds differ by a factor of

$$\frac{d^{d-1}}{(d-1)!} = \frac{d^d}{d!} \sim \frac{e^d}{\sqrt{2\pi d}}.$$

4.3 Upper Bounds on Facets

In this section we improve upon Corollary 4.4 for the class of simple polytopes. It is a reasonable conjecture that the following bound applies to all polytopes, but it is apparently difficult to prove it using techniques that rely on orthogonal corners.

Theorem 4.7 Let $X_n = \{X_1, X_2, \ldots, X_n\}$ be a set of n points chosen randomly and independently from a uniform distribution on the interior of a simple d-polytope P. Then $EF_n$, the expected number of facets of the convex hull of $X_n$, is $O(\log^{d-1} n)$.

Proof. Without loss of generality we may assume that volP = 1. Let v0, vl, ... ,vm-1 be the m vertices of P. Let V_ be the volume of the simplex formed by vi and the d adjacent vertices. Let H(w,c) = { x I(w,x) < c} be a halfspace perpendicular to w. Let _ be the family of halfspaces { H(w,c) I (w, vi) = mino

)4 e _ ¢_ )4' e _'= { g(w,c) l (w,v_) = O<_min5

Furthermore, if )4' E 7d then _(' = g(w,c) for some w e [0, c_) d, for otherwise there would exist an i _< d for which (w, v_/ = w (i) < 0 = (w, v_). The converse also holds since v'i C [0,oo) for 0

$$\begin{aligned}
&\Pr\{X_1, X_2, \ldots, X_d \text{ define an empty halfspace} \mid X_1, X_2, \ldots, X_d \text{ define a halfspace of } \mathcal{H}_0\} \\
&\quad = \Pr\{X'_1, X'_2, \ldots, X'_d \text{ define an empty halfspace} \mid X'_1, X'_2, \ldots, X'_d \text{ define a halfspace of } \mathcal{H}'_0\} \\
&\quad = \int_{R^d} \cdots \int_{R^d} (1 - \Gamma)^{n-d}\, g(x_1) \cdots g(x_d)\, dx_1 \cdots dx_d
\end{aligned} \tag{4.3}$$

where

$$g(x) = \begin{cases} V_0\, d! & \text{if } x \in P', \\ 0 & \text{otherwise.} \end{cases}$$

Let p be the projection of the origin onto the hyperplane formed by $x_1, x_2, \ldots, x_d$. Then the halfspace in question is $\mathcal{H}_p = H(p, |p|^2)$, which contains the origin and is bounded by $\partial\mathcal{H}_p$, the hyperplane perpendicular to the vector p and passing through the point p. We will make the change of variables of §3.3, defining a system of spherical coordinates for p, etc. We have already established that p must lie in the first orthant. If A(p) is the (d - 1)-volume of the cross section of P′ on the hyperplane $\partial\mathcal{H}_p$, then $\hat{g}(p) = A(p)\, d!\, V_0$, and (4.3) equals

.... II d (1-rl' .imp addO1...dO _x. (4.4/ _0 , JO _,JO l

We now dissect the domain of integration, at first considering the region for which at least one of the angles $\theta_i$ is very small ($\theta_i \le n^{-d}$) or very large ($\theta_i \ge (\pi/2) - n^{-d}$). Let $A_{max}$ be the (d - 1)-volume of the largest intersection of P′ with any hyperplane. Since $\mathrm{esimp} \le A \le A_{max}$, also $c_i \le 1$, and $(1 - \Gamma)^{n'} \le 1$, the value of the integral is bounded by some constant times the volume of the region, which is

$$O(n^{-d}). \tag{4.5}$$

Applying the inequalities $(1 - x)^n \le e^{-nx}$ and $\mathrm{esimp} \le A$ to (4.4) gives the upper bound

.... H C_-_i exp(-n'r)A d+xdrdO_... dOd_l (4.6) Jn -d Jn -d JO l<_i 1 for i E K, and r/qi_ 1 foriEK I. Then

(diam p')trd-k rd-k > vol(Np n P') > (d - k)! l-Iiel(, qi - - d! rIieK, qi'

45 since )_vrq P_ contains the d-simplex defined by v_, the points v 'i for i E K and the points (r/qi)v_ for i E K"_and is contained in the product of the k-cube [0, diam P'] k and the (d- k)-simplex formed by v_ and the points (r/qi)v i for i E Kt. Also

vol(_pn p') > _r voh_x(o_pn p_), since )_vrq P_ contains the pyramid with base 0)4v rq P_ and apex v_. Thus

go rd- k r = V0dv_ol(_ n P') > - r' Hi_K, qi and Vodld(diam Pl)kra-k-X = Ar" A = Vodlvold_l(O)4vrq P') <_ (d- k)!I-IieK, q_ Substituting F I and A _for F and A in (4.6) and then d! d(diam p,)k dt d! d(diam P')kt t = nIPI; A Idr = A I = n'(d- k)!(d- k); n'r(d- k)I and neglecting the rather cumbersome constants gives the upper bound

o a,Jr* -a ,....dnr-a JOr (\l<_in

in_ a in_ a H Ci H qi dO1...dOd-1 .. ,_. i l

- • Jr,-d (....81C 1 / (8d--led-1 ) d-1 d-1 = O(n-d)" (r'"Jn-d sinOcosO = O(n -a) • (log tan _- - log tann -a = O(n-dlog a-1 n). (4.8)

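If $\mathcal{H}_p$ contains all of the vertices $v'_1, \ldots, v'_d$, then $\Gamma \ge V_0$ and (4.6) is

$$O(e^{-n' V_0}) = O(n^{-d}). \tag{4.9}$$

Summing (4.5), (4.8) for $K \subset \{1, 2, \ldots, d\}$, and (4.9), and arguing similarly for all $0 \le i < m$, gives

$$\Pr\{X_1, X_2, \ldots, X_d \text{ define an empty halfspace of } \mathcal{H}\} = O(n^{-d} \log^{d-1} n),$$

since $\Pr\{X_1, X_2, \ldots, X_d \text{ define a halfspace of } \mathcal{H}_i\}$ is independent of n. Since the probability is the same for every d-subset of $X_n$,

$$EF_n = \binom{n}{d} \cdot O(n^{-d} \log^{d-1} n) = O(\log^{d-1} n).$$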

[]

Chapter 5

Convex Hulls of Samples from Spherically Symmetric Distributions

This chapter is devoted to extending to higher dimensions Carnal's very general results on convex hulls of samples from circularly symmetric distributions in the plane [18]. A density function h on $R^d$ is spherically symmetric if h(x) = h(y) whenever |x| = |y|. We will consider two types of distribution with infinite support, and one type with bounded support. For both types with infinite support, $EV_n$ and $EF_n$ are so small that any polynomial-time algorithm can solve the facet-enumeration or facial-lattice problem in O(n) time on average if the vertex-enumeration algorithm of §3.2 is used for preprocessing. For the distributions with bounded support, we find that $EF_n = O(EV_n) = o(n)$, and that both the gift-wrapping and the shelling algorithms with preprocessing have $o(n^2)$ average running times; the exact times depend on the parameters of the distribution.

Much of the generality of Carnal's results comes from his attention to the class of so-called slowly varying functions. Such functions are $o(n^{\alpha})$ for all positive $\alpha$; perhaps the most obvious example is the logarithm function. Necessary details about slowly varying functions are included in §A.4 of the appendix. If we follow Carnal in defining $F(x) = \Pr\{|X| > x\}$, we can summarize the results of this chapter in the following three theorems. The first theorem deals with a class of distributions having algebraic tails.

Theorem 5.1 For distributions satisfying $F(x) = x^{-k} L(x)$ with k > 0 and L(x) slowly varying, $EV_n = O(1)$ and $EF_n = O(1)$. The convex hull can be constructed in O(n) time on average by preceding any polynomial-time convex-hull algorithm with the vertex-enumeration algorithm of §3.2.

The second theorem deals with distributions that we say (somewhat loosely) have exponential tails.

Theorem 5.2 For distributions satisfying $x = L(1/F(x))$ with L(x) slowly varying and satisfying the smoothness conditions stated in (5.13), (5.14), and (5.15),

BY. =o \ nL,(n) and BE. - 0 \ _L'(_)

(This implies that $EV_n$ and $EF_n$ are slowly varying.) The convex hull can be constructed in O(n) time on average by preceding any polynomial-time convex-hull algorithm with the vertex-enumeration algorithm of §3.2. As a special case, if the distribution has the form $F(x) \sim c x^m \exp(-x^k)$ for some positive c and k, then

$$EV_n = O(\log^{(d-1)/2} n) \quad \text{and} \quad EF_n = O(\log^{(d-1)/2} n).$$

The d-dimensional normal distribution is an example of the special case with m = d- 1 and k = 2. The third theorem deals with distributions with truncated tails.

Theorem 5.3 For distributions in the unit d-ball satisfying $F(1 - x) \sim c x^k$ for positive k,

$$EV_n = O(n^{(d-1)/(2k+d-1)}) \quad \text{and} \quad EF_n = O(n^{(d-1)/(2k+d-1)}).$$

The average running time of the vertex-enumeration algorithm is

$$\begin{cases} O(n) & \text{if } k > (d-1)/2; \\ O(n \log n) & \text{if } k = (d-1)/2; \\ O(n^{2(d-1)/(2k+d-1)}) & \text{if } k < (d-1)/2. \end{cases} \tag{5.1}$$

If the shelling algorithm is used with the vertex-enumeration algorithm of §3.2 for preprocessing, its average running time is given by (5.1); otherwise, it is $O(n^2)$. If the gift-wrapping algorithm is used with the vertex-enumeration algorithm for preprocessing, its average running time is given by (5.1); otherwise, it is $O(n^{1+(d-1)/(2k+d-1)})$.
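The case analysis in (5.1) is mechanical once the exponent $p = (d-1)/(2k+d-1)$ of $EV_n = O(n^p)$ is known, since the vertex-enumeration bound is O(n) for p < 1/2, O(n log n) for p = 1/2, and $O(n^{2p})$ otherwise. A small Python helper makes the regimes explicit; the function names are hypothetical, and exact rational arithmetic is used so that the boundary case k = (d - 1)/2 is detected reliably.

```python
from fractions import Fraction

def vertex_exponent(d, k):
    """Exponent p such that EV_n = O(n^p) in Theorem 5.3."""
    return Fraction(d - 1) / (2 * Fraction(k) + d - 1)

def vertex_enumeration_time(d, k):
    """Average running time of the vertex-enumeration algorithm per (5.1)."""
    p = vertex_exponent(d, k)
    half = Fraction(1, 2)
    if p < half:          # equivalent to k > (d - 1)/2
        return "O(n)"
    if p == half:         # equivalent to k = (d - 1)/2
        return "O(n log n)"
    return "O(n^(" + str(2 * p) + "))"
```

For example, the uniform distribution in the unit ball has $F(1-x) = 1 - (1-x)^d \sim dx$, i.e. k = 1, so the helper reports linear time for d = 2, the boundary case for d = 3, and a superlinear bound for d ≥ 4.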

As an additional demonstration of the power of our method for bounding $EV_n$, in the last section of this chapter we consider distributions uniform over the Cartesian product of balls of various dimensions. It is apparently difficult to apply direct methods to bound $EF_n$; however, we can still show that $EV_n$ differs from $EV_n$ for points from the component ball of largest dimension by at most a logarithmic factor.

5.1 A General Framework for Spherical Distributions

As before, let $X_n = \{X_1, X_2, \ldots, X_n\}$ be a set of n i.i.d. points in $R^d$, this time drawn from some spherically symmetric distribution. Following Carnal, let us define

$$F(x) = \Pr\{|X| > x\};$$

$$G(x) = \Pr\{X^{(1)} \ge x\},$$

i.e., F(x) is the probability that a random point lies outside a sphere of radius x centered at the origin, and G(x) is the probability that it lies beyond a fixed hyperplane at distance x from the origin. It is immediate that, for positive x,

F(x) = xdrd-Xh(r) dr, (5.2) aCx) - L _(z/y) IdFCy)l. (5.3) Since the Xi are i.i.d. EVn - n P(x) [dF(x)l , where P(x) = Pr( X1 e vert Xn I I]X1] = x}, and the integral represents the unconditional probability that X1 is a convex-hull vertex. To bound P(x), we establish an orthonormal coordinate system with origin at X1 and with the center of symmetry of the distribution on the negatiw., x(1)- axis. This coordinate system partitions R d into 2d orthants. If each of these orthants contains at least one of the points X2, Xs,..., Xn, then X1 lies inside the convex hull of Xn and is surely not a vertex. This observation leads to an upper bound: since the probability content of 2 d-1 ,of the orthants is 2-(d-1)G(x) and that of the other 2d-1 is 2-(4-1)(1 - G(x)), it follows that

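$$\begin{aligned}
P(x) &\le 2^{d'} (1 - 2^{-d'} G(x))^{n-1} + 2^{d'} (1 - 2^{-d'} (1 - G(x)))^{n-1} \\
&< 2^{d'} (1 - 2^{-d'} G(x))^{n-1} + O(c^{-n}) \quad \text{for some } c > 1,
\end{aligned}$$

where d′ = d - 1. On the other hand, if none of the points $X_2, X_3, \ldots, X_n$ lies in the halfspace $x^{(1)} > 0$, then $X_1$ is surely a vertex. This yields the lower bound

$$P(x) \ge (1 - G(x))^{n-1}.$$

With n′ written for n - 1, it follows that

$$EV_n \le n 2^{d'} \int_0^{\infty} (1 - 2^{-d'} G(x))^{n'}\, |dF(x)| + o(1) \tag{5.4}$$

$$\phantom{EV_n} \le n 2^{d'} \int_0^{\infty} \exp(-n' 2^{-d'} G(x))\, |dF(x)| + o(1) \tag{5.5}$$

and

$$EV_n \ge n \int_0^{\infty} (1 - G(x))^{n'}\, |dF(x)|. \tag{5.6}$$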

Let us now consider EFn and (3.3) and extend Efron's method [39]. By spherical symmetry, _imp(p)= _.imp(-p); _(p,.) = _(-p,.); _(p)= _(-p).

The only terms of (3.3) involving the 0i are the ci terms. The values of the integrals in tgi are well-known [7, f620];

• '' ClC2 "''C2d-1 1 = -- -- 0

So

EF, .._ nd.d (O(p)) d esimp(p)'l(p, n) dr.

In the framework of the preceding discussion,

_(p,.) = r"-_+ (1- r)--_ = G(r) "-d + (1 - G(r)) "-_

._ 0(2-") + exp(-nG(r)) and 9(p) = -G'(r), so

EF. ~ n_.d esimp(r)(-a'(r)) dexp(-nG(r)) dr. (5.S)

Informally, (-G'(r)) d is the probability that the first d points lie on the hyperplane p defines, exp(-nG(r)) estimates the probability that the other n-d points all lie on the side of the hyperplane containing the origin, and esimp(r) is the expected volume of the simplex formed by the first d points if they lie on the hyperplane. The following lemma will be useful in bounding esimp(r) above.

Lemma 5.4

esimPCx) <_ (1 + o(1)) drd--lG'tCd-1Cx) /z °o (u2 -- X2) d-2 uhCu) du

51 Proof. For the sake of concreteness, we will assume that the d points lie on the plane x (1) = X. Let /_ be the distance from the ith point to the point (x, O,0,..., 0). Then the simplex formed by the d points is contained in a ball of radius maxl

rg-1 l

7d-1 l

l

= drd-1 f0 _ r d-1 (tCd-lrd-2)h(x/r2--G'+(x)x2) dr ---- drd--lG'tCd(_)-1 // r2d-3h(_/r 2 + x 2) dr

= drd--lG'tCd-1(_) fx °° (_ - _)d-_h(_) d_. []

5.2 Distributions with Algebraic Tails

In this section we prove Theorem 5.1, dealing with distributions having the form

$$F(x) = x^{-k} L(x), \quad k > 0, \tag{5.9}$$

where L(x) varies slowly at infinity. Following Carnal's calculation of G(x) in the two-dimensional case, we dissect the integral of (5.3) at Ax and apply integration by parts in the lower interval to obtain

a(x)= - ,_(_/y)dF(y)- ,_(1/A)F(_)+ F(y)e,_(_/y).

For a fixed $\delta > 0$, A can be chosen so large (without considering x) that $1/2 - \delta \le \kappa(1/A) \le \kappa(x/y) \le 1/2$ over the range of the first integral. Then the sum of the first two terms is

-F(_)(1 - 8+ 8)/2+ F(A_)(1- 8+ 8)/_.- F(A_)(1- 8+ 6)/2= + 8F(A_).

52 Thus

G(x) -- F(y) dlc(x/y) -4-5F(Ax)

_. y-k L(y) &c(x/y) -4--5F(Ax)

= _if" y-_LCy)___(__ (_/y)_)C_-_)dC_/_/y)+8FCA_) = •_ y-_-_LCy)_-_(1- (_/y)_)c_-_)dy±/8_FCA_). If x is large enough, then L(y) -- (1 i 5)L(x) and

a(x) = (1±8)_L(_)_ y-k-,(1_ (_/y),)(d-_ld/,y+8F(A_)

_ d /A" u(k-1)/2(1 -- u)(d-3)/2 du ± 5F(Ax); G(x) F(x). 2_d \ 2 ' 2 ] (5.10)

Substituting w = 2-d'G(x) into (5.5) and applying Lemma A.3 gives

EVn < (l+o(1))n2 d' e-n'_d S w - fo 2-'_ [2[._dtd-1cd ( (k+l2 ' d-l))2 -1 ]

,._ 22d-1 t__:dd_ 1 (n( ]¢_12 ' d-l))2 -1

A similar calculation with (5.6) shows that

EVn >- (1+o(1))2_---__dd-1 (B(k+l 2 ' d-l))2 -1 ; the two bounds differ by a factor of 4d- 1. We turn now to EFn and (5.8). It is immediate from (5.9) and (5.10) that v'(_)~ -k_-_G(_).

We will use Lemma 5.4 to bound esimp(x); we must first compute h(x). Since by (5.2) and (5.9)

--_dxd-lh(x)--" f'(x),,_-kx-k-Xn(x), we have

h(x) ,,., k_dlx-(k+d)L(x).

53 Considering only the integral of Lemma 5.4, we have

_C_ - _)d-_hC_)d_

.._ kL(x)__d_oo(u 2 -- x2)d-2ul-k-d du

: kL(x)2_:d xd_k_ 2 fnl(1 _ y)d-2y(k-d)/2 dy where y : (x/u) 2

Thus

esimp(x) <_ (1 + o(1)) rd_ldx d-1.

Substituting into (5.8) with w = G(r), we have

EF,_ <_ (1 + O(1))ndpddrd_l /'k+l d-1 xd-l(kwx-1)d-le-nwG'(r) dr \_, 2

"_ B (k+12 ' d-1)2 I_drd-lkd-ld!"

5.3 Distributions with Exponential Tails

In this section we prove Theorem 5.2 bounding $EV_n$ and $EF_n$ for distributions of the form $x = L(1/F(x))$ where L varies slowly. Since F(0) = 1 and $F(\infty) = 0$, clearly L(1) = 0 and $L(\infty) = \infty$. Following Carnal, we apply Lemma A.2 to express F(x). Setting s = 1/F(x), we have

x -- L(s) - exp (_ " e(t)-t dt) . (5.11)

We define v(u) -- e(L -1 (u)) and note that since dx/ds - xe(s)/s, also ds/s - dx/v(x)x. If _ =: n(a) and a -- 1/F (_), then

F(_) = exp(logF(_)) = exp(-log(a)) - exp ------exp - x_(x) " (5.12)

At this point, we follow Carnal in imposing the following technical smoothness conditions on v and thus indirectly on L:

v(x) is monotone (decreasing) for large x; (5.13) • .¢(_). log(_(_)=) o(1)_ _-, _¢; (5.14) _(_).log(_)=o(1)_ _-__. (5.15)

These conditions imply that $\epsilon$ is slowly varying. We can now generalize Carnal's calculation of $G(x)$ by substituting $s = L^{-1}(x) = 1/F(x)$ and $\sigma = L^{-1}(y) = 1/F(y)$ into (5.3):

\[ G(x) = \int_s^\infty S\!\left( \frac{L(s)}{L(\sigma)} \right) \frac{d\sigma}{\sigma^2} = \int_s^{As} + \int_{As}^\infty. \]

Since $L$ is monotonic and increasing, $L(s)/L(As) \le L(s)/L(\sigma) \le 1$ over the range of the first integral. Since $L$ varies slowly, $L(s)/L(As) \to 1$ as $x$ and $s$ approach $\infty$. Successively applying (A.3), (5.11), the slow variation of $\epsilon$, and the identity $1 - e^{-u} \sim u$ as $u \to 0$ gives

(LL(s(a) ) ]"_ _ _dIc_12(dCd-d1--)121) ( 1 Ln((8a)) ) (d-l)/2

_d_12(a-1)/2 ( ( fa e(t) dt) ) (d-1)/2

~ na_12(d-1)/2 _d(d--1) (l--exp(--e(s)l°g_)) (d-1)/2 _d_12(d-1)/2 (d-1)/2 "_ led(d--X)(e(s)l°ga) Thus

fs As -- I'_d-i(2_(.,,_d(d- (1)d-l,/2 fsAs( log 8if) (d-1,/2 -'_ado" "_ _d_1(2_(8,,_d(d- 1(d)s-l,/2 fl A (logt)(d_l)/2 dtt2" In the range of the second integral, L(s)/L(a) > exp(-e(s)log(a/s)) >_ (1 - e(s)log(a/s)), and

co co ted-12(d-1)/2 (d-l)/2 da tC,d_l(2e(s))(d-1)/2 (logt)(d_l)/2 dt L,.q < L_ _,(d N 1) (_(_)log(_/_)) _-z~ _,(d 1)_ i_ 7_" Since A can be made arbitrarily large,

G(x) .._ tCd-l(2e(s))(d-1)/2 (logt)(d_l)/2 dt (5.16) tcd(d - 1)s -t 2

"_ ICd-l(2V(X))(d-_d(d-- 1)1)/2 r (_) F(_) -- _d1(27rv(x))Cd-1)/2FCx). (5.17)

Carnal's argument showing that $v(x) \sim \epsilon(1/G(x))$ is easily generalized to $d$ dimensions. Suppose that $G(x) = F(x_1)$, or equivalently that $x_1 = L(1/G(x))$. Then by (5.17)

logaF((_)z) ~ log(2_C_))c___d 1)~/2--V-d- 1 log(Cv(x)) for some constant C. But also by (5.12)

F(x) F(x) fz dt x- Xl log G(x) - log r(xl) = Jz , --_tv(t) zv(x) " Together, these two lines imply that ' d-1 x, - x ~ --_ log(ev(x))xv(x),

and further, by Taylor's theorem, that d-1 vCxl) - vCx) ~ 2 l°gCcvCx))xvCx)¢(x)

Finally, we have that v(x) ,,_ e(1/G(x)) as desired,sinceby (5.14), dl/a(_)) - _(_) _(_1)- _(_) _(_) = _(_) ~ logCO_(_))_'(_)=o(1). Thus, F(x) ,,, _d(2_re(1/G(x)))-Cd-1)/2G(x).

Substituting into (5.5) with w - 2-d°G(x) and applying Lemma A.3 gives

Ev. < -.2 d + o(1) ___-n_S(2,_) _'/' _-"'_d[dl/(2_'_,))-d'/'w]+o(1) ~ ,_d2_'(2_)d'/2_(2-d',_)-d'/2 .., _d(8_r)(a-1)/2e(n)-(a-1)/2

Similarly EVn _> (1 + o(1))_d(27r)(a-1)/2e(n)-(a-1)/2;

the two bounds differ by a factor of $2^{d-1}$.

In the important case $F(x) \sim c x^m \exp(-x^k)$ for $k > 0$, we have $EV_n = O(\log^{(d-1)/2} n)$. Distributions which drop off more quickly have larger values of $EV_n$; for example, if $F(x) \sim \exp(-\exp(x^k))$, then $EV_n = \Theta((\log n \log\log n)^{(d-1)/2})$; if $F(x) \sim \exp(-\exp(\exp(x^k)))$, then

\[ EV_n = \Theta((\log n \log\log n \log\log\log n)^{(d-1)/2}). \]
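As a worked illustration of where these growth rates come from, consider the simplified tail $F(x) = \exp(-x^k)$. Dropping the polynomial factor $c x^m$ is an assumption made here only to keep the algebra short, and $\epsilon$ is read off from (5.11) as $\epsilon(s) = sL'(s)/L(s)$. A sketch of the calculation:

```latex
\begin{align*}
s = 1/F(x) = \exp(x^k) \;&\Longrightarrow\; x = L(s) = (\log s)^{1/k}, \\
\epsilon(s) = \frac{s\,L'(s)}{L(s)} &= \frac{1}{k \log s}, \\
\epsilon(n)^{-(d-1)/2} &= (k \log n)^{(d-1)/2}.
\end{align*}
```

The same steps applied to $F(x) = \exp(-\exp(x^k))$ give $L(s) = (\log\log s)^{1/k}$ and $\epsilon(s) = 1/(k \log s \log\log s)$, which accounts for the extra $\log\log n$ factor.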

Turning now to the expected number of facets, we restrict our attention to the most interesting case, that of

\[ h(x) = c\,|x|^m \exp(-|x|^k/k). \tag{5.18} \]

The following lemmata will be useful.

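Lemma 5.5 If $a > -1$, then

\[ \int_x^\infty r^a \exp(-r^b/c)\, dr = b^{-1} c\, x^{a-b+1} \exp(-x^b/c) \sum_{j \ge 0} x^{-bj} c^j j! \binom{(a-b+1)/b}{j}. \]

Proof. Substituting $y = r^b/c$ and then applying (A.1) gives

\[
\begin{aligned}
\int_x^\infty r^a \exp(-r^b/c)\, dr
 &= \frac{c}{b} \int_{x^b/c}^\infty (cy)^{(a-b+1)/b} e^{-y}\, dy \\
 &= b^{-1} c\, x^{a-b+1} \exp(-x^b/c) \sum_{j \ge 0} (-1)^j x^{-bj} c^j \left( \frac{b-a-1}{b} \right)_j \\
 &= b^{-1} c\, x^{a-b+1} \exp(-x^b/c) \sum_{j \ge 0} x^{-bj} c^j j! \binom{(a-b+1)/b}{j}.
\end{aligned}
\]

[]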

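Lemma 5.6

\[
\sum_{i=0}^{d} \binom{d}{i} (-1)^i i^k =
\begin{cases}
0 & \text{if } d > k \\
(-1)^d d! & \text{if } d = k.
\end{cases}
\]

Proof. By induction on $k$. The basis case, $k = 0$, is easily verified. For $k > 0$ and $d \ge k$, we have

\[
\begin{aligned}
\sum_i \binom{d}{i} (-1)^i i^k
 &= \sum_i d \binom{d-1}{i-1} (-1)^i i^{k-1} \\
 &= -d \sum_i \binom{d-1}{i} (-1)^i (i+1)^{k-1} \\
 &= -d \sum_{j=0}^{k-1} \binom{k-1}{j} \sum_i \binom{d-1}{i} (-1)^i i^j \\
 &= \begin{cases} 0 & \text{if } d > k \\ (-1)^d d! & \text{if } d = k. \end{cases}
\end{aligned}
\]

[]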
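The identity of Lemma 5.6, read here as a statement about the alternating sum $\sum_i \binom{d}{i}(-1)^i i^k$, is easy to check by brute force for small $d$ and $k$. The following is a minimal sketch; the function name is illustrative and not from the thesis, and the convention $0^0 = 1$ (which Python also uses) is assumed for the basis case.

```python
from math import comb, factorial


def alternating_power_sum(d, k):
    """Return sum_{i=0}^{d} C(d, i) * (-1)^i * i^k using exact integers."""
    return sum(comb(d, i) * (-1) ** i * i ** k for i in range(d + 1))


# Compare against the two cases of the lemma for all 0 <= k <= d <= 7.
for d in range(8):
    for k in range(d + 1):
        expected = (-1) ** d * factorial(d) if d == k else 0
        assert alternating_power_sum(d, k) == expected
```

Because all arithmetic is on Python integers, the comparison is exact rather than approximate.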

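For the special case defined by (5.18), Lemma 5.5 and (5.2) imply that

\[ F(x) = \int_x^\infty \kappa_d r^{d-1} h(r)\, dr \sim c \kappa_d x^{m+d-k} \exp(-x^k/k). \]

So

\[ x \sim L\!\left( (c\kappa_d)^{-1} x^{k-m-d} \exp(x^k/k) \right); \tag{5.19} \]

\[ G(x) \sim c\, (2\pi v(x))^{(d-1)/2} x^{m+d-k} \exp(-x^k/k). \]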

Also

v(x) = e(L-l(x))= L-I(x)L'(L-I(x))/L(L-1Cx))

"" xk-m-a-l exp(xk/k) L' ( xk-m-d Cexp(xk/k)Xd )cxd -k the last transformation following from differentiation of (5.19) with respect to x by the chain rule:

dL((cnd)-lxk-m-dexp(xk/k))dx ,_L,(Xk-m-dexp(x\ k/k))(x2k-"mC--_d-lexp(d xk/k)" ) C_d _ dxd-x := 1. Thus finally

\[ G(x) \sim c\,(2\pi)^{(d-1)/2} x^{m+d-k-(d-1)(k/2)} \exp(-x^k/k); \]

\[ G'(x) = G(x)\, \frac{d(\log G(x))}{dx} \sim -x^{k-1} G(x) \sim -c\,(2\pi)^{(d-1)/2} x^{m+(d-1)(1-k/2)} \exp(-x^k/k). \]

We must now bound esimp(x).

Lemma 5.7

esimPCx) < (1 + o(1)) #d-tdIx/-1d)(2(d-_r(d2d-1) ).(d-t)/2 x(t-k/2)(d-t)

Proof. We apply Lemma 5.4. Expanding h(u) and dealing only with the integral of the lemma, we have

oo C( tt2 __ X2) d-2tt m+l exp(-uk/k) du

58 =cz(,

,>_o ; (- expC-*"/k)x'd-2+'-*_*i>o-i_'kij! (2;+ _.+3 k)/k

The latter binomial coefficient is a polynomial of degree j in i that can be written as

(2/k)i;i + _ (ai,/kJ)i_. 0

(5.20) -- C(--1) d-2 exp(-xk//k)x 2d-2+m-k

By Lemma 5.6, the smallest value of j for which the bracketed quantity does not vanish is d- 2; then it equals (-2)a-2(d- 2)! and (5.21) ~ _2d-2((d-2)!)2exp(--_k/k)_(2d-2+_-k-(e-2k)) = _2d-2((a_2)!)2exp(-_k/k)_+(2-k)(d-0

Thus

cdrd-lt_d-1 2 d- esimp(x) <_ a'(x) 2((d- 2)!)2 exp(--xk/k)xm+(2-k)(d-1)

"_ _,d__2(d- d,1) fd( _(d-2d 1) ) (d-1)/_xCl-kl2)Cd-1)" []

With G(r) = w we have G'(r) = dw/dr = -wr k-1. Substituting into (5.8) gives

r(1-k/2)(d-1)(wrk-1)d-le -nw dw. (5.22) EFn < (l + o,,l,,n,,d, Pd lZ"_(a_ld---_)d'V/d( _r(d2-1)d )(d-1,/2;1/2J0

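Since $\log w \sim -r^k/k$, we have $r \sim (k \log(1/w))^{1/k}$. Considering the integral of (5.22) alone and applying Lemma A.3 gives

\[
\begin{aligned}
\int_0^{1/2} r^{(1-k/2)(d-1)} (w r^{k-1})^{d-1} e^{-nw}\, dw
 &= \int_0^{1/2} r^{(k/2)(d-1)} w^{d-1} e^{-nw}\, dw \\
 &\sim k^{(d-1)/2} \int_0^{1/2} (\log(1/w))^{(d-1)/2} w^{d-1} e^{-nw}\, dw \\
 &\sim \frac{k^{(d-1)/2}}{d} \int_0^{1/2} e^{-nw}\, d\!\left[ (\log(1/w))^{(d-1)/2} w^d \right] \\
 &\sim k^{(d-1)/2} (d-1)!\, n^{-d} \log^{(d-1)/2} n.
\end{aligned}
\]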

Substituting back into (5.22) gives

EFn < (1-t-o(1)) t_2(d#dd--l(d')1)v/-d2 (2rkdlognCd- )1) (d-1)/2

~ v_d(d-2)_\( 87r7kd: log n ) (d-1)/2

5.4 Distributions with Truncated Tails

We turn now to the proof of Theorem 5.3. In two dimensions Carnal considers distributions on the unit ball satisfying

\[ F(1 - x) \sim x^k L(1/x) \qquad (x \to 0,\ k \ge 0). \]

For simplicity we restrict our attention to the form

\[ F(1 - x) \sim c x^k \qquad (x \to 0,\ k > 0). \]

For example, the uniform distribution in the unit ball has $F(1 - x) = 1 - (1 - x)^d \sim d \cdot x$. Applying a well-known Beta-integral identity [7, f609], we have

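\[
\begin{aligned}
G(1 - x) &= \int_{1-x}^{1} S((1 - x)/r)\, ck (1 - r)^{k-1}\, dr \\
 &\sim \frac{\kappa_{d-1} 2^{(d-1)/2} ck}{\kappa_d (d-1)} \int_{1-x}^{1} \left( \frac{r - (1 - x)}{r} \right)^{(d-1)/2} (1 - r)^{k-1}\, dr \\
 &\sim \frac{\kappa_{d-1} 2^{(d-1)/2} ck}{\kappa_d (d-1)} \int_{1-x}^{1} (r - (1 - x))^{(d-1)/2} (1 - r)^{k-1}\, dr \\
 &= \frac{\kappa_{d-1} 2^{(d-1)/2} ck}{\kappa_d (d-1)}\, B\!\left( k, \tfrac{d+1}{2} \right) x^{k+(d-1)/2} = A x^{k+(d-1)/2},
\end{aligned}
\tag{5.23}
\]

where the second line follows from the first and (A.3) because the argument of $S$ is $1 - O(x)$ as $x \to 0$.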

Substituting

t : n2-d'G(1 -- x) : n2-d'Axk+(d-1)/2;

( 2d't _ 2/(2k+d-1) " x = knA] 2d dt 2t-(a-x)(2k+a-_) dt Z k-1 dz : Anx(d_l)/2 -- (2d_lAn)2k/(2k+d_l) into (5.5), we have EVn <_ n2 d' £ exp(-nt2 -d'G(z)) IdF(z)l + o(1) ~ n2d'/oexp(-' n2-d'a(1 - z))ckzk-ldz

"_ ck2an(d-1)/(2k+a-1) f,q2 d' e-tt -(d-1)lC2k+d-1) dt (2d-lA) 2k/C2k+d-1) JO

._ (2k+d-1) ) r (ckn) (d-1)/(2k+d-1) 2(l+(a-k-1)(d-1) ( _d-lB((kd,- (1)_d . d 1)/2) )2k/(2k+d-1) ( 2k . 2kd - 1) Similarly, by using (5.6), we obtain

EVn > (1+o(1)) 2(1- (2k+a-') / F (ckn) (d-1)/(2k+d-1) - 2k(d-2), ((d__l)d-lB(t_ k, (d + d1)/2) )2k/(2k+d-1) ( 2k +2kd- 1) " The upper and lower bounds differ by a factor of

(k+a-1)(d-a)-2_ 2 2k+d-x .

For fixed $d$, this factor decreases from $2^{d-1}$ to $2^{(d-3)/2}$ as $k$ goes from $0$ to $\infty$. Turning to the number of facets and writing $w$ for $G(1 - r)$, we have

Gt(1--r) ,-_ A(2k.d- 2 1)rk-l+(d-l'/2,,_ (2k.d-1) 2 r- 1w;

esimp(1--r) <_ Pd-l(1 _ (1 _ r)2) (d-1)/2 <_tZd_l(2r)(d-1)/2; r ,,_ (w/A) 2/C2k+d-1) and so, by (5.8), with w : G(1 - r), then t: nw, EF.. ~ nd._ /:esimp(l-r)(-a'(l-_))dexp(-na(l-_))d. .

,,., rtdl.t d e_nW(2r)(d_l)/2 2k + d- 1 d-1 r-(d-1)w d-l dw :.'1o/2 ( 2 )

= nd#d2(d-1)/2 (2k_d- 2 1)d-1;1/2eJ-OnWr-(d-1)/2wd-ldw

.._ nd#d2(d_l)/2 (2k -_-2d- l )d-IA(d_l)/(Zk+d_l) ;JIo/2e_nww(d_l)(2k+d_2)/(2k+d_l ) dw

"_ (An)(d-1)/(2k+d-1)#d2(d-1)/2 (2k_d- 1) d-1/n/d2Oe-tt(d-1)(2k+d-2)/(2k+d-1)dt2

"_ #d 2 r (d- 2k + d-11 ) (d- 1)gd

5.5 Uniform Distributions in Products of Balls

Let $f_d$ be the density function of the uniform distribution on the $d$-dimensional unit ball. Let

\[ x = (x^{(1,1)}, x^{(1,2)}, \ldots, x^{(1,d_1)};\ x^{(2,1)}, x^{(2,2)}, \ldots, x^{(2,d_2)};\ \ldots;\ x^{(k,1)}, x^{(k,2)}, \ldots, x^{(k,d_k)}) \]

be a point in $R^d$ for $d = \sum_{i=1}^{k} d_i$ and $d_1 \ge d_2 \ge \cdots \ge d_k$. We will consider product densities $f$ of the form

\[ f(x) = f_{d_1}(x^{(1,1)}, x^{(1,2)}, \ldots, x^{(1,d_1)}) \cdot f_{d_2}(x^{(2,1)}, x^{(2,2)}, \ldots, x^{(2,d_2)}) \cdots f_{d_k}(x^{(k,1)}, x^{(k,2)}, \ldots, x^{(k,d_k)}). \]

Such a density is uniform over a convex body that is the Cartesian product of unit balls of various dimensions. For such densities we will prove the following theorem.

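Theorem 5.8 Let $f$ be a density with the properties listed above. Let $X_n = \{X_1, X_2, \ldots, X_n\}$ be a set of $n$ i.i.d. points chosen according to $f$. Let $m$ be the largest $i$ for which $d_i = d_1$. Then $EV_n$, the expected number of vertices of the convex hull of $X_n$, is

\[ O\!\left( n^{(d_1-1)/(d_1+1)} \log^{m-1} n \right). \]

Proof. Since the $X_i$ are i.i.d., $EV_n = n \Pr\{X_1 \in \mathrm{vert}\, X_n\}$. For $1 \le i \le k$, let $r_i(x) = \left( \sum_{j=1}^{d_i} (x^{(i,j)})^2 \right)^{1/2}$ and $q_i(x) = 1 - r_i(x)$, and let

\[ p(y_1, y_2, \ldots, y_k) = \Pr\Big\{ X_1 \in \mathrm{vert}\, X_n \;\Big|\; \bigwedge_{i=1}^{k} q_i(X_1) = y_i \Big\}. \]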

Without loss of generality we may assume that $X_1^{(i,1)} = r_i(X_1)$ and $X_1^{(i,j)} = 0$ for $j \ge 2$. If $X_1$ is a vertex, then at least one of the $2^d$ orthants defined by the $d$ hyperplanes $x^{(i,j)} = X_1^{(i,j)}$ must be free of points from $X_n$. The smallest of these orthants has probability content of at least

k k k a_/22 m i=1 i=1 i=1 where #d,(') is the function /_(.) of (A.4) subscripted to show dimension. The probability that a fixed orthant is empty is (1 - F) n-1. The probability that at least one orthant is empty is at most

2 d times as large, thus m writing nt for (n - 1)

p(yl,y,.,...,yk) _ 2d(1- r)"' _ 2dexp -2-dn'II y_e,+l_/_. ( i=1' ) Now let Q(yl,y2,... ,yk) = Pr{AL1q_(xl) _ y_}.Then

k k e(yl,y2,... ,yk) = II Pr{qi(Xl) <_yi} = II(1 - (1 - y_)d,), i: 1 i= I and the corresponding density function is

q(Yl,Y2,-..,Yk) = IT dd(1- yi) d'-I <= d i (1 -- yi) d'-I . i: 1 i: m+ 1

The expected number of vertices is

EVn : n. Pr{X1 e vertXn} <__ n f0...1fP0(Yl1,Y2,...,Yk) q(Yl,Y2,...,Yk)dyl'"dyk. We first evaluate the integral in that part of the domain of integration satisfying _ < Yi <: 1 for 2 < i < m with c_ = n-2/(dl+1). (This is the entire domain of integration unless m > 2.) Call this integral I1. Then

i=1 i=m+l (d-m) (,_-1)

Substituting

t= 2-dnlHyi 2 , i:1

dyl = (2-an ') zT-+-r y_. d_+l t_z72_- 1 dt

yields

/1 <__n2 d di (2-dn') I_TTi"__ oo e-tt -_1_+d -I'1 dt 1 ly, • -- o "= I:o 1 (i= m+ 1 :o ,.:1..) The integral in t is a Gamma integral, and the integrals in ym+l through Yk are Beta integrals; these contribute only constant factors. Therefore,

I1 = O(n (al-1)/(al+x) logm-1 rt).

I2, the integral over the remainder of the original domain, can be bounded by replacing the integrand with 1, since p. q < 1. Since the new integrand is symmetric in Yz through Ym, the value of the integral over the remainder of its domain is at most (m - 1) times its value over the region satisfying Yz < a, thus

12 "< (TTI-- 1)n_O1'''_O1_O°t_1 d,l...d,k-- 0 (_.(d'-l)/(dl+l)) . Summing/1 +/2 gives the result. []

Chapter 6

Voronoi Diagrams of Samples from a Hypersphere

The Voronoi diagram is a natural and intuitively appealing structure. First conceived by the mathematician Voronoi [96], it has been reinvented by researchers in several fields; in particular, meteorologists associate the two-dimensional version with the name Thiessen [93], and physicists honor Wigner and Seitz [99] for the three-dimensional version. It has been used by geologists, foresters, agriculturalists, medical researchers, geographers, crystallographers, and astronomers. Within the domain of the mathematical sciences, it is applied to simulate differential equations by finite element methods, to interpolate surfaces in geometric modeling systems, and to solve geometric problems such as finding Euclidean minimum spanning trees and largest empty circles. (Avis & Bhattacharya [2] present an extensive list of references for applications.)

The Voronoi diagram of a set of points, called sites, is a partition of $R^d$ that assigns a surrounding polytope of "nearby" points to each of the sites. More rigor is supplied by the following definition.

Definition 6.1 The (nearest-site) Voronoi diagram of the set $X_n = \{x_1, x_2, \ldots, x_n\}$ of $n$ sites in $R^d$ is the set of $n$ convex regions $\mathcal{V}_i = \{\, x \mid \forall j : \mathrm{dist}(x, x_i) \le \mathrm{dist}(x, x_j) \,\}$ for $1 \le i \le n$.

Each region is a $d$-polytope containing the points lying nearer to the site in its interior than to any other site. The straight-line dual of the Voronoi diagram in the plane is called the Delaunay triangulation. In the planar case, sites $x_i$ and $x_j$ are joined by an edge in the Delaunay triangulation if and only if $\mathcal{V}_i$ and $\mathcal{V}_j$ share an edge. In $d$ dimensions, sites $x_{i_0}, x_{i_1}, \ldots, x_{i_k}$ define a $k$-face of the Voronoi dual if and only if $\mathcal{V}_{i_0}, \mathcal{V}_{i_1}, \ldots, \mathcal{V}_{i_k}$ share a $(d - k)$-face in the Voronoi diagram. If, as is assumed in the sequel, no $d + 2$ sites fall on the same hypersphere, the dual partitions the convex hull of $X_n$ into $d$-simplices. The Voronoi diagram can be constructed easily from its dual and vice versa. The dual has several interesting properties.

Lemma 6.2 If sites $x_{i_0}, x_{i_1}, \ldots, x_{i_d}$ are vertices of a simplex in the Voronoi dual, the hypersphere passing through these $d + 1$ sites contains no other sites.

Proof. The center of the hypersphere is equidistant from the $d + 1$ sites and is on the boundary of the Voronoi region of each of them. Thus it cannot lie nearer to another site than to $x_{i_0}, x_{i_1}, \ldots, x_{i_d}$. []

We call such a hypersphere empty or site-free.

Corollary 6.3 If sites $x_{i_0}, x_{i_1}, \ldots, x_{i_k}$ are vertices of a face of some Voronoi dual simplex, there is some hypersphere containing no sites which passes through these sites.

Corollary 6.4 Every convex-hull face is a face of some Voronoi dual simplex.

Proof. The convex-hull face is contained in some hyperplane that defines an empty halfspace. This halfspace is a degenerate site-free hypersphere. []

One may also define a furthest-site Voronoi diagram.

Definition 6.5 The furthest-site Voronoi diagram of the set $X_n = \{x_1, x_2, \ldots, x_n\}$ of $n$ sites in $R^d$ is the set of $n$ convex regions $\mathcal{V}_i = \{\, x \mid \forall j : \mathrm{dist}(x, x_i) \ge \mathrm{dist}(x, x_j) \,\}$ for $1 \le i \le n$.

Region $\mathcal{V}_i$ contains the points lying further from site $x_i$ than from any other site. Only sites that are vertices of the convex hull of $X_n$ have non-empty furthest-site Voronoi regions. The dual of the furthest-site Voronoi diagram also partitions the convex hull of $X_n$ into simplices; the vertices of these simplices determine hyperspheres that each contain all the sites of $X_n$.

Many researchers have considered Voronoi-diagram construction. Previous work on the two-dimensional case is surveyed in Chapter 7. Brostow et al. [14], Finney [42], and Tanemura et al. [91] have addressed the construction of three-dimensional diagrams; none of these authors presents a detailed analysis of running time. Bowyer [11] and Watson [97] describe algorithms for higher dimensions, but neither analyzes his algorithm rigorously. Bowyer argues heuristically that his algorithm requires $O(n^{1+1/d})$ time on average for points uniform in a $d$-dimensional hypercube, and cites some empirical evidence to support the claim. Watson claims $O(n^{2-1/d})$ time in the worst case. Avis and Bhattacharya's algorithms for the Voronoi diagram and its dual [2] rely heavily on the simplex method for linear programming; since this method has exponential worst-case running time, they focus mainly on experimental studies of their algorithms' performance and of the expected complexity of the diagrams for points distributed uniformly in the unit hypercube.

A pleasing connection between nearest- and furthest-site Voronoi diagrams in $d$ dimensions and convex hulls in $d + 1$ dimensions allows any convex-hull algorithm to be used to construct Voronoi diagrams. This mapping, first observed by Brown [13], is restated here in a form due to Guibas & Stolfi [49]. If $a$ is a $d$-vector and $b$ is a scalar, let $a * b$ be the $(d + 1)$-vector $(a^{(1)}, a^{(2)}, \ldots, a^{(d)}, b)$, and let $\lambda : R^d \to R^{d+1}$ be the "lifting function" defined by $\lambda(x) = x * \langle x, x \rangle$, where $\langle x, y \rangle$ denotes the inner product. The range of $\lambda$ is the surface of a $(d + 1)$-dimensional paraboloid of revolution. The image of a $d$-sphere is the intersection of a hyperplane with the paraboloid and vice versa. To verify this, we note that $x$ lies within the $d$-sphere centered at $p$ with radius $r$ if and only if $\langle x - p, x - p \rangle < r^2$ and apply the bilinearity of the inner product:

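\[
\begin{aligned}
\langle x - p,\ x - p \rangle < r^2
 &\iff \langle (-2p) * 1,\ x * \langle x, x \rangle \rangle < r^2 - \langle p, p \rangle \\
 &\iff \langle (-2p) * 1,\ \lambda(x) \rangle < r^2 - \langle p, p \rangle.
\end{aligned}
\]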

It is clear from the last line that $\lambda(x)$ lies not only on the paraboloid but also in some halfspace $\langle a, x \rangle < b$ in $R^{d+1}$. Now suppose that the function $\lambda$ is applied to the points of $X_n$ and the convex hull of their images is constructed. If points $x_1$ through $x_{d+1}$ form a nearest-site dual simplex, the $d$-ball they define is empty, and so, too, is the corresponding halfspace in $R^{d+1}$; thus its bounding hyperplane is a facet of the convex hull. If $x_1$ through $x_{d+1}$ form a furthest-site dual simplex, their $d$-ball contains all of $X_n$; then the complementary halfspace is empty and again the bounding hyperplane is a convex-hull facet. Conversely, a convex-hull facet always corresponds to either a nearest-site or a furthest-site dual simplex.

Constructing a $(d + 1)$-dimensional convex hull is a viable approach to the problem of constructing a $d$-dimensional Voronoi diagram. The gift-wrapping algorithm [8,19,90] may be used in $\Theta(n(S_n + S_n^*))$ time. Or Seidel's shelling algorithm [81] may be used in $O(n^2 + (S_n + S_n^*) \log n)$ time, where $S_n$ is the number of nearest-site simplices and $S_n^*$ the number of furthest-site simplices in the result. In fact, it is not difficult to modify either algorithm to eliminate the $S_n^*$ ($S_n$) term if only the nearest-site (furthest-site) diagram is required.

Like the number of facets in the case of convex hulls, $S_n$ and $S_n^*$ can vary wildly. Seidel [80,83] has shown that both $S_n$ and $S_n^*$ can be extremely large, $O(n^{\lfloor (d+1)/2 \rfloor})$, in the worst case. On the other hand, it is not difficult to construct families of problem instances for which $S_n = O(n)$. Thus probabilistic estimates of the average value of the two quantities are useful.

Meijering [69] and Gilbert [47] have considered the Voronoi diagram of sites from a Poisson process of fixed intensity in $R^d$; Meijering showed that the expected number of nearest-site Voronoi neighbors of a site depends only on $d$; in particular, it is 6 for $d = 2$ and $\approx 15.54$ for $d = 3$. Such a set of sites may be thought of as an infinite set of sites drawn from a uniform distribution over all of $R^d$. In computational practice, however, one must deal with finite sets of sites drawn from a particular distribution, e.g., a uniform distribution on the interior of some convex body like a hypercube or hypersphere. Two sites are neighbors if and only if they lie on the surface of some ball that contains no other site. In the Poisson case, a pair of distant neighbors is always unlikely since it implies the existence of a large empty ball. In the case of a bounded set of sites, it can still be shown that sites far from the boundary of the body probably have only nearby neighbors, but some long edges always occur near the boundary of the body, where most of the empty ball may lie outside the support of the distribution. Thus results dealing only with the Poisson case are insufficient for the average-case analysis of algorithms.

In the next section, we present a new method for determining $ES_n$ and $ES_n^*$. In §6.2 this method is applied to the analysis of the asymptotic behavior of $ES_n$ for sites drawn independently from the uniform distribution in the unit $d$-ball. In §6.3 we use a similar method to show that the Voronoi diagram of such sets of sites can be constructed in linear expected time by a variation of the gift-wrapping algorithm using standard bucketing techniques.
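The lifting map described earlier in this chapter can be sketched in a few lines. The code below is an illustration only: the function names (`star`, `lift`, `inside_ball`, `below_hyperplane`) are hypothetical, and integer coordinates are assumed so that both tests are evaluated exactly. It checks that a point lies inside a ball exactly when its lifted image lies in the corresponding open halfspace.

```python
import random


def inner(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def star(a, b):
    """The (d+1)-vector obtained by appending the scalar b to the d-vector a."""
    return tuple(a) + (b,)


def lift(x):
    """Lifting function: x is mapped to x * <x, x> on the paraboloid."""
    return star(x, inner(x, x))


def inside_ball(x, p, r_squared):
    """Direct test: <x - p, x - p> < r^2."""
    diff = [xi - pc for xi, pc in zip(x, p)]
    return inner(diff, diff) < r_squared


def below_hyperplane(x, p, r_squared):
    """Lifted test: <(-2p) * 1, lift(x)> < r^2 - <p, p>."""
    normal = star([-2 * c for c in p], 1)
    return inner(normal, lift(x)) < r_squared - inner(p, p)


# Exhaustive-style spot check on random integer points in three dimensions.
random.seed(0)
for _ in range(1000):
    x = [random.randint(-20, 20) for _ in range(3)]
    p = [random.randint(-20, 20) for _ in range(3)]
    r_squared = random.randint(0, 900)
    assert inside_ball(x, p, r_squared) == below_hyperplane(x, p, r_squared)
```

With floating-point coordinates the two tests can disagree for points within rounding error of the sphere, which is why the sketch sticks to integers.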

6.1 A General Method for Bounding the Expected Complexity of Voronoi Diagrams

In this section we describe a general method for bounding $ES_n$ and $ES_n^*$, the expected number of simplices in the duals of nearest- and furthest-site Voronoi diagrams of random point sets. The first $d + 1$ points $x_1, \ldots, x_{d+1}$ define a $d$-simplex with probability one. Let us first reckon the probability $P_n$ that they also define a simplex in the dual of the nearest-site Voronoi diagram. This is just the probability that the other $n - d - 1$ points lie outside the hypersphere passing through the $d + 1$ points. Writing $g(\cdot)$ for the density function of the $x_i$ and $\Gamma$ for the probability content of the interior of the hypersphere, we see that this probability is

\[ P_n = \int_{R^d} \cdots \int_{R^d} (1 - \Gamma)^{n-d-1} g(x_1) \cdots g(x_{d+1})\, dx_1 \cdots dx_{d+1} \]

and that the expected number of simplices is therefore

\[ ES_n = \binom{n}{d+1} P_n = \binom{n}{d+1} \int_{R^d} \cdots \int_{R^d} (1 - \Gamma)^{n-d-1} g(x_1) \cdots g(x_{d+1})\, dx_1 \cdots dx_{d+1}. \]

We next carry out a transformation of coordinates. The $d + 1$ points $x_1, x_2, \ldots, x_{d+1}$ can be expressed in terms of a $d$-vector $p$ representing the center of the sphere they define, a scalar $r$ representing the radius of that sphere, and $d - 1$ angles $\phi_{i1}, \phi_{i2}, \ldots, \phi_{i,d-1}$ for each $x_i$. Let $y_i = x_i - p$. Then the two systems of coordinates are related by the following equations:

\[
\begin{aligned}
x_i^{(1)} &= p^{(1)} + y_i^{(1)} = p^{(1)} + r\, c_{i,d-1} c_{i,d-2} \cdots c_{i3} c_{i2} c_{i1} \\
x_i^{(2)} &= p^{(2)} + y_i^{(2)} = p^{(2)} + r\, c_{i,d-1} c_{i,d-2} \cdots c_{i3} c_{i2} s_{i1} \\
x_i^{(3)} &= p^{(3)} + y_i^{(3)} = p^{(3)} + r\, c_{i,d-1} c_{i,d-2} \cdots c_{i3} s_{i2} \\
 &\;\;\vdots \\
x_i^{(d)} &= p^{(d)} + y_i^{(d)} = p^{(d)} + r\, s_{i,d-1},
\end{aligned}
\]

where $c_{ij}$ and $s_{ij}$ represent $\cos \phi_{ij}$ and $\sin \phi_{ij}$ respectively. In three dimensions, the Jacobian of this transformation, expressed in tabular form, is

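\[
\begin{array}{c|cccccccccccc}
 & p^{(1)} & p^{(2)} & p^{(3)} & r & \phi_{11} & \phi_{12} & \phi_{21} & \phi_{22} & \phi_{31} & \phi_{32} & \phi_{41} & \phi_{42} \\ \hline
x_1^{(1)} & 1 & 0 & 0 & y_1^{(1)}/r & t_{11} y_1^{(1)} & t_{12} y_1^{(1)} & 0 & 0 & 0 & 0 & 0 & 0 \\
x_2^{(1)} & 1 & 0 & 0 & y_2^{(1)}/r & 0 & 0 & t_{21} y_2^{(1)} & t_{22} y_2^{(1)} & 0 & 0 & 0 & 0 \\
x_3^{(1)} & 1 & 0 & 0 & y_3^{(1)}/r & 0 & 0 & 0 & 0 & t_{31} y_3^{(1)} & t_{32} y_3^{(1)} & 0 & 0 \\
x_4^{(1)} & 1 & 0 & 0 & y_4^{(1)}/r & 0 & 0 & 0 & 0 & 0 & 0 & t_{41} y_4^{(1)} & t_{42} y_4^{(1)} \\
x_1^{(2)} & 0 & 1 & 0 & y_1^{(2)}/r & k_{11} y_1^{(2)} & t_{12} y_1^{(2)} & 0 & 0 & 0 & 0 & 0 & 0 \\
x_1^{(3)} & 0 & 0 & 1 & y_1^{(3)}/r & 0 & k_{12} y_1^{(3)} & 0 & 0 & 0 & 0 & 0 & 0 \\
x_2^{(2)} & 0 & 1 & 0 & y_2^{(2)}/r & 0 & 0 & k_{21} y_2^{(2)} & t_{22} y_2^{(2)} & 0 & 0 & 0 & 0 \\
x_2^{(3)} & 0 & 0 & 1 & y_2^{(3)}/r & 0 & 0 & 0 & k_{22} y_2^{(3)} & 0 & 0 & 0 & 0 \\
x_3^{(2)} & 0 & 1 & 0 & y_3^{(2)}/r & 0 & 0 & 0 & 0 & k_{31} y_3^{(2)} & t_{32} y_3^{(2)} & 0 & 0 \\
x_3^{(3)} & 0 & 0 & 1 & y_3^{(3)}/r & 0 & 0 & 0 & 0 & 0 & k_{32} y_3^{(3)} & 0 & 0 \\
x_4^{(2)} & 0 & 1 & 0 & y_4^{(2)}/r & 0 & 0 & 0 & 0 & 0 & 0 & k_{41} y_4^{(2)} & t_{42} y_4^{(2)} \\
x_4^{(3)} & 0 & 0 & 1 & y_4^{(3)}/r & 0 & 0 & 0 & 0 & 0 & 0 & 0 & k_{42} y_4^{(3)}
\end{array}
\]

where $t_{ij} = -\tan \phi_{ij}$ and $k_{ij} = \cot \phi_{ij}$. The generalization to higher dimensions is straightforward.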

69 , (j). (1), (j) If the row for x_1), denoted by pl1), is replaced by ai = 2-,s

(1). (1) (2). C1) (3). (1) Yi /Yi Yi /Yi Yi /Yi r/y_ 1) 0 0 0 0 0 0 0 0 and the matrix is in quasi-triangular form• The determinant of the entire matrix is the product of the determinant of one (d + 1) × (d q- 1) matrix and d q- 1 similar (d- 1) x (d- 1) matrices. The (d- 1) × (d- 1) matrices are themselves upper triangular; the determinant of each is easily seen to be

r d-1 Cid,d-1_lCi,d_d-2 2 • • • el1l = ___1) r d-2 Cid,-2d_lCi,d_d-3 2 • • • el12.

If the factors of (1/y_ 1)) are removed from the rows of the (d q- 1) × (d q- 1) matrix, and then the factor of r is removed from the last column, the remaining determinant is

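or just $d!$ times the volume of the simplex formed by $x_1, x_2, \ldots, x_{d+1}$. Thus the determinant of the original Jacobian matrix is

\[ d!\, \mathrm{simp}(x_1, x_2, \ldots, x_{d+1})\, r^{(d+1)(d-2)+1} \prod_{i=1}^{d+1} c_{i,d-1}^{d-2} c_{i,d-2}^{d-3} \cdots c_{i2}, \]

and

\[ P_n = d! \int_{R^d} \int_0^\infty r^{-d} (1 - \Gamma)^{n-d-1} \Big[ \underbrace{\int \cdots \int}_{(d+1)(d-1)} h(x_1) \cdots h(x_{d+1})\, \mathrm{simp}(x_1, \ldots, x_{d+1})\, d\phi_{1,1} \cdots d\phi_{d+1,d-1} \Big]\, dr\, dp \tag{6.1} \]

with

\[ h(x_i) = r^{d-1} c_{i,d-1}^{d-2} c_{i,d-2}^{d-3} \cdots c_{i2}\, g(x_i). \]

If we define

\[ \hat{g}(r, p) = \underbrace{\int \cdots \int}_{(d-1)} h(x_1)\, d\phi_{1,1} \cdots d\phi_{1,d-1}, \]

then, since $x_1$ through $x_{d+1}$ are i.i.d.,

\[ \underbrace{\int \cdots \int}_{(d+1)(d-1)} h(x_1) \cdots h(x_{d+1})\, d\phi_{1,1} \cdots d\phi_{d+1,d-1} = (\hat{g}(r, p))^{d+1}, \]

and the bracketed quantity of (6.1) is easily seen to be

\[ (\hat{g}(r, p))^{d+1}\, E\big( \mathrm{simp}(x_1, x_2, \ldots, x_{d+1}) \mid \|x_i - p\| = r \text{ for } 1 \le i \le d + 1 \big) = (\hat{g}(r, p))^{d+1}\, \mathrm{esimp}(r, p) \]

and

\[ P_n = d! \int_{R^d} \int_0^\infty r^{-d} (1 - \Gamma)^{n-d-1} (\hat{g}(r, p))^{d+1}\, \mathrm{esimp}(r, p)\, dr\, dp. \]

It is clear that $\hat{g}$, $\Gamma$, and $\mathrm{esimp}$ depend only on $r$ and $\|p\|$ if $g$ is spherically symmetric. To exploit this symmetry, we express $p$ in generalized spherical coordinates $(q, \theta_1, \theta_2, \ldots, \theta_{d-1})$ defined by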

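\[
\begin{aligned}
\|p\| &= q \\
p^{(1)} &= q\, c_{d-1} c_{d-2} \cdots c_3 c_2 c_1 \\
p^{(2)} &= q\, c_{d-1} c_{d-2} \cdots c_3 c_2 s_1 \\
p^{(3)} &= q\, c_{d-1} c_{d-2} \cdots c_3 s_2 \\
 &\;\;\vdots \\
p^{(d)} &= q\, s_{d-1},
\end{aligned}
\]

where $c_i$ and $s_i$ represent $\cos \theta_i$ and $\sin \theta_i$. The Jacobian of this transformation is well known to be $q^{d-1} c_{d-1}^{d-2} c_{d-2}^{d-3} \cdots c_2$ [56, p. 17], thus by application of a well-known definite-integral identity [7, f620] with $\mu_d = (2\pi^{d/2})/(d\,\Gamma(d/2))$ being the volume of the unit $d$-ball, $P_n$ is

\[
\begin{aligned}
& d! \Big( \int \cdots \int c_{d-1}^{d-2} c_{d-2}^{d-3} \cdots c_2\, d\theta_1 \cdots d\theta_{d-1} \Big) \Big( \int_0^\infty \int_0^\infty q^{d-1} r^{-d} (1 - \Gamma)^{n-d-1} \hat{g}^{d+1}\, \mathrm{esimp}\, dr\, dq \Big) \\
& \qquad \sim d!\, d\mu_d \int_0^\infty \int_0^\infty q^{d-1} r^{-d} \hat{g}^{d+1}\, \mathrm{esimp}\, \exp(-n\Gamma)\, dr\, dq,
\end{aligned}
\]

and, since $\binom{n}{d+1} \sim n^{d+1}/(d+1)!$,

\[ ES_n \sim \frac{d\mu_d n^{d+1}}{d+1} \int_0^\infty \int_0^\infty q^{d-1} r^{-d} \hat{g}^{d+1}\, \mathrm{esimp}\, \exp(-n\Gamma)\, dr\, dq = \frac{d\mu_d n^{d+1}}{d+1} \int_0^\infty \int_0^\infty I(q, r)\, dr\, dq. \tag{6.2} \]

Similarly,

\[ ES_n^* \sim \frac{d\mu_d n^{d+1}}{d+1} \int_0^\infty \int_0^\infty q^{d-1} r^{-d} \hat{g}^{d+1}\, \mathrm{esimp}\, \exp(-n(1 - \Gamma))\, dr\, dq. \]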

6.2 Bounds for the Uniform Distribution in a d-Ball

In this section we turn to the uniform distribution in the unit d-ball in particular and prove the following theorem.

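Theorem 6.6 Let $X_n = \{X_1, X_2, \ldots, X_n\}$ be a set of $n$ sites drawn independently from the uniform distribution on the interior of the unit $d$-ball. Then $ES_n$, the expected number of simplices of the dual of the Voronoi diagram of $X_n$, is $O(n)$.

Proof. We have $g(x) = 1/\mu_d$ when $|x| \le 1$ and $g(x) = 0$ otherwise. Let $\mathcal{U}$ denote the unit $d$-ball, $B$ the ball defined by the points $x_1$ through $x_{d+1}$, and $\partial B$ the surface of $B$. Then

\[ \Gamma = \frac{\mathrm{vol}(B \cap \mathcal{U})}{\mu_d}; \qquad \hat{g} = \frac{\mathrm{vol}_{d-1}(\partial B \cap \mathcal{U})}{\mu_d}; \qquad \mathrm{esimp} \le \mathrm{vol}(\mathrm{conv}(\partial B \cap \mathcal{U})). \]

We now divide the domain of integration of the integral of (6.2) into eight regions corresponding to the possible patterns of intersection of the two balls $B$ and $\mathcal{U}$.

Case 1: $q \le 1$ and $0 \le r \le 1 - q$. In this case $B$ lies entirely inside $\mathcal{U}$, and

\[ \hat{g} = \frac{\kappa_d r^{d-1}}{\mu_d} = d r^{d-1}; \qquad \Gamma = \frac{\mu_d r^d}{\mu_d} = r^d. \]

Also

\[ \mathrm{esimp} = v_d r^d. \]

Thus

\[
\begin{aligned}
\int_0^1 \int_0^{1-q} I(q, r)\, dr\, dq
 &= d^{d+1} v_d \int_0^1 \int_0^{1-q} q^{d-1} r^{d^2-1} \exp(-n r^d)\, dr\, dq \\
 &= d^{d+1} v_d \int_0^1 q^{d-1} \left( \int_0^{n(1-q)^d} \frac{1}{d} \left( \frac{t}{n} \right)^{d} e^{-t}\, \frac{dt}{t} \right) dq \\
 &\sim d^{d-2}\, d!\, v_d\, n^{-d}.
\end{aligned}
\]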
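The Case 1 estimate reduces to a Gamma integral under the substitution $t = n r^d$. The steps are written out below as a sketch; replacing the upper limit $n(1-q)^d$ by $\infty$ is the asymptotic approximation for fixed $q < 1$, and the factor $1/d$ in the last line is $\int_0^1 q^{d-1}\, dq$.

```latex
\begin{align*}
t = n r^d \;&\Longrightarrow\; r = (t/n)^{1/d}, \qquad
  dr = \frac{1}{d} \Big( \frac{t}{n} \Big)^{1/d} \frac{dt}{t}, \\
\int_0^{1-q} r^{d^2-1} e^{-n r^d}\, dr
  &= \frac{1}{d\, n^{d}} \int_0^{n(1-q)^d} t^{d-1} e^{-t}\, dt
  \;\sim\; \frac{(d-1)!}{d\, n^{d}}, \\
d^{d+1} v_d \cdot \frac{1}{d} \cdot \frac{(d-1)!}{d\, n^{d}}
  &= d^{d-2}\, d!\, v_d\, n^{-d}.
\end{align*}
```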

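Case 2: $q < 1$ and $1 - q < r < \sqrt{1 - q^2}$. In this case at least half of $B$ and $\partial B$ lie inside $\mathcal{U}$, and

\[ \hat{g} \le \frac{\kappa_d r^{d-1}}{\mu_d} = d r^{d-1}; \qquad \Gamma \ge \frac{\mu_d r^d}{2\mu_d} = \frac{r^d}{2}; \qquad \mathrm{esimp} \le \mu_d r^d. \]

Figure 6.1: Case 2: $q < 1$ and $1 - q < r < \sqrt{1 - q^2}$.

By Tricomi's formula $\int_x^\infty t^{-a} e^{-t}\, dt \sim x^{-a} e^{-x}$ for the incomplete gamma function [95, §4.3],

\[
\begin{aligned}
\int_0^1 \int_{1-q}^{\sqrt{1-q^2}} I(q, r)\, dr\, dq
 &= O(1) \int_0^1 \int_{1-q}^{\sqrt{1-q^2}} q^{d-1} r^{d^2-1} \exp(-n r^d/2)\, dr\, dq \\
 &= O(n^{-d}) \int_0^1 q^{d-1} \int_{n(1-q)^d/2}^{\infty} t^{d} e^{-t}\, \frac{dt}{t}\, dq \qquad (6.3) \\
 &= O(n^{-d}) \int_0^1 q^{d-1} (n(1-q)^d)^{d-1} \exp(-n(1-q)^d/2)\, dq \\
 &= O(n^{-d}) \int_0^{n/2} n^{-1/d} u^{d-2+(1/d)} e^{-u}\, du \\
 &= O(n^{-d-(1/d)}).
\end{aligned}
\]

Figure 6.2: Case 3: $q \le 1$ and $\sqrt{1 - q^2} \le r \le 1$.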

Case 3: q <_ 1 and X/1- q_ _< r _< 1. Referring to Figure 6.2, we have immediately from geometric considerations that (r - x) _

for $x$, it is easy to verify that

\[ (r - x) = \frac{(q + 1 - r)(r + 1 - q)}{2q}. \]

Since $r \ge \sqrt{1 - q^2} \ge 1 - q$, it follows that

\[ r \le (r + 1 - q) \le 2r, \qquad q \le (q + 1 - r) \le 2q; \]

thus $(r - x) = \Theta(r)$ and also $h = \Theta(r)$. Now

- _ = e(1)r a-x 1 - = e(rd-1); #d

F > (r + 1 - q)t_d_l hd-I -- e(rd); - d#a esimp < (r- X)#d_lh d-1 = O(rd), and

I(q, r) dr dq = O(1) qd-lr-dr(d+l)(d-1)/2r d exp(--nr a) dr dq

n d

= o

Figure 6.3: Case 4: q < 1 and r > 1.

Case 4: $q \le 1$ and $r > 1$. In this case

< ,_a/_a=d,

F > 2vol(L/n{xIx(1)> 1/2})- fl(1), esimp < I_d, and /ol/I(q1,r)drdq < qa-lr-ada+l.aexp(-a(n))drdq= O(e-a(")). Case 5: q > 1 and 0 _

Figure 6.4: Case 6: q > 1 and q - 1 < r < q.

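Case 6: $q > 1$ and $q - 1 < r < q$. Setting $w = 1 - q + r$, we have

\[ x = \frac{r^2 + q^2 - 1}{2q} = r - \frac{(2 - w) w}{2q} = r - \Theta(w/q); \]

\[ h = \frac{1}{2q} \sqrt{(1 + q + r)(1 + q - r)(1 - q + r)(q + r - 1)} = \frac{1}{2q} \sqrt{(2q + w)(2 - w)\, w\, (r + q - 1)} = \Theta\!\left( \sqrt{wr/q} \right) \]

since

\[ 0 < w < 1; \qquad 2q \le (2q + w) < 3q; \qquad r < (r + q - 1) \le 2r. \]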

Thus

r : O(whd-1); : 0 (w(d+l)/2(r/q)(d-i)/2); es_mp <_ i_d_ihd-l(r -- x) = o(r/q);

: _drd-i_cx/r) : ocrd-l(1 -- (x/lr)) (d-l)/2) = OCCwr/q) cd-1)/2) = OCt/w). Pd

Now

I(q,r) drdq = 0(1) e-nr drdq flOo fqq-1 /OO fqq-1\(qd-l,"_ ] (__)d+l (__)

-- 0(1) /OOfoi q-2w-I ()d-_rq pd+2e-nr dw dq

-- O(X) fll°° folq-2w-lh-2drd+2e -rip dwdq

= O(X)/°°folq-2wh-2rde-nr dwdq.

Substituting t =nP; dt = O(t/w) dw gives

I(q,r) drdq = O(1) q-2 e-t--dq (6.4) fxf/-1 /lf: (h) t - = O(n -d) fi/o"q-lwr-ltd-le-t dtdq.

Since the F-region contains a ball of radius w/2, I' = a(w d) and w = O(rUd) = O((t/n)Ud).. Also r 1/2 >_ w 1/2 and r 1/2 >_ (q- 1)1/2, so _-1 =o (1.(t/_)l/'_.(q- _)-_/2), and

fl°° fqq_lI(q,r)drdq - O(n-d-(1/2d)) (fl°°q-l(q--1)-l/2 dq) (fontd-l-(1/2d)e-t dt) = O(n-d-(X/2d))B(1/2,1/2)r(d - 1 - (1/2d))

= O(n-d-Cll2d)).

Figure 6.5: Case 7: $q > 1$ and $q \le r \le q + 1$.

Case 7: q> 1 and q <_r <_q + l.

<__d/#d = d; P = _(1); esimp _ Ud_12d-lx = Ud_12d-l(r -- X/r 2 -- 1) -- O(r-1).

Thus, since r-1 _

fxOO¢/qq+l I(q,r) drdq = [1co .q/q+l qd-lr-dO(llO(r-1) exp(--na(1))dr dq

= O(e-n(")).

Case 8: $q > 1$ and $q + 1 \le r$. In this case $\mathcal{U} \subset B$ and $\partial B \cap \mathcal{U} = \emptyset$, thus $\hat{g} = 0$ and the integral vanishes.

Examining all eight cases, we see that Case 1 dominates, and that

\[ ES_n \sim \frac{d!\, d^{d-1} \mu_d v_d\, n}{d + 1}. \tag{6.5} \]

[]

Applying (6.5) for $d = 2$, we obtain $ES_n \sim 2n$. This is confirmed by well-known combinatorial results. We also have

\[ ES_n \sim \frac{24\pi^2}{35}\, n \approx 6.77n \quad \text{for } d = 3; \qquad ES_n \sim \frac{286}{9}\, n \approx 31.78n \quad \text{for } d = 4. \]

These values are not obviously inconsistent with the values 6.31 and 25.6 found empirically by Avis & Bhattacharya for (rather small) samples of 1000 points chosen from the unit hypercube [2, Table 1]; it is reasonable to conjecture that (6.5) in fact holds for point sets chosen from a uniform distribution on any convex body.
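The constants quoted for $d = 2$ and $d = 3$ can be reproduced numerically from (6.5), read here as $ES_n \sim d!\, d^{d-1} \mu_d v_d\, n/(d+1)$. The sketch below is illustrative only. The values of $v_d$ are assumptions brought in from outside this section: the mean area $3/(2\pi)$ of a triangle with vertices uniform on the unit circle, and the mean volume $4\pi/105$ of a tetrahedron with vertices uniform on the unit sphere. The function names are hypothetical.

```python
from math import factorial, gamma, pi


def unit_ball_volume(d):
    """mu_d = 2 pi^(d/2) / (d Gamma(d/2)), the volume of the unit d-ball."""
    return 2 * pi ** (d / 2) / (d * gamma(d / 2))


def simplices_per_site(d, v_d):
    """Leading coefficient of ES_n in (6.5): d! d^(d-1) mu_d v_d / (d + 1)."""
    return factorial(d) * d ** (d - 1) * unit_ball_volume(d) * v_d / (d + 1)


# Assumed values of v_d (not stated in this section).
V2 = 3 / (2 * pi)   # mean area of a random triangle inscribed in the unit circle
V3 = 4 * pi / 105   # mean volume of a random tetrahedron inscribed in the unit sphere

print(simplices_per_site(2, V2))  # close to 2
print(simplices_per_site(3, V3))  # close to 24 * pi**2 / 35, about 6.77
```

Under these assumed values the coefficients agree with the $2n$ and $(24\pi^2/35)\,n$ figures in the text, which is a useful consistency check on the reading of (6.5).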

6.3 A Fast Algorithm for the Unit d-Ball

It is immediate from Theorem 6.6 and the discussion of the lifting function at the beginning of the chapter that the Voronoi diagram of random points from a $d$-ball can be constructed in $O(n^2)$ time on average by either the shelling or the gift-wrapping algorithm. In this section, we describe an algorithm requiring only $O(n)$ time on average.

The algorithm we use constructs the Voronoi dual and is similar in spirit to Maus' planar algorithm [65]: it employs standard bucketing techniques, and its operation in $R^d$ corresponds to the operation of the gift-wrapping algorithm in $R^{d+1}$. It will be convenient to call the $d$-simplices of the Voronoi dual cells and the $(d - 1)$-simplices facets (since they are facets of the cells); likewise, we will call an empty $d$-sphere defined by the vertices of a cell a cell sphere and a $(d - 1)$-sphere defined by the vertices of a facet a facet sphere.

The algorithm proceeds by repeatedly finding a new cell adjacent to a known facet. Except for facets that are also facets of the ($d$-dimensional) convex hull, every facet belongs to exactly two cells. We maintain a dictionary of facets for which only one cell is known. At each step a facet is removed from the dictionary and its unknown cell (if it exists) is found by searching for the unknown $(d + 1)$st vertex (the site search). The remaining facets of the new cell are searched for in the dictionary (facet searches). Each that is found is deleted, since both of its cells are already known. Each that is not found is inserted so that its unknown cell will be searched for in some later step. The algorithm is described more formally in Figure 6.6.
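The facet-dictionary loop described above can be sketched for the planar case. This is an illustration under stated assumptions, not the algorithm as analyzed: the site search is done by brute force over all sites rather than through buckets, sites are assumed to be in general position (no three collinear, no four cocircular), and all names (`orient`, `in_circumcircle`, `find_site`, `delaunay_by_facet_search`) are hypothetical.

```python
def orient(a, b, c):
    """Twice the signed area of triangle abc; positive when counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, p):
    """True if p is strictly inside the circle through counter-clockwise a, b, c."""
    (a1, a2), (b1, b2), (c1, c2) = [(q[0] - p[0], q[1] - p[1]) for q in (a, b, c)]
    a3, b3, c3 = a1 * a1 + a2 * a2, b1 * b1 + b2 * b2, c1 * c1 + c2 * c2
    det = a1 * (b2 * c3 - b3 * c2) - a2 * (b1 * c3 - b3 * c1) + a3 * (b1 * c2 - b2 * c1)
    return det > 1e-12


def find_site(sites, a, b):
    """Brute-force site search: a site left of edge a->b with an empty circle, or None."""
    n = len(sites)
    for c in range(n):
        if c in (a, b) or orient(sites[a], sites[b], sites[c]) <= 0:
            continue
        if not any(in_circumcircle(sites[a], sites[b], sites[c], sites[p])
                   for p in range(n) if p not in (a, b, c)):
            return c
    return None


def delaunay_by_facet_search(sites):
    """Return the triangles (index triples) of the planar Voronoi dual."""
    n = len(sites)
    # Initial facet: a convex-hull edge a->b with every other site on its left.
    a = min(range(n), key=lambda i: (sites[i][1], sites[i][0]))
    b = next(i for i in range(n) if i != a and
             all(orient(sites[a], sites[i], sites[j]) > 0
                 for j in range(n) if j not in (a, i)))
    f_dict = {frozenset((a, b)): (a, b)}  # facet -> directed edge; search its left side
    cells = []
    while f_dict:
        _, (a, b) = f_dict.popitem()
        c = find_site(sites, a, b)
        if c is None:
            continue  # facet lies on the convex hull; no cell on that side
        cells.append((a, b, c))
        # Remaining facets of the new cell, directed so the unexplored side is on the left.
        for u, v in ((c, b), (a, c)):
            key = frozenset((u, v))
            if key in f_dict:
                del f_dict[key]       # both cells of this facet are now known
            else:
                f_dict[key] = (u, v)  # search for the other cell later
    return cells
```

For example, on the four sites `[(0, 0), (4, 0), (0, 4), (1, 1)]` the sketch returns three triangles, each using the interior site `(1, 1)`. Each site search here costs quadratic time, so the sketch shows only the control flow of the dictionary, not the linear expected running time.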

The facet dictionary is organized as a linear array of $n$ buckets; a random facet falls into a particular bucket with probability $1/n$. Within each bucket facets may be organized in a balanced search tree to ensure good (logarithmic) worst-case performance, but a simple linear list is sufficient to achieve a linear bound on expected time.

The pseudo-code of Figure 6.7 describes the searching function Find_site. To speed the site searches, we partition the hypercube $[-1, 1]^d$ into $2^d n/\mu_d$ hypercubic boxes of volume $\mu_d/n$ and side $(\mu_d/n)^{1/d}$ and assign each site to the bucket for the box in which it lies. Boxes lying completely outside the unit ball will always be empty. Boxes lying inside the unit ball will contain in expectation one site each. An asymptotically vanishing fraction of the boxes will intersect the boundary of the unit ball and will contain less than one site each in expectation.

    Algorithm A
      -- Box(x) is the bucket for the box containing x.
      -- f_dict is the facet dictionary and contains (facet, halfspace) pairs.
      for x ∈ X_n do Box(x) := Box(x) ∪ {x};
      Find an initial (facet, halfspace) pair (F, H) by gift-wrapping;
      Insert(F, H, f_dict);
      while f_dict ≠ ∅ do
        (F, H) := any pair from f_dict;
        new_v := Find_site(F, H);
        Delete(F, f_dict);
        if new_v ≠ nil then
          Output(F ∪ {new_v});
          for v ∈ F do
            F' := (F \ {v}) ∪ {new_v};
            H' := the halfspace defined by F' not containing v;
            if F' ∈ f_dict then Delete(F', f_dict)
            else Insert(F', H', f_dict);
    end

Figure 6.6: Algorithm A.

On a particular call to Find_site, let $F$, $H$, and $\mathcal{U}$ be as in Figure 6.6, and let $B$ be the cell ball of the unknown cell if it exists. If the cell exists, it is necessary and sufficient to examine those boxes intersecting $(B \cap H \cap \mathcal{U})$ to determine it. If it does not exist, it is necessary and sufficient to examine the boxes intersecting $(H \cap \mathcal{U})$ to determine this. The function Find_site examines exactly those boxes. The priority queue operations Insert, Find_min, and Delete_min can be implemented so that only $O(\log n)$ time is required for each [92], but a naive linked-list implementation in which each operation requires time proportional to the length of the list suffices for the purposes of our average-case analysis.
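The bucketing of sites into boxes of volume $\mu_d/n$ can be sketched as follows. The names (`unit_ball_volume`, `box_side`, `box_index`, `build_buckets`) are illustrative rather than taken from the thesis, and a Python dictionary keyed by integer box coordinates stands in for the array of buckets.

```python
from math import floor, gamma, pi


def unit_ball_volume(d):
    """mu_d = 2 pi^(d/2) / (d Gamma(d/2)), the volume of the unit d-ball."""
    return 2 * pi ** (d / 2) / (d * gamma(d / 2))


def box_side(n, d):
    """Side length (mu_d / n)^(1/d) of a box of volume mu_d / n."""
    return (unit_ball_volume(d) / n) ** (1 / d)


def box_index(x, side):
    """Integer coordinates of the box of [-1, 1]^d that contains the point x."""
    return tuple(floor((c + 1.0) / side) for c in x)


def build_buckets(sites):
    """Assign each site (by index) to the bucket of the box in which it lies."""
    n, d = len(sites), len(sites[0])
    side = box_side(n, d)
    buckets = {}
    for i, x in enumerate(sites):
        buckets.setdefault(box_index(x, side), []).append(i)
    return side, buckets
```

With this box size a box inside the unit ball holds one site in expectation, so a site search that visits a bounded expected number of boxes also examines a bounded expected number of sites.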

We must now show that all the facet and site searches can be completed in O(n) time on average.

Lemma 6.7 The facet searches can be completed in O(n) expected time.

Proof. We maintain n buckets and hash facets into buckets by exclusive-or'ing the binary representations of the indices of the input points defining the facet. Within each bucket we maintain a linked list of the facets in the bucket; each search or insertion in the bucket takes time proportional to the length of this list. It is not hard to see that this scheme hashes an equal number of the d-subsets of {1, 2, ..., n} to each bucket, but we must show that those d-subsets actually defining

function Find_site($\mathcal{F}$, $\mathcal{H}$)
    -- $\mathcal{U}$ is the unit d-ball. Site_seg_vol(x) is the volume of the intersection of
    --     $\mathcal{H} \cap \mathcal{U}$ and the d-ball defined by x and the vertices of $\mathcal{F}$.
    -- Box_seg_vol(B) = min Site_seg_vol(y) for y a corner of B.
    -- q is a priority queue of boxes ordered by Box_seg_vol.
    Insert(Box(center of facet sphere of $\mathcal{F}$), q);
    ans := nil;
    while (q ≠ ∅) ∧ (Box_seg_vol(Find_min(q)) < Site_seg_vol(ans)) do
        B := Delete_min(q);
        for x ∈ B do
            if (x ∈ $\mathcal{H}$) ∧ (Site_seg_vol(x) < Site_seg_vol(ans)) then ans := x;
        for B' adjoining B do
            if ((B' ∩ $\mathcal{H}$ ∩ $\mathcal{U}$) ≠ ∅) ∧ (B' ∉ q) then Insert(B', q);
    return ans;
end Find_site

Figure 6.7: The site-search procedure Find_site.

facets are not correlated in a way that causes them to be hashed to a small number of buckets.
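The hashing scheme of the proof, together with the insert-or-delete behaviour of the facet dictionary in Algorithm A, can be sketched as below. The names `facet_bucket` and `FacetDict` are hypothetical. The sketch also assumes, beyond what the text states, that the number of buckets is a power of two and that site indices run from 0 to n-1, so that the exclusive-or of the indices is itself a valid bucket number.

```python
from functools import reduce
from operator import xor

def facet_bucket(indices, n_buckets):
    """Bucket of a facet: the exclusive-or of the indices of its sites.

    Assumption: n_buckets is a power of two and every index lies in
    range(n_buckets), so the result also lies in range(n_buckets).
    """
    bucket = reduce(xor, indices, 0)
    assert 0 <= bucket < n_buckets
    return bucket

class FacetDict:
    """Array of buckets, each an unordered list of (facet, halfspace) pairs."""

    def __init__(self, n_buckets):
        self.n_buckets = n_buckets
        self.buckets = [[] for _ in range(n_buckets)]
        self.size = 0

    def toggle(self, facet, halfspace=None):
        """Delete the facet if it is present, otherwise insert it.

        Returns True if the facet was inserted, False if it was deleted.
        The cost is proportional to the length of the bucket's list.
        """
        key = frozenset(facet)
        chain = self.buckets[facet_bucket(key, self.n_buckets)]
        for pos, (stored, _) in enumerate(chain):
            if stored == key:
                del chain[pos]
                self.size -= 1
                return False
        chain.append((key, halfspace))
        self.size += 1
        return True
```

Because exclusive-or is commutative, the bucket does not depend on the order in which the vertices of a facet are listed, so a facet reached from two different cells always lands in the same list.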

Let $C(n,d) = \binom{n}{d}$, let $P_1, P_2, \ldots, P_{C(n,d)}$ be the d-subsets of $X_n$, and let $M_k$ be the number of facets hashed into the kth bucket. The average amount of work done for the facet searches is

$$O\Big(\sum_{k=1}^{n} E(M_k^2)\Big);$$

hence it will suffice to show that $E(M_k^2) = O(1)$ for any fixed k. Let $F_i$ represent the condition "$P_i$ is a dual facet", and let $I_i$ be the indicator variable of $F_i$, i.e., $I_i = 1$ if $P_i$ is a dual facet, otherwise $I_i = 0$. Let $B_k$ be the set of indices i for which $P_i$ hashes into the kth bucket. It is easy to verify that $|B_k| = n^{-1}C(n,d)$. Without loss of generality let us consider the bucket containing $P_1$. Then

$$\begin{aligned}
E(M_k^2) &= E\Big(\sum_{i \in B_k} I_i\Big)^2 \\
&= E M_k + \sum_{i \in B_k} \sum_{\substack{j \in B_k \\ j \ne i}} \Pr\{F_i\} \cdot \Pr\{F_j \mid F_i\} \\
&= O(1) + n^{-1} C(n,d) \Pr\{F_1\} \cdot \sum_{\substack{i \in B_k \\ i \ne 1}} \Pr\{F_i \mid F_1\}.
\end{aligned}$$

By Theorem 6.6 the expected number of facets is O(n), so $\Pr\{F_1\} = O(n/C(n,d))$. The conditional probability $\Pr\{F_i \mid F_1\}$ depends only on $|P_i \setminus P_1|$. If $|P_i \setminus P_1| = m > 0$, then there are at most $C(n-d, m-1)/C(m, m-1) = O(n^{m-1})$ ways to choose the m elements of $P_i \setminus P_1$ so that $P_1$ and $P_i$ fall into the same bucket. This holds because choosing the first m - 1 elements fixes the mth. (We must say "at most" because the mth element required by the first m - 1 sometimes belongs to $P_1$ and cannot be chosen. This always occurs when m = 1, for example.) So

$$\begin{aligned}
E(M_k^2) &= O(1) + O(1) \cdot \sum_{1 \le m \le d} O(n^{m-1}) \Pr\{F_i \mid F_1 \wedge (|P_i \setminus P_1| = m)\} \\
&= O(1) + O(1/n) \cdot \sum_{1 \le m \le d} \binom{n-d}{m} \Pr\{F_i \mid F_1 \wedge (|P_i \setminus P_1| = m)\} \\
&= O(1),
\end{aligned}$$

since the number of d-subsets (in all buckets together) satisfying $|P_i \setminus P_1| = m$ is exactly $\binom{n-d}{m}$. []

We now turn to the search for the (d+1)st site completing a cell with a known facet. We call a site search "successful" if a site is found, and "unsuccessful" if no site is found because the facet lies on the boundary of the convex hull.

Lemma 6.8 The successful site searches can be completed in O(n) expected time.

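Proof. If we define the distance between a point x and a set Y by

$$\mathrm{dist}(x, Y) = \min_{y \in Y} \mathrm{dist}(x, y),$$

all the boxes intersecting $B \cap \mathcal{H} \cap \mathcal{U}$ are completely contained by the set

$$\mathcal{A} = \{\, x \mid \mathrm{dist}(x, B \cap \mathcal{H} \cap \mathcal{U}) \le \sqrt{d}\,(\kappa_d/n)^{1/d} \,\}. \qquad (6.6)$$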

Assuming the naive linked-list implementation, the cost of each priority queue operation is at most proportional to the total number of boxes examined. The cost of examining the sites in a box is O(1) in expectation. The expected total cost of the site search is therefore proportional to the square of the number of boxes examined, or

$$\mathrm{ecost} = O((n \cdot \mathrm{vol}\,\mathcal{A})^2). \qquad (6.7)$$

If we write $C_n$ for the total cost of all successful site searches needed to compute the Voronoi diagram of $X_n$, we have

$$E C_n \le \binom{n}{d+1} \int_{\mathbb{R}^d} \cdots \int_{\mathbb{R}^d} \mathrm{ecost}(x_1, \ldots, x_{d+1})\,(1 - \Gamma)^{n-d-1}\, g(x_1) \cdots g(x_{d+1})\; dx_1 \cdots dx_{d+1}.$$

The (d+1)-fold integral represents an upper bound on the expected cost of a successful site search to complete the cell $x_1 x_2 \cdots x_{d+1}$. Proceeding as in §6.1, we eventually obtain

$$E C_n = O(n^{d+1}) \cdot \int_0^\infty \int_0^\infty \mathrm{ecost}(q,r)\, I(q,r)\; dr\, dq.$$

We continue as in §6.1, dividing the domain of integration into eight regions. However, we ignore constant factors this time. Cases 1, 2, 3: In all these cases,

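The set $\mathcal{A}$ of (6.6) is contained in a d-ball of radius $r + \sqrt{d}\,(\kappa_d/n)^{1/d}$ and, by (6.7), with $t = n\Gamma = \Theta(n r^d)$ as in §6.2,

$$\sqrt{\mathrm{ecost}(q,r)} = O\big(n (r + n^{-1/d})^d\big) = O\big((n^{1/d} r + 1)^d\big) = O\big((t^{1/d} + 1)^d\big) = O(1 + t).$$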

It follows that

ecost(q,r)Z(q,r) drdq = e(1) qd-l dq (1 + t) = --dt /o'/o' //o'/(/o t ) = ec.-d).

Cases _, 7: For these cases we apply the trivial bound ecost(r, q) : O(n2), and f = o(.')f f I(,..),.,, = O(e-n("))

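Cases 5, 8: In these cases the integrand is zero and the integral vanishes as before.

Case 6: In this case the set $\mathcal{A}$ of (6.6) is contained in a (d-1)-spherical cylinder with height $(w + 2\sqrt{d}\,(\kappa_d/n)^{1/d})$ and base radius $(h + \sqrt{d}\,(\kappa_d/n)^{1/d})$, so

$$\begin{aligned}
\sqrt{\mathrm{ecost}(q,r)} &= O\big(n (w + n^{-1/d})(h + n^{-1/d})^{d-1}\big) \\
&= O(n) \cdot \Big(\sum_{i=0}^{d-1} w\, n^{-i/d} h^{d-1-i} + \sum_{i=1}^{d} n^{-i/d} h^{d-i}\Big) \\
&= O(n) \cdot \Big(w h^{d-1} + \sum_{i=1}^{d} n^{-i/d} h^{d-i}\Big) \\
&= O\Big(n w h^{d-1} + \sum_{i=1}^{d-1} h^i n^{i/d} + 1\Big) \qquad (6.8) \\
&= O\Big(n\Gamma + \sum_{i=0}^{d-1} (n h \Gamma / w)^{i/d}\Big) \\
&= O\big((h/w)(n\Gamma + 1)\big),
\end{aligned}$$

since $w = O(h)$ and $h = \Theta((h\Gamma/w)^{1/d})$. Computation is similar to the first subcase of Case 6 in the proof of Theorem 6.6, substituting $t = n\Gamma$, to the point of (6.4), where this time we have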

oo q ecost(q,r)I(q,r)drdq = O(n-d) ecost(q,r) 2td-le-tdt -1 f0° = O(n-e) (1 + t)'ta-le-' dt = o(,-d).

Summing over the eight cases, we see that

$$\int_0^\infty \int_0^\infty \mathrm{ecost}(q,r)\, I(q,r)\; dr\, dq = O(n^{-d})$$

and $E C_n = O(n^{d+1}) \cdot O(n^{-d}) = O(n)$. []

Finally, we consider the unsuccessful site searches for facets of the convex hull.

Lemma 6.9 The unsuccessful site searches can be completed in o(n) expected time.

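Proof. An unsuccessful search requires examination of all the boxes intersecting $\mathcal{H} \cap \mathcal{U}$; these are completely contained in

$$\mathcal{A} = \{\, x \mid \mathrm{dist}(x, \mathcal{H} \cap \mathcal{U}) < \sqrt{d}\,(\kappa_d/n)^{1/d} \,\}.$$

As before, $\mathrm{ecost} = O((n \cdot \mathrm{vol}\,\mathcal{A})^2)$.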

Let us write $U_n$ for the total cost of all unsuccessful site searches. These occur only for facets of the convex hull. We have

Here $\Gamma$ is the probability content of the smaller halfspace defined by $x_1$ through $x_d$, and $(1 - \Gamma)^{n-d}$ is the probability that all the other sites lie in the larger halfspace. We apply the method of §3.3 to obtain

$$E U_n = O(n^d) \cdot \int_0^1 \mathrm{ecost}(r)\, \mathrm{esimp}(r)\, (\gamma(r))^d \exp(-n\Gamma)\; dr$$

where

r is the distance of the hyperplane $\mathcal{H}$ from the origin,

esimp(r) is the expected volume of the simplex $x_1 x_2 \cdots x_d$ given the distance r,

ecost(r) is the expected cost of the unsuccessful search given r,

$\gamma(r)$ is the density of the probability that a random point falls on $\mathcal{H}$, and

$\exp(-n\Gamma)$ estimates the probability that $\mathcal{H}$ defines an empty halfspace.

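Let $w = 1 - r$ and $h = \sqrt{1 - r^2} = \Theta(\sqrt{w})$ as in Figure 6.8. Then

$$\Gamma(r) = \Theta(w h^{d-1}) = \Theta(h^{d+1}).$$

With $t = n\Gamma$, or $h = \Theta((t/n)^{1/(d+1)})$, we have

$$dr = -dw = \Theta(1)\,(t/n)^{2/(d+1)}\,\frac{dt}{t};$$

$$\mathrm{esimp}(r) = \Theta(h^{d-1}) = \Theta\big((t/n)^{(d-1)/(d+1)}\big);$$

$$\gamma(r) = \Theta(h^{d-1}) = \Theta\big((t/n)^{(d-1)/(d+1)}\big);$$

$$\sqrt{\mathrm{ecost}(r)} = O\big(n (w + n^{-1/d})(h + n^{-1/d})^{d-1}\big)$$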

Figure 6.8: An unsuccessful search.

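$$= \Theta\Big(n w h^{d-1} + \sum_{i=0}^{d-1} h^i n^{i/d}\Big) \quad \text{as in (6.8)}$$

$$= O\Big(t + \sum_{i=0}^{d-1} (t\, n^{1/d})^{i/(d+1)}\Big)$$

and

$$E U_n = O(n^d) \int_0^n \Big(n^{2(d-1)/(d^2+d)} (1+t)^2\Big) \Big(\frac{t}{n}\Big)^{(d-1)/(d+1)} \Big(\frac{t}{n}\Big)^{d(d-1)/(d+1)} e^{-t} \Big(\frac{t}{n}\Big)^{2/(d+1)} \frac{dt}{t}$$

$$= O\big(n^{1-(2/(d^2+d))}\big) \int_0^\infty e^{-t} (1+t)^2\, t^{\,d-2+(2/(d+1))}\; dt = O\big(n^{1-(2/(d^2+d))}\big) = o(n).$$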

[] Lemmata 6.7, 6.8, and 6.9 together imply the following theorem.

Theorem 6.10 Let $X_n = \{X_1, X_2, \ldots, X_n\}$ be a set of n sites drawn independently from the uniform distribution on the interior of the unit d-ball. Then for fixed d, Algorithm A constructs the Voronoi diagram of $X_n$ in O(n) time on average.

This algorithm is clearly optimal in the average-case sense, and is asymptotically faster than any other known. If a balanced-tree implementation of priority queues is used, the running time of this algorithm is $O(S_n n \log n)$, only a factor of $\Theta(\log n)$ worse than the standard gift-wrapping algorithm. Worst-case performance can be improved to $O(n S_n)$ if the use of buckets is abandoned

on any site search that examines more than a prescribed number of buckets. It should not be difficult to show that this occurs so infrequently that average performance is not affected. It is easy to show that linear performance is preserved if the distribution is "quasi-uniform" in the unit d-ball, i.e., if its density is bounded above and below by a positive constant everywhere in the d-ball. It is an open question whether the same approach yields an O(n) algorithm -- or even linear bounds on $S_n$ -- for other distributions.

Chapter 7

Voronoi Diagrams in the Plane

In this chapter we consider a special case of the problem addressed in Chapter 6, namely, the construction of the nearest-site Voronoi diagram and Delaunay triangulation in two dimensions. Several approaches have been taken to the design of algorithms for this problem. It is interesting to note that most of these approaches amount to using a well-known three-dimensional convex-hull algorithm to construct the convex hull of the images of the input points under the mapping described at the beginning of Chapter 6, even though all were apparently developed without knowledge of this connection between the two structures. The result of our effort is a simple, practical variation of the divide-and-conquer algorithm that has optimal O(n log n) worst-case performance and very good O(n log log n) performance on average for points from the uniform distribution on the unit square.

Shamos & Hoey [86] present the first divide-and-conquer algorithm for constructing the Voronoi diagram and show its O(n log n) worst-case running time to be optimal under the real-RAM model of computation. Lee & Schachter [62] describe a dual algorithm for constructing the Delaunay triangulation. Guibas & Stolfi [49] advocate the Delaunay triangulation as an intermediate step in the construction of the Voronoi diagram, and present ideal data structures for the problem. Hwang [51], Lee & Wong [63], and Lee [59] present O(n log n) divide-and-conquer algorithms for the Voronoi diagram in the $L_1$, $L_1$ and $L_\infty$, and general $L_p$ metrics respectively. Ohya, Iri & Murota [71] show the average running time of these divide-and-conquer algorithms to be $\Omega(n \log n)$ when the sites are uniformly distributed in the unit square.

Fortune's [45] sweepline algorithm uses a clever geometric transformation to achieve O(n log n) worst-case time. Only the trivial bounds on its average performance are known.

Various incremental algorithms, which construct the Voronoi diagram by adding new sites one by one, require $O(n^2)$ time in the worst case, since a new site may be a neighbor to all previously inserted sites. Such methods differ principally in the order in which sites are added. Sibson & Green [48] were early advocates of such a method, giving an algorithm with $O(n^{3/2})$ average running time. The algorithm of Ohya, Iri & Murota [71] attains optimal linear average time.

Algorithms of another sort construct the Delaunay triangulation triangle by triangle. McLain's algorithm [66] requires $O(n^2)$ time in the worst case. An improvement described by Maus [65] requires only linear time on the average.

Finally, Bentley, Weide & Yao [6] propose a complicated hybrid algorithm. It invokes a divide-and-conquer algorithm on the "outer" sites lying near the boundary of the unit square. The Voronoi polygon of each "inner" site is constructed in constant expected time by a "spiral search" among its neighbors. In the unlikely event that more than O(log n) time is expended on any inner site, spiral search is abandoned and the divide-and-conquer algorithm is applied to the entire set of sites. While achieving asymptotically optimal performance in both the worst and average cases, its implementation is complicated due to its use of two distinct methods.

We present an easily implemented improvement to the divide-and-conquer algorithms, including those for the $L_p$ metric for $1 < p \le \infty$. If the floor function can be computed in constant time, the change maintains its O(n log n) worst-case complexity, but lowers its average-case complexity to O(n log log n) when the sites are drawn independently from any of a large class of distributions which includes the uniform distribution on the unit square. Our experimental evidence demonstrates that in the Euclidean metric the improved algorithm performs very well for $n \le 2^{16}$, the range of the experiments. We conjecture that the average number of edges it creates, a good measure of its efficiency, is no more than twice optimal for n less than seven trillion.

7.1 Preliminaries

In this chapter it will be convenient to adopt somewhat different notation. The input sites will be $P_1, P_2, \ldots, P_n$. The upper-case letters P, Q, M, N, etc., will be points in the plane; their coordinates will be $(x_P, y_P)$, $(x_Q, y_Q)$, etc.

We now restate some useful facts about Delaunay triangulations and Voronoi diagrams given by Lawson [58], Shamos & Preparata [73], and Guibas & Stolfi [49]. We take Guibas & Stolfi's very complete exposition as our point of departure and often speak only in terms of their algorithm. In fact, our results extend to the other divide-and-conquer algorithms without difficulty.

Lemma 7.1 Every triangulation of a set of n sites of which k lie on the convex hull (i.e., on the boundary of the convex hull) has 3n - k - 3 edges and 2n - k - 2 triangles.

Proof. Let t be the number of triangles and e the number of edges. All the edges belong to two triangles except for the k on the convex hull, thus 2e - k = 3t. The results are obtained by solving this equation for t or e and substituting into n - e + (t + 1) = 2 (Euler's formula). []
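For readers who want the elimination spelled out, the two counts of Lemma 7.1 follow from the edge-triangle relation and Euler's formula in three short steps (a worked version of the proof above, nothing more):

```latex
\begin{align*}
n - e + (t + 1) = 2 \;&\Longrightarrow\; t = e - n + 1,\\
2e - k = 3t = 3(e - n + 1) \;&\Longrightarrow\; e = 3n - k - 3,\\
t = (3n - k - 3) - n + 1 \;&\Longrightarrow\; t = 2n - k - 2.
\end{align*}
```

For example, a triangle with one interior site (n = 4, k = 3) has e = 6 edges and t = 3 triangles.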

Corollary 7.2 At most 3n - 6 non-intersecting edges can be constructed on a set of n sites.

Proof. Any set of non-intersecting edges can be extended to form a triangulation. []

Lawson [58] and Sibson [87] have shown that, degenerate cases with four cocircular sites excepted, the Delaunay triangulation is the only triangulation in which every triangle satisfies Lemma 6.2. Guibas & Stolfi's algorithm first constructs the Delaunay triangulation. The Voronoi diagram is then found in linear time. The Delaunay triangulation is constructed as follows:

1. The sites are sorted by increasing x-coordinate.

2. If there are three or fewer sites, the Delaunay triangulation is constructed directly. Otherwise, the sites are divided into two approximately equal sets by a vertical line, Step 2 is recursively applied to construct the Delaunay triangulations of these sets, and the results are merged.

The merge procedure forms the foundation of our own algorithm as well as of Guibas & Stolfi's. Let $\ell$ be the dividing line of Step 2, and let $\mathcal{L}$ and $\mathcal{R}$ be the sets lying to the left and the right of $\ell$. Clearly two edges of the convex hull of $\mathcal{L} \cup \mathcal{R}$ cross $\ell$. Merging begins with a search for the endpoints of the lower of the two. The search begins with the sites of $\mathcal{L}$ and $\mathcal{R}$ lying nearest $\ell$ and alternately advances clockwise around the convex hull of $\mathcal{L}$ and counterclockwise around the convex hull of $\mathcal{R}$. Once its endpoints are found and the lower hull edge is created, the other new edges are created in the order in which they cross $\ell$. Old edges are removed if intersected by a new edge. The essential features of the merge procedure are summarized in the following lemma:

Lemma 7.3 Let $\ell$, $\mathcal{L}$ and $\mathcal{R}$ be as above. When merging $DT(\mathcal{L})$ and $DT(\mathcal{R})$ to construct $DT(\mathcal{L} \cup \mathcal{R})$:

a) Only edges joining two sites in $\mathcal{L}$ and edges joining two sites in $\mathcal{R}$ are deleted.

b) Only edges crossing $\ell$ and joining a site in $\mathcal{L}$ to one in $\mathcal{R}$ are created.

c) The worst-case running time of the merge is bounded by a function linear in the sum of three components:

i) the number of sites examined to find the endpoints of the lower of the two edges of the convex hull of $\mathcal{L} \cup \mathcal{R}$ which cross $\ell$,

ii) the number of edges deleted, and

iii) the number of edges created.

d) The worst-case running time of the merge is $O(|\mathcal{L} \cup \mathcal{R}|)$.

Proof. Guibas & Stolfi. []

Our analysis requires some refinement of Lemma 7.3.c. Lemma 7.4 will be used to bound running time analytically, while Corollary 7.5 justifies the use of an easily collected statistic as a measure of efficiency.

Lemma 7.4 The running time of the Guibas-Stolfi merge procedure is bounded by a linear function of the number of sites to which it attaches new edges.

Proof. By Lemma 7.3.c, it suffices to show that, if new edges are joined to m sites, then at most 3m - 6 edges are created, at most 3m - 9 edges are deleted, and at most m sites are examined to construct the lower convex-hull edge. Since the edges created are non-intersecting, by Corollary 7.2 there are at most 3m - 6 of them. Now suppose that $\mathcal{L}$, $\mathcal{R}$ and $\mathcal{L} \cup \mathcal{R}$ consist of $n_1$, $n_2$ and $n$ sites respectively, of which $k_1$, $k_2$ and $k$ lie on the convex hull. If c edges are created and d deleted, we have by Lemma 7.1 that

$$(3n_1 - k_1 - 3) + (3n_2 - k_2 - 3) + c - d = 3n - k - 3.$$

Since $n = n_1 + n_2$ and $k \le k_1 + k_2$, it follows that $d = c + k - k_1 - k_2 - 3 \le c - 3 \le 3m - 9$.

Corollary 7.5 The running time of the merge procedure is bounded by a linear function of the number of edges it creates.

{ Step 0: Sort Sites into Buckets }
for P ∈ X_n do insert P into $B_{\lfloor m x_P \rfloor, \lfloor m y_P \rfloor}$;
{ Step 1: Triangulate Cells }
for i := 0 to m-1 do
    for j := 0 to m-1 do
        $DT_{ij}$ := Guibas_Stolfi_DT($B_{ij}$);
{ Step 2: Merge Cells into Rows }
for k := 0 to ⌈lg m⌉ - 1 do
    for i := 0 to m-1 do
        for j := 0 to m-1 by $2^{k+1}$ do
            $DT_{ij}$ := merge($DT_{ij}$, $DT_{i,j+2^k}$);
{ Step 3: Merge Rows }
for k := 0 to ⌈lg m⌉ - 1 do
    for i := 0 to m-1 by $2^{k+1}$ do
        $DT_{i0}$ := merge($DT_{i0}$, $DT_{i+2^k,0}$);
return $DT_{00}$;

Figure 7.1: Algorithm A.

7.2 A Faster Algorithm and Its Worst-Case Running Time

The unit square is partitioned into $\lceil\sqrt{n/\log n}\,\rceil^2 = O(n/\log n)$ square cells with sides of length $\lceil\sqrt{n/\log n}\,\rceil^{-1} \le \sqrt{\log n/n}$. The Delaunay triangulation of the sites within each cell is constructed with the Guibas-Stolfi algorithm. The triangulations within each row of cells are merged in pairs until the triangulation of the row has been completed. Then row triangulations are merged in pairs to complete the triangulation of the entire set of sites. (Fig. 7.1.)
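The bookkeeping of Step 0 and the pairing pattern of Steps 2 and 3 can be sketched as follows. This is a hedged illustration only: `grid_size`, `cell_of` and `merge_rounds` are hypothetical names, the logarithm is taken to be natural (as the proof of Lemma 7.7 suggests), and the triangulation and merge routines themselves are not shown.

```python
import math

def grid_size(n):
    """Number m of cells per side: ceil(sqrt(n / log n)), for n >= 2."""
    return math.ceil(math.sqrt(n / math.log(n)))

def cell_of(p, m):
    """Indices (floor(m*x), floor(m*y)) of the cell containing p = (x, y).

    The min(...) clamp is an added assumption so that points with a
    coordinate equal to 1.0 still fall inside the m-by-m grid.
    """
    x, y = p
    return (min(math.floor(m * x), m - 1), min(math.floor(m * y), m - 1))

def merge_rounds(m):
    """For each round k, list the index pairs (j, j + 2^k) that are merged.

    Pairs whose second index would fall outside range(m) are skipped,
    so an odd block at the end of a row simply waits for a later round.
    """
    rounds = []
    k = 0
    while (1 << k) < m:
        half, step = 1 << k, 1 << (k + 1)
        rounds.append([(j, j + half) for j in range(0, m, step) if j + half < m])
        k += 1
    return rounds
```

The same schedule serves for merging cells within a row (Step 2) and for merging whole rows (Step 3); it has ⌈lg m⌉ rounds, which is the count used in the worst-case analysis.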

Theorem 7.6 Algorithm A uses O(n log n) time in the worst case.

Proof. Step 0 requires O(n) time, since the floor function is assumed to be computable in constant time. Number the cells arbitrarily and let $n_i$ be the number of sites in the i-th cell. Since Guibas & Stolfi's algorithm requires O(n log n) time in the worst case, Step 1 can be completed in time $\sum_i O(n_i \log n_i) \le \sum_i O(n_i \log n) \le O(n \log n)$. In Step 2, no site is involved in more than a single merge for any fixed value of k. Since by Lemma 7.4 the number of points involved in the merges bounds the running time, no more than O(n) time is required for each iteration of the k loop. The k loop is executed $\lceil \lg \lceil\sqrt{n/\log n}\,\rceil \rceil \le \lg n$ times, thus O(n log n) time is required for Step 2. Step 3 can be handled exactly like Step 2. Summing the times for the three steps, we find that O(n log n) time suffices. []

7.3 Analysis of Expected Time

We will call a distribution with density function f quasi-uniform in a region if, for some strictly positive constants $c_1$ and $c_2$, $f(x,y) = 0$ outside the region and $c_1 \le f(x,y) \le c_2$ inside the region. By modifying constants in the proofs which follow, it is not difficult to show that the expected running time of Algorithm A is O(n log log n) for any quasi-uniform distribution in a rectangle. However, we restrict our attention to the uniform distribution in the unit square for the sake of expository simplicity.

During Step 1, we accept the O(n log n) worst-case performance of Guibas & Stolfi's algorithm to construct Delaunay triangulations within each cell. We need only take care to show that it is unlikely that very many sites fall within a single cell, making worst-case performance for that cell unacceptably large.

We next show that when merging two Delaunay triangulations most sites to which new edges are constructed must lie near the boundaries of the two triangulations. We are then able to bound the expected number of sites near enough to the boundary to receive a new edge. By Lemma 7.4, this essentially bounds the running time.

In the sequel $\mathcal{U}$ denotes the unit square, and $X_n = \{P_1, P_2, \ldots, P_n\}$ is a set of n sites chosen independently from the uniform distribution on $\mathcal{U}$.

Lemma 7.7 The probability that any fixed cell contains more than e log n sites is less than 1/n, where e is the base of the natural logarithms.

Proof. The probability that a given site lies in a given cell is equal to the area of the cell, which is less than $\log n/n$. Call this probability p, and let N be the number of sites in the cell. Then N has a binomial distribution, and

$$\begin{aligned}
\Pr\{N \ge e \log n\} &= \sum_{k \ge e \log n} \Pr\{N = k\} \\
&\le \sum_{k \ge e \log n} e^{(k - e \log n)} \Pr\{N = k\} \\
&\le n^{-e} \sum_{k \ge 0} e^{k} \Pr\{N = k\} \\
&= n^{-e} (ep + 1 - p)^n \\
&= n^{-e} \exp(n \log(1 + p(e - 1))).
\end{aligned}$$

Since $\log(1 + x) < x$ for $x > 0$, it follows that

$$\Pr\{N \ge e \log n\} < n^{-e} \exp(np(e - 1)) < n^{-e} \exp((\log n)(e - 1)) = n^{-e} \cdot n^{e-1} = 1/n.$$

[]
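The bound of Lemma 7.7 can be checked numerically for particular n. The sketch below is not part of the proof; it evaluates the exact binomial tail for the largest cell probability allowed by the lemma, p = log n / n (natural logarithm), using hypothetical helper names `binomial_tail` and `cell_overflow_probability`, and sums in log space to avoid overflow in the binomial coefficients.

```python
import math

def binomial_tail(n, p, k_min):
    """Pr{N >= k_min} for N ~ Binomial(n, p), with 0 < p < 1."""
    total = 0.0
    for k in range(k_min, n + 1):
        # log of C(n, k) * p^k * (1 - p)^(n - k), computed via lgamma
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log1p(-p))
        total += math.exp(log_pmf)
    return total

def cell_overflow_probability(n):
    """Pr{a fixed cell holds at least e*log(n) sites} when p = log(n)/n."""
    p = math.log(n) / n
    k_min = math.ceil(math.e * math.log(n))
    return binomial_tail(n, p, k_min)
```

Comparing `cell_overflow_probability(n)` with `1 / n` for a few values of n gives a quick sanity check on the constants in the lemma.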

In passing, we note that the same argument can be used to show that the probability that a fixed cell contains more than $(e + m - 1) \log n$ sites is at most $1/n^m$.

Lemma 7.8 Step 1 of Algorithm A requires O(n log log n) expected time.

Proof. Since there are O(n/log n) cells, by Lemma 7.7

$$\Pr\{\text{some cell contains} \ge e \log n \text{ sites}\} = O(1/\log n).$$

Since even these inputs can be handled in O(n log n) time, their contribution to the expected running time of Step 1 is at most $O((n \log n) \cdot (1/\log n)) = O(n)$. For the other cases, at most n/log n subproblems of size at most e log n must be solved using the O(n log n) worst-case algorithm of Guibas & Stolfi, making the total expected time $O((n/\log n)\, e \log n \log(e \log n))$ or $O(n \log\log n)$ for Step 1. []
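The final simplification in the proof is a one-line calculation, written out here with natural logarithms as in Lemma 7.7:

```latex
\[
\frac{n}{\log n} \cdot e \log n \cdot \log(e \log n)
  \;=\; e\,n\,(1 + \log\log n)
  \;=\; O(n \log\log n).
\]
```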

We now bound the expected time of the merging steps. We first show that most new edges lie in a region near the boundaries of the triangulations being merged. Informally, this holds because an edge passing from a site far from the dividing line of the merge across the dividing line to another site must necessarily be long, and so by Corollary 6.3 it must also be a chord of a large site-free circle. Since large regions are unlikely to be site-free, such long edges are also unlikely. An exception to this observation is a long edge lying entirely near the boundary of the merged triangulation, e.g., an edge of the convex hull. Such an edge may correspond to a large site-free circle which lies mostly outside the region of the triangulation. The edge is not demonstrably improbable, since the portion of the circle lying inside the region of the triangulation may be very small. It is this case which compels us to draw the hyperbolic boundaries in the proof of Lemma 7.11.

Once we have delimited where the likely endpoints of new edges lie, it is an easy matter to calculate the expected number of sites in these regions. A bound on the average running time then follows from Lemma 7.4.

The following lemma will be proved in a more general setting in the next section:

Lemma 7.9 Let P, Q, and P' be three points lying along a line in that order. Let M and N be the other vertices of the square with diagonal PQ. Then any circle passing through P and P' completely encloses either $\triangle PQM$ or $\triangle PQN$ or both.

The next lemma formalizes the assertion that large regions are unlikely to be site-free:

Lemma 7.10 Let $p = 2 \log\log n/n$, let T be a subset of $\mathcal{U}$ with area exceeding p, and let X' be a set of at least n - 2 sites drawn independently from the uniform distribution on $\mathcal{U}$. Then $\Pr\{X' \cap T = \emptyset\} = O(1/\log n)$.

Proof. $\Pr\{X' \cap T = \emptyset\} \le (1 - p)^{n-2} \le \exp(-p(n - 2)) = (\log n)^{(4/n) - 2} = O(1/\log n)$. []

We can now describe the likely location of new edges.

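Lemma 7.11 Let $\square ABCD$ and $\square CDEF$ lie inside $\mathcal{U}$ and have disjoint interiors, let $\mathcal{R} = X_n \cap \square ABCD$ and $\mathcal{L} = X_n \cap \square CDEF$, and let $h = |CD|$. Then there is a merge region $\mathcal{M} \subset \square ABCD$ such that:

a) $\mathrm{Area}(\mathcal{M}) = O(h\sqrt{\log\log n/n} + \log n \log\log n/n)$

b) if $P \in \mathcal{R}$ but $P \notin \mathcal{M}$ then $p = \Pr\{\exists P' \in \mathcal{L} \mid (P, P') \in DT(\mathcal{L} \cup \mathcal{R})\} = O(1/\log n)$.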

Proof. For convenience, we use a coordinate system with origin at C and axes CB and CD. (Fig. 7.2.) We now verify that conditions a) and b) are satisfied by

$$\mathcal{M} = \square ABCD \cap \{\, (x,y) \mid (x < 8\sqrt{\log\log n/n}) \vee (xy < 24 \log\log n/n) \vee (x(h - y) < 24 \log\log n/n) \,\}.$$

Since the two hyperbolas defining $\mathcal{M}$ are mirror images, and since $|BC| \le 1$, we have

$$\begin{aligned}
\mathrm{Area}(\mathcal{M}) &\le 8h\sqrt{\log\log n/n} + 2\int_{8\sqrt{\log\log n/n}}^{1} \frac{24 \log\log n}{n x}\, dx \\
&= O(h\sqrt{\log\log n/n} + \log n \log\log n/n).
\end{aligned}$$

Figure 7.2: The merge region $\mathcal{M}$ and $\square PMQN$.

Now let $P = (x_P, y_P)$ be a site in $\mathcal{R}$ but outside $\mathcal{M}$, let P' be a site in $\mathcal{L}$, and let Q be the point of intersection of PP' and CD. Let $\square PMQN$ be the square with M above and N below PQ. Further suppose that P and P' are joined in $DT(\mathcal{L} \cup \mathcal{R})$. By Corollary 6.3, there is a circle $\mathcal{C}$ through P and P' whose intersection with $\square ABCD$ is site-free. Then according to Lemma 7.9 either $(\triangle PQM \cap \square ABCD)$ or $(\triangle PQN \cap \square ABCD)$ is site-free. We will show that each of $\triangle PQM$ and $\triangle PQN$ completely contains one of ten regions in $\square ABCD$. Thus the site-free circle $\mathcal{C}$ must also contain one of the regions. But each of the regions has area exceeding $2 \log\log n/n$, so by Lemma 7.10 with $X' = X_n - \{P, P'\}$, the probability that at least one of the regions is site-free is $O(10 \cdot 1/\log n) = O(1/\log n)$, which also bounds the probability that P is joined to a P' in $\mathcal{L}$.

The ten regions are sectors or parts of sectors of the circle of radius $\tfrac{1}{2}x_P$ centered at P. (Fig. 7.3.) The central angle of each sector is $\pi/8$. Eight of the regions fall within the sector defined by $\angle CPD$. They must completely cover this sector, but they may otherwise be fixed arbitrarily. To these are added the sector whose upper boundary is PC and the sector whose lower boundary is PD. Since $\angle P'PM$ and $\angle P'PN$ measure $\pi/4$, and $|PM|$, $|PQ|$ and $|PN|$ exceed $\tfrac{1}{2}x_P$, each of $\triangle PQM$ and $\triangle PQN$ completely contains at least one of these sectors, no matter where P' lies in $\square CDEF$.

Figure 7.3: The ten sectors.

The area of each sector is $\tfrac{1}{2}(\tfrac{1}{2}x_P)^2(\pi/8) = (\pi/64)x_P^2 > \pi \log\log n/n > 2 \log\log n/n$. But part of the sector below PC may lie outside $\square ABCD$. In this case let J be the point on PC at distance $\tfrac{1}{2}x_P$ from P, and let K be the point of intersection of the x-axis and the other side of the sector. Surely the intersection of $\square ABCD$ and the sector contains $\triangle PJK$. But

$$\mathrm{Area}(\triangle PJK) = \tfrac{1}{2} \cdot |PJ| \cdot |PK| \cdot \sin(\pi/8) > \tfrac{1}{2} \cdot \tfrac{1}{2}x_P \cdot y_P \cdot \tfrac{1}{3} = \tfrac{1}{12} x_P y_P > \tfrac{1}{12}(24 \log\log n/n) = 2 \log\log n/n.$$

Thus we have shown that the intersection of $\square ABCD$ and the sector below PC is itself large enough. The proof is completed by a symmetric argument applied to the sector above PD. []

Lemma 7.12 Step 2 of Algorithm A requires O(n log log n) expected time.

Proof. The merge region for each merge in Step 2 has height $h \le \sqrt{\log n/n}$, thus the area of each merge region is $O((\sqrt{\log n/n} \cdot \sqrt{\log\log n/n}) + \log n \log\log n/n) = O(\log n \log\log n/n)$. For a particular value of k there are at most $2^{-k+1} n/\log n$ disjoint merge regions with total area $O(2^{-k} \log\log n)$. The expected number of sites in these regions is $O(2^{-k} n \log\log n)$. In addition it is expected that at most another O(n/log n) sites from outside the merge regions will receive new edges. By Lemma 7.4 the expected running time of Step 2 is bounded by

$$\sum_{k=0}^{\lceil \lg m \rceil - 1} \big(O(2^{-k} n \log\log n) + O(n/\log n)\big) = O(n \log\log n + n) = O(n \log\log n).$$

[]

Lemma 7.13 Step 3 of Algorithm A requires O(n) expected time.

Proof. The merge region for each merge in Step 3 has height h = 1 and area $O(\sqrt{\log\log n/n})$. There are at most $2^{-k+1} \lceil\sqrt{n/\log n}\,\rceil$ merge regions for each k for a total area of $O(2^{-k}\sqrt{\log\log n/\log n}) = O(2^{-k})$ containing $O(2^{-k} n)$ sites in the expected case. Reasoning as for Step 2, we find the running time in the expected case to be bounded by

$$\sum_{k=0}^{\lceil \lg m \rceil - 1} \big(O(2^{-k} n) + O(n/\log n)\big) = O(n).$$

[]

Adding together the expected time for each step gives the result:

Theorem 7.14 Algorithm A constructs DT(S) in O(n log logn) expected time.

7.4 Extension to the Lp Metrics

In this section we show that the algorithm of Section 3 and its analysis extend to the $L_p$ metrics for $1 < p \le \infty$. We are able to make only a small improvement for the $L_1$ case.

The $L_p$ metric is defined by the distance function $\mathrm{dist}_p(P, Q) = (|x_P - x_Q|^p + |y_P - y_Q|^p)^{1/p}$ for $1 \le p < \infty$ and $\mathrm{dist}_\infty(P, Q) = \max(|x_P - x_Q|, |y_P - y_Q|)$ for $p = \infty$. The $L_2$ metric is the usual Euclidean metric. The $L_1$ metric is sometimes called the rectilinear or Manhattan metric. Hwang [51], Lee & Wong [63], and Lee [59] present O(n log n) divide-and-conquer algorithms for constructing the Voronoi diagrams in the $L_1$, $L_1$ and $L_\infty$, and general $L_p$ metrics respectively. In these metrics each Voronoi polygon is star-shaped with a nucleus at the associated site, but it is not necessarily convex. The straight-line dual of the Voronoi diagram is called the Delaunay triangulation and denoted $DT_p(X_n)$, although in fact $DT_1(X_n)$ and $DT_\infty(X_n)$ do not generally triangulate the convex hull of $X_n$.

Lee [59] shows that Lemma 6.2, Corollaries 6.3 and 6.4, and (essentially) Lemma 7.3 hold for $1 < p < \infty$ when the word "circle" is taken to mean the locus of points at a constant distance in the $L_p$ metric from some given point. From these Lemma 7.4 and Theorem 7.6 follow. He also proves the following useful lemma.

Lemma 7.15 The locus of points equidistant from two given points, called their bisector, is either nonincreasing or nondecreasing if viewed as the graph of a function y = f(x).
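The $L_p$ distance function defined at the start of this section translates directly into code. The sketch below uses a hypothetical function name `dist_p`, represents points as (x, y) tuples, and represents $p = \infty$ by `math.inf`; these are conventions chosen for illustration only.

```python
import math

def dist_p(P, Q, p):
    """L_p distance between planar points P and Q, for 1 <= p <= infinity."""
    dx, dy = abs(P[0] - Q[0]), abs(P[1] - Q[1])
    if math.isinf(p):
        # L_infinity: the larger of the two coordinate differences
        return max(dx, dy)
    if p < 1:
        raise ValueError("p must satisfy 1 <= p <= infinity")
    return (dx ** p + dy ** p) ** (1.0 / p)
```

For the points (0, 0) and (3, 4) this gives 7 in the $L_1$ metric, 5 in the Euclidean metric and 4 in the $L_\infty$ metric, and an $L_p$ "circle" is the set of points Q with `dist_p(C, Q, p)` equal to a constant.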

Figure 7.4: Bounding the area of $L_p$ circles.

We can now extend the average-case analysis to $1 < p < \infty$.

Theorem 7.16 Algorithm A constructs $DT_p(X_n)$ in O(n log log n) expected time for $1 < p < \infty$.

Proof. The analysis of Section 4 lacks only an Lp version of Lemma 7.9, which we now provide:

Lemma 7.17 Let P, Q, and P' be three points lying along a line in that order. Let M and N be the other vertices of the square with diagonal PQ. Then for $1 < p < \infty$ any $L_p$ circle passing through P and P' completely encloses either $\triangle PQM$ or $\triangle PQN$ or both.

Proof. Let M' and N' be the other vertices of the square with diagonal PP' lying on the same side of PP' as M and N respectively. Since $\triangle PP'M'$ and $\triangle PP'N'$ contain $\triangle PQM$ and $\triangle PQN$ respectively, it suffices to show that the circle contains either M' or N'. (Fig. 7.4.) Without loss of generality we consider only the case in which $a > b > 0$, $P = (0, 2b)$, $P' = (2a, 0)$, and $M' = (a + b, a + b)$, and show that the center C of the circle must lie nearer M' than P if it lies above the line through P and P'. The triangle inequality implies that the bisector of P and P' passes no nearer to P than $(a^p + b^p)^{1/p}$ at $R = (a, b)$. If C = R, M' clearly lies on the perimeter of the circle. Since both M' and R lie on the bisector, it is nondecreasing rather than nonincreasing. Thus either $a \le x_C \le a + b$ and $b \le y_C \le a + b$, or $x_C > a + b$ and $y_C > a + b$, so $\mathrm{dist}_p(C, M') = (|x_C - (a + b)|^p + |y_C - (a + b)|^p)^{1/p} \le (x_C^p + |y_C - 2b|^p)^{1/p} = \mathrm{dist}_p(C, P)$. []

It is the $L_1$ and $L_\infty$ Voronoi diagrams which are of most interest in applications. Unfortunately, implementation and analysis of these cases are complicated by the fact that some convex hull edges may be omitted from the Delaunay triangulation. For these cases, the Guibas-Stolfi merge procedure must be modified along the lines of Lee's dual algorithm to search downward for the endpoints of the lower convex hull edge, then back upward for the endpoints of the lowest Delaunay edge crossing the dividing line. Thus some sites examined in the search do not receive new edges during the merge, and Lemma 7.4 no longer holds. The analysis of Section 4 still bounds the expected number of edges created, but not the overall running time.

We call the edges of the infinite face of the Delaunay triangulation boundary edges and their endpoints boundary sites. We show that the $L_\infty$ boundary sites of a set of sites contained in an orthogonal rectangle in the unit square all probably lie near the boundary of the rectangle. There are therefore so few of them in the expected case that searching through them does not dominate the cost of the merge step.

Lemma 7.18 A site $P = (x_P, y_P)$ is a boundary site in $DT_\infty(X_n)$ only if at least one of the quarter-planes defined by the lines $x = x_P$ and $y = y_P$ is site-free.

Proof. The result follows immediately from Lee's observation that an edge is a boundary edge in $DT_\infty(X_n)$ if and only if one of the two quarter-planes defined by horizontal and vertical rays passing through the endpoints is site-free. []
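The necessary condition of Lemma 7.18 is easy to test directly. The following sketch uses hypothetical names (`occupied_quadrants`, `may_be_linf_boundary_site`) and makes one assumption not addressed in the text: sites lying exactly on one of the two lines through P are ignored, i.e. the quarter-planes are treated as open and the sites as being in general position. It only filters candidates; passing the test does not prove that P is a boundary site.

```python
def occupied_quadrants(P, sites):
    """Set of open quarter-planes around P that contain at least one site.

    Each quarter-plane is encoded as (x > x_P, y > y_P). Sites on the
    lines x = x_P or y = y_P (including P itself) are skipped.
    """
    xp, yp = P
    quadrants = set()
    for (x, y) in sites:
        if x == xp or y == yp:
            continue
        quadrants.add((x > xp, y > yp))
    return quadrants

def may_be_linf_boundary_site(P, sites):
    """True if at least one of the four quarter-planes at P is site-free."""
    return len(occupied_quadrants(P, sites)) < 4
```

A site surrounded on all four diagonal sides fails the test and so, by the lemma, cannot be a boundary site of the $L_\infty$ triangulation.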

Lemma 7.19 Let $\square ABCD$ be an orthogonal rectangle lying inside the unit square. Then there is a boundary region $B \subset \square ABCD$ such that

a) $\mathrm{Area}(B) = O(\log n \log\log n / n)$

b) if $P \in (X_n \cap \square ABCD - B)$ then $p = \Pr\{P \text{ is a boundary site in } DT_\infty(X_n)\} = O(1/\log n)$.

Proof. For convenience, we use a coordinate system with origin at $C$ and axes $CB$ and $CD$. We now verify that conditions a) and b) are satisfied by

$$
B = \square ABCD \cap \{\, (x, y) \mid (xy \le 2\log\log n/n)
\lor (x(h-y) \le 2\log\log n/n)
\lor ((w-x)y \le 2\log\log n/n)
\lor ((w-x)(h-y) \le 2\log\log n/n) \,\},
$$

where $h = |CD|$ and $w = |CB|$. The bound on $\mathrm{Area}(B)$ is straightforward since $h, w \le 1$. Without loss of generality suppose that $P = (x_P, y_P)$ lies in the lower left quarter of $\square ABCD$ but outside $B$. Then the smallest of the four rectangles into which $\square ABCD$ is partitioned by the vertical and horizontal lines through $P$ has area $x_P y_P \ge 2\log\log n/n$. By Lemma 7.10 the probability of its being site-free is at most $O(1/\log n)$. Therefore by Lemma 7.18 the probability that $P$ is on the boundary is at most $O(4/\log n) = O(1/\log n)$. []

We can now bound the running time of Algorithm A for the $L_\infty$ metric.

Theorem 7.20 Algorithm A constructs $DT_\infty(X_n)$ in $O(n \log\log n)$ expected time.

Proof. The analysis of Step 1 in Lemma 7.8 remains valid, since the worst-case performance of the $L_1$/$L_\infty$ merge procedure is $O(n \log n)$. Since the boundary region of any rectangle is smaller than its merge region, the calculations of Lemmata 7.12 and 7.13 for Steps 2 and 3 apply to the number of sites examined in boundary-edge searches as well as to the number of sites receiving new edges. Therefore the overall bound of Theorem 7.14 applies as well. []

The $L_1$ case is more difficult. It follows from Lemma 7.18 and the well-known correspondence between the $L_1$ and $L_\infty$ metrics that a site $P$ is an $L_1$ boundary site if and only if one of the quarter-planes defined by the lines through $P$ with slopes +1 and -1 is site-free. The $L_1$ analog of Lemma 7.19 states that all points within a (Euclidean) distance of $\sqrt{2\log\log n/n}$ of the boundary fall within the boundary region. Analyzing the number of sites examined in lower-boundary-edge searches in Steps 2 and 3 yields an $O(n\sqrt{\log n \log\log n})$ bound, which is only a marginal improvement to the original $O(n \log n)$ expected running time.

7.5 Experimental Results

A variation of Algorithm A for the Euclidean metric has been implemented for practical evaluation. In this variation, the set of n sites is divided into $\lceil \sqrt{n/\log n}\, \rceil$ equal subsets by horizontal lines. Then the Guibas-Stolfi algorithm, which divides with vertical lines, is applied to each subset. Finally the results are merged in pairs as in Step 3 of Algorithm A. (It is notable that the Guibas-Stolfi merge procedure performs the horizontal merges of the second stage without recoding if the sites are sorted by decreasing y-coordinate.) This variation is somewhat easier to implement but more difficult to analyze. Intuitively, we expect its behavior to differ little from that of Algorithm A for reasonably large n, since it should divide the unit square into horizontal strips of about the same width as the buckets.
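The first stage of this variation, the division into horizontal strips, might be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation that was measured: the function name is hypothetical, the strip count is taken to be the square root of n / log n rounded up, and "equal subsets" is read as subset sizes differing by at most one. The per-strip triangulation and the pairwise merges are not shown.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def horizontal_strips(sites: List[Point]) -> List[List[Point]]:
    """Split sites into about sqrt(n / log n) nearly equal horizontal strips.

    Sites are sorted by decreasing y-coordinate, so strips[0] is the
    topmost strip. The rounding rule (ceiling) is an assumption.
    """
    n = len(sites)
    if n == 0:
        return []
    if n < 2:
        return [list(sites)]  # log(1) = 0, so handle tiny inputs directly
    k = max(1, math.ceil(math.sqrt(n / math.log(n))))
    ordered = sorted(sites, key=lambda s: s[1], reverse=True)
    base, extra = divmod(n, k)  # the first `extra` strips get one more site
    strips, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        strips.append(ordered[start:start + size])
        start += size
    return strips
```

Each strip would then be handed to the vertical-cut divide-and-conquer routine, and the resulting triangulations merged in pairs from the strips upward.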

Figure 7.5: The algorithms compared. (Horizontal axis: n = number of sites, log scale, from 16 to 65536; one plotted curve is labeled "Modified Algorithm".)

This variation and the original Guibas-Stolfi algorithm were run on inputs generated by drawing sites from the uniform distribution in the unit square. Twenty inputs of size $2^k$ were generated for $4 \le k \le 16$. The results are summarized in Fig. 7.5, which plots the mean number of edges created per site as a function of n, the number of sites. The small variance can be safely ignored. Our measurements for the original algorithm match closely those of Ohya, Iri & Murota. It is clear that the modified algorithm is significantly faster for all but the smallest values of n.

Figure 7.6: Estimating the constant factor. (Horizontal axis: n = number of sites, from 0 to 60000; vertical axis ticks from 1.5 to 3.0.)

Fig. 7.6 is useful in estimating the constant factor c in the upper bound $cn \log\log n$ on expected running time. If c is asymptotically less than 1.77 as Fig. 7.6 suggests, the number of edges created by the algorithm would be less than 6n for $n < \exp(\exp(6.0/1.77)) \approx 7 \times 10^{12}$. Since about 3n edges are required in the final diagram, we conjecture that the running time is no more than twice optimal for n in this range.
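The arithmetic behind this threshold is easy to reproduce. The snippet below is only a check of the inequality c log log n < 6, using natural logarithms as elsewhere in the thesis; the function name is hypothetical and the value 1.77 is the empirical estimate quoted above, not a proven constant.

```python
import math


def edge_budget_threshold(c: float, edges_per_site: float) -> float:
    """Value of n at which c * log(log(n)) reaches edges_per_site.

    Solving c * log(log(n)) = edges_per_site for n gives
    n = exp(exp(edges_per_site / c)).
    """
    return math.exp(math.exp(edges_per_site / c))


threshold = edge_budget_threshold(1.77, 6.0)
print(f"c log log n stays below 6 for n < {threshold:.2e}")
```

The computed threshold falls between 7 × 10^12 and 8 × 10^12, in line with the rounded figure given in the text.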

Chapter 8

Conclusions and Conjectures

We have investigated the asymptotic behavior of the expected number of vertices and facets of the convex hull of i.i.d. points from any of a wide variety of distributions on $R^d$, and have shown that the average running times of both the gift-wrapping and the shelling algorithms are $o(n^2)$ for all the distributions investigated. The most striking pattern to emerge from these investigations is that, although $F_n = O(V_n^{\lfloor d/2 \rfloor})$ in the worst case, $EF_n$ grows linearly in $EV_n$ for every distribution considered. This leads to our first conjecture.

Conjecture 8.1 For any distribution with a density function in $R^d$,

$$EF_n = O(EV_n) = o(n).$$

We also note that the worst case, $EF_n = \Theta((EV_n)^{\lfloor d/2 \rfloor})$, is achieved for any distribution with a density on the d-dimensional moment curve, a one-dimensional manifold. Also, for the uniform distribution on the surface of a d-sphere, a (d-1)-manifold, $EF_n = \Theta(EV_n) = \Theta(n)$. It seems plausible to conjecture the existence of a function $f(d, c)$ for which the following holds.

Conjecture 8.2 For any distribution with a density on a c-dimensional manifold in $R^d$,

$$EF_n = O((EV_n)^{f(d,c)});$$

furthermore, there exist such distributions for which

$$EF_n = \Theta((EV_n)^{f(d,c)}).$$

The function f could, for example, have the form

$$f(d, c) = \left\lfloor \frac{d - c + 1}{2} \right\rfloor.$$

The expected probability content of the convex hull, $EP_n$, is related to the expected number of vertices by Efron's [39] equation

$$EP_n = 1 - \frac{EV_{n+1}}{n+1}.$$

In the case of a uniform distribution in a convex body, this also gives the expected volume. These quantities are of great interest to geometric probabilists. I have recently learned that Bárány & Larman [3] have independently and simultaneously proved results on the expected volume of the convex hull of random points in a polytope that are equivalent to the results on the expected number of vertices in Chapter 4. Naturally the results of Chapter 5 extend to expected probability content and, in the case of the uniform product distributions of §5.5, to expected volume. These connections have not been emphasized because they are not directly relevant to the analysis of convex-hull algorithms.

It should not be difficult to extend the methods of Chapter 6 for the investigation of random Voronoi diagrams to study other aspects such as expected radius and volume of Delaunay spheres, expected number of faces of lower dimensions in the Voronoi dual, etc. It would also be desirable to extend the results to other distributions, such as the standard normal. As a counterpart to Conjecture 8.1, I advance the following conjecture.

Conjecture 8.3 If n i.i.d. points are selected from any distribution with a density function in $R^d$, the expected number of simplices of their Voronoi dual, $ES_n$, grows as $O(n)$.

The results of Chapter 7 have been extended recently by Katajainen & Koppinen [55]. They exhibit a variation of the divide-and-conquer algorithm for planar Voronoi diagrams that achieves $O(n)$ average running time by merging triangulations of cells in a quad-tree order. While the aspect ratio of the subproblems in the original algorithm can be as small as $O(1/n)$, and as small as $O(\sqrt{\log n/n})$ in our algorithm, their algorithm maintains aspect ratio between 1/2 and 2. Our algorithm has the advantage that it can be implemented in a top-down fashion. A top-down implementation of the quad-tree merge ordering would require that the dimension of division be changed at every level. Since each change requires median-finding in $O(n)$ time, the overall running time becomes $O(n \log n)$. In Katajainen & Koppinen's empirical studies, their quad-tree algorithm performed only slightly better than our own. It would be interesting to know whether a tighter analysis of our algorithm showing linear average running time is possible; applying Katajainen & Koppinen's analysis to our algorithm gives only a weak O(n_ bound.

The technique of changing the dimension of division in a divide-and-conquer algorithm is apparently widely applicable. Are there other problems that can be solved more quickly on average if this technique is applied? For some problems, such as planar convex-hull construction, we may expect no improvement at all. For others, such as Voronoi diagrams of non-intersecting line segments in the plane [60], we might intuitively expect an improvement; however, in this case the lack of a reasonable yet tractable model of the input distribution appears to be the first obstacle to be overcome. Can the technique be applied in higher dimensions?

Appendix A

Notation, Constants, and Identities

A.1 Asymptotic Notation

We write $f(x) = O(g(x))$ for $a < x < b$ if there exists a constant c such that

$$\left| \frac{f(x)}{g(x)} \right| \le c$$

throughout that range; $f(x) = \Omega(g(x))$ if $g(x) = O(f(x))$, and $f(x) = \Theta(g(x))$ if both $f(x) = O(g(x))$ and $f(x) = \Omega(g(x))$. The notation $f(x) = o(g(x))$ means that

$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = 0.$$

We write $f(x) \sim g(x)$ to mean $f(x) = (1 + o(1))\,g(x)$.

For example, if $f(x) = 3x + 7$, then

$f(x) = O(x)$,

$f(x) = O(x^2)$,

$f(x) = o(x^2)$,

$f(x) = \Omega(x)$,

$f(x) = \Theta(x)$, and $f(x) \sim 3x$.

From the Taylor series expansion

$$-\log(1 - x^{-1}) = x^{-1} + x^{-2}/2 + x^{-3}/3 + \cdots$$

we can deduce the following statements, each of which is stronger than all of its predecessors:

$-\log(1 - x^{-1}) = O(x^{-1})$

$-\log(1 - x^{-1}) = \Theta(x^{-1})$

$-\log(1 - x^{-1}) \sim x^{-1}$

$-\log(1 - x^{-1}) = x^{-1} + O(x^{-3/2})$

$-\log(1 - x^{-1}) = x^{-1} + o(x^{-3/2})$

$-\log(1 - x^{-1}) = x^{-1} + O(x^{-2})$.
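The strongest of these statements can be checked numerically. The sketch below is only an illustration with hypothetical function names: it evaluates the remainder after the leading term, scaled by x^2, which by the Taylor series should approach 1/2 as x grows. A bounded scaled remainder is exactly what the O(x^{-2}) error term asserts.

```python
import math


def neg_log_one_minus_inverse(x: float) -> float:
    """-log(1 - 1/x), computed with log1p for accuracy at large x."""
    return -math.log1p(-1.0 / x)


def scaled_remainder(x: float) -> float:
    """x^2 * (-log(1 - 1/x) - 1/x); tends to 1/2 as x grows."""
    return (neg_log_one_minus_inverse(x) - 1.0 / x) * x * x


for x in (10.0, 100.0, 1e4):
    print(x, neg_log_one_minus_inverse(x) * x, scaled_remainder(x))
```

The second printed column tends to 1, illustrating the asymptotic equivalence with x^{-1}, while the third stays bounded near 1/2.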

A.2 Symbols

R                                  the set of real numbers
$\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$ (script letters)   sets of points or integers
diam $\mathcal{A}$                 the diameter of the set $\mathcal{A}$
vol $\mathcal{A}$                  the volume of the set $\mathcal{A}$
vol$_{d-1}$ $\mathcal{A}$          the (d - 1)-dimensional volume of the set $\mathcal{A}$
$\partial\mathcal{A}$              the relative boundary of $\mathcal{A}$
$\mathcal{X}_n$                    a set of n points chosen at random from $R^d$
conv $\mathcal{X}_n$               the convex hull of $\mathcal{X}_n$ or its boundary
aff $\mathcal{X}_n$                the affine hull of $\mathcal{X}_n$
vert $\mathcal{X}_n$               the set of vertices of the convex hull of $\mathcal{X}_n$
P                                  a polytope
                                   a face or facet of a polytope
                                   a halfspace
f(P)                               the f-vector of a polytope
$f_k(P)$                           the number of k-faces of a polytope
$EV_n$                             the expected number of convex hull vertices
$EF_n$                             the expected number of convex hull facets
$EH_n$                             the expected number of convex hull faces
$ES_n$                             the expected number of Voronoi dual simplices
log n                              the base-e (natural) logarithm of n
lg n                               the base-2 logarithm of n
F                                  the probability content of a halfspace, quadrant, or ball
$\kappa$, $\mu$, $\nu$, $\tau$     see §A.5
$x$, $x_1$                         a point (vector) in $R^d$
$x^{(1)}$, $x_1^{(1)}$             the first coordinate of a point in $R^d$
$X$, $X_1$                         a random point in $R^d$
$X^{(1)}$, $X_1^{(1)}$             the first coordinate of a random point in $R^d$
$\|x\|$                            the length of the vector x
dist(x, y)                         the distance between x and y
(x, y)                             the inner product of x and y
simp($x_1, x_2, \ldots, x_d$)      the volume of the simplex formed by the points $x_1$ through $x_d$
esimp(.)                           the expected volume of the simplex formed by $x_1$ through $x_d$
i.i.d.                             independent and identically distributed
i.u.d.                             independent and uniformly distributed

A.3 The Gamma and Beta Functions

The Gamma function is an extension of the factorial function to the real numbers. It is defined by

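$$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\, dt.$$

If n is an integer, then $n! = \Gamma(n + 1)$. Other useful facts are

$$\Gamma(x + 1) = x\,\Gamma(x); \qquad \Gamma(1/2) = \sqrt{\pi}; \qquad \Gamma(x)\,\Gamma(x + 1/2) = 2^{1-2x} \sqrt{\pi}\; \Gamma(2x).$$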
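These identities (the recurrence, the value at 1/2, and the duplication formula) are easy to spot-check numerically. The sketch below uses `math.gamma` from the Python standard library; the helper name and the tolerance are illustrative choices, and a numerical check at a few points is of course not a proof.

```python
import math


def gamma_identities_hold(x: float, rel_tol: float = 1e-10) -> bool:
    """Check the three Gamma-function facts at a single point x > 0."""
    recurrence = math.isclose(
        math.gamma(x + 1), x * math.gamma(x), rel_tol=rel_tol)
    half_value = math.isclose(
        math.gamma(0.5), math.sqrt(math.pi), rel_tol=rel_tol)
    duplication = math.isclose(
        math.gamma(x) * math.gamma(x + 0.5),
        2 ** (1 - 2 * x) * math.sqrt(math.pi) * math.gamma(2 * x),
        rel_tol=rel_tol)
    return recurrence and half_value and duplication


print(all(gamma_identities_hold(x) for x in (0.3, 1.0, 2.5, 7.25)))
```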

The function P(a + 1,x) = tae -t dt

is known as the incomplete Gamma function. Tricomi [95, §4.3] gives the following asymptotic

expansion. N

F(a + 1' x) = zae -z E(-1) j (-a)yx] for a > -1 as x _ oo. (A.1) y_>0

The notation $a^{\underline{j}}$ represents the so-called falling factorial, the product

$$a(a-1)(a-2)\cdots(a-j+1)$$

of j factors. The Beta function is defined by

$$B(m, n) = \frac{\Gamma(m)\,\Gamma(n)}{\Gamma(m + n)} = \int_0^1 x^{m-1} (1 - x)^{n-1}\, dx.$$
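The agreement between the Gamma-function form and the integral form of the Beta function can be illustrated numerically. The following is a minimal sketch with hypothetical function names; the integral is approximated by a plain midpoint rule, which is adequate for arguments of at least 1 but is not meant as a production-quality quadrature.

```python
import math


def beta_via_gamma(m: float, n: float) -> float:
    """Beta function from the ratio of Gamma functions."""
    return math.gamma(m) * math.gamma(n) / math.gamma(m + n)


def beta_via_integral(m: float, n: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the defining integral on [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** (m - 1) * (1.0 - x) ** (n - 1)
    return total * h


print(beta_via_gamma(2, 3), beta_via_integral(2, 3))
```

For m = 2 and n = 3 both forms give 1/12, since the integral of x(1 - x)^2 over [0, 1] is 1/2 - 2/3 + 1/4.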

A.4 Slowly Varying Functions

Definition A.1 A function $L(x)$ is slowly varying as $x \to \infty$ if for all positive $\lambda$,

$$\lim_{x \to \infty} \frac{L(\lambda x)}{L(x)} = 1.$$

Perhaps the most obvious example of a slowly varying function is the logarithm function. It is easily verified that the slowly varying functions are closed under addition, subtraction, multiplication, and division. Feller [41, pp. 274 et seq.] presents relevant aspects of the theory of slow and regular variation, including the following representation lemma.

Lemma A.2 A slowly varying function $L(x)$ can be expressed in the form

$$L(x) = a(x) \exp\left( \int_1^x \frac{\varepsilon(t)}{t}\, dt \right)$$

with $a \to a_0$ and $\varepsilon \to 0$ as $x \to \infty$.

In fact we can generally choose $a(x) = 1$ and $\varepsilon(t) = tL'(t)/L(t)$. Feller [41, p. 421] also proves a more general form of the following lemma on Laplace transforms involving slowly varying functions.

Lemma A.3 If $k > 0$ and L is slowly varying, then as $n \to \infty$,

$$\int_0^\infty e^{-nw}\, d\left[ w^k L(1/w) \right] \sim k!\, n^{-k} L(n).$$

The lemma still holds if the upper limit of integration is bounded but fixed.

A.5 Geometric Constants and Functions

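We write $\mu_d$ and $\kappa_d$ for the volume and surface area of the unit d-ball (the ball of radius one in d dimensions). It is well known [89] that

$$\mu_d = \frac{2\pi^{d/2}}{d\,\Gamma(d/2)} \quad \text{and} \quad \kappa_d = d\mu_d = \frac{2\pi^{d/2}}{\Gamma(d/2)}. \qquad \text{(A.2)}$$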
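As a quick sanity check of (A.2), the formulas for the volume and surface area of the unit d-ball can be evaluated in low dimensions, where the answers are familiar. The function names below are hypothetical; only `math.gamma` and `math.pi` from the standard library are used.

```python
import math


def unit_ball_volume(d: int) -> float:
    """Volume of the unit ball in d dimensions: 2 pi^(d/2) / (d Gamma(d/2))."""
    return 2 * math.pi ** (d / 2) / (d * math.gamma(d / 2))


def unit_ball_surface_area(d: int) -> float:
    """Surface area of the unit ball in d dimensions: d times its volume."""
    return d * unit_ball_volume(d)


for d in (1, 2, 3, 4):
    print(d, unit_ball_volume(d), unit_ball_surface_area(d))
```

For d = 2 this gives area pi and perimeter 2 pi; for d = 3 it gives volume 4 pi / 3 and surface area 4 pi.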

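We write $\kappa(r)$ for the fraction of the surface area of a unit d-sphere cut off by a plane at distance r from its center. The dimension d is generally clear from context. We have

$$
\begin{aligned}
\kappa(r) &= \frac{\kappa_{d-1}}{\kappa_d} \int_r^1 (1 - u^2)^{(d-3)/2}\, du \\
&= \frac{\kappa_{d-1}}{\kappa_d} \int_0^{1-r} (2 - t)^{(d-3)/2}\, t^{(d-3)/2}\, dt \\
&\sim \frac{\kappa_{d-1}}{\kappa_d}\, 2^{(d-3)/2} \int_0^{1-r} t^{(d-3)/2}\, dt \quad \text{as } r \to 1 \\
&= \frac{\kappa_{d-1}\, 2^{(d-1)/2}}{\kappa_d\, (d-1)}\, (1 - r)^{(d-1)/2}. \qquad \text{(A.3)}
\end{aligned}
$$

Also $\kappa(0) = 1/2$, thus $\kappa(r) \sim (1/2)(1 - r)^{(d-1)/2}$ as $r \to 0$. By showing $\kappa(r)/(1 - r)^{(d-1)/2}$ to be monotonic, it can easily be shown that

$$\frac{1}{2}(1 - r)^{(d-1)/2} \le \kappa(r) \le \frac{\kappa_{d-1}\, 2^{(d-1)/2}}{\kappa_d\, (d-1)}\, (1 - r)^{(d-1)/2} \quad \text{for } 0 \le r < 1.$$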

(For d = 2, the inequalities are reversed; for d = 3, equality is achieved.) Similarly, we write $\mu(r)$ for the fraction of the volume of a unit d-sphere cut off by a plane at distance r from its center. By a similar argument one can show that

$$\frac{1}{2}(1 - r)^{(d+1)/2} \le \mu(r) \le \frac{\mu_{d-1}\, 2^{(d+1)/2}}{\mu_d\, (d+1)}\, (1 - r)^{(d+1)/2} \quad \text{for } 0 \le r < 1 \text{ and } d \ge 2. \qquad \text{(A.4)}$$
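The two-sided cap bounds can be checked numerically. The sketch below assumes the reading of (A.3) and (A.4) in which the cap fraction is a ratio of unit-ball constants (surface areas for the surface fraction, volumes for the volume fraction) times an integral of a power of 1 - u^2 from r to 1; all function names are hypothetical, and the midpoint rule is used only for illustration. The surface check is restricted to d of at least 3, where the integrand is bounded.

```python
import math


def sphere_area(d: int) -> float:
    """Surface area of the unit ball in d dimensions."""
    return 2 * math.pi ** (d / 2) / math.gamma(d / 2)


def ball_volume(d: int) -> float:
    """Volume of the unit ball in d dimensions."""
    return sphere_area(d) / d


def _integral(power: float, r: float, steps: int = 20000) -> float:
    """Midpoint rule for the integral of (1 - u^2)^power over [r, 1]."""
    h = (1.0 - r) / steps
    return h * sum((1.0 - (r + (i + 0.5) * h) ** 2) ** power
                   for i in range(steps))


def cap_area_fraction(r: float, d: int) -> float:
    return sphere_area(d - 1) / sphere_area(d) * _integral((d - 3) / 2, r)


def cap_volume_fraction(r: float, d: int) -> float:
    return ball_volume(d - 1) / ball_volume(d) * _integral((d - 1) / 2, r)


def area_bounds(r: float, d: int):
    """Lower and upper bounds on the surface fraction (valid for d > 3)."""
    scale = (1.0 - r) ** ((d - 1) / 2)
    upper = sphere_area(d - 1) * 2 ** ((d - 1) / 2) / (sphere_area(d) * (d - 1))
    return 0.5 * scale, upper * scale


def volume_bounds(r: float, d: int):
    """Lower and upper bounds on the volume fraction (d >= 2)."""
    scale = (1.0 - r) ** ((d + 1) / 2)
    upper = ball_volume(d - 1) * 2 ** ((d + 1) / 2) / (ball_volume(d) * (d + 1))
    return 0.5 * scale, upper * scale


print(area_bounds(0.5, 4), cap_area_fraction(0.5, 4))
```

In dimension 3 the surface fraction reduces to (1 - r)/2, which is the equality case mentioned in the text.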

$\tau_d$ represents the volume of a regular d-simplex inscribed in a unit d-ball. We can verify that

$$\tau_d = \frac{(d + 1)^{(d+1)/2}}{d^{d/2}\, d!} \qquad \text{(A.5)}$$

by an inductive argument. Obviously $\tau_1 = 2$, and this agrees with our formula. Now consider the d-dimensional case. Let h be the altitude of the regular d-simplex, and let $A_d$ be the area of one of its d + 1 facets. The simplex can be decomposed into d + 1 simplices formed by one of the facets as base and the center of the simplex as apex, so $(d + 1)(h - 1)A_d = hA_d$ and $h = 1 + 1/d$. It follows that $A_d = \tau_{d-1}(1 - (1/d)^2)^{(d-1)/2}$ and

$$\tau_d = hA_d/d = (1/d)(1 + 1/d)\, \tau_{d-1}\, (1 - 1/d^2)^{(d-1)/2}.$$

We can easily verify that (A.5) satisfies this recurrence.
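The verification can also be carried out numerically. The sketch below, with hypothetical function names, compares the closed form of (A.5), read as (d + 1)^((d+1)/2) / (d^(d/2) d!), with the value obtained by iterating the recurrence from the base case of a segment of length 2 in dimension 1.

```python
import math


def simplex_volume_closed_form(d: int) -> float:
    """Volume of a regular d-simplex inscribed in the unit d-ball, per (A.5)."""
    return (d + 1) ** ((d + 1) / 2) / (d ** (d / 2) * math.factorial(d))


def simplex_volume_by_recurrence(d: int) -> float:
    """Same quantity, built up from dimension 1 with the recurrence
    tau_k = (1/k)(1 + 1/k) tau_{k-1} (1 - 1/k^2)^((k-1)/2)."""
    tau = 2.0  # dimension 1: the segment [-1, 1]
    for k in range(2, d + 1):
        tau = (1 / k) * (1 + 1 / k) * tau * (1 - 1 / k ** 2) ** ((k - 1) / 2)
    return tau


for d in range(1, 6):
    print(d, simplex_volume_closed_form(d), simplex_volume_by_recurrence(d))
```

In dimension 2 both give the area of the inscribed equilateral triangle, 3 sqrt(3) / 4, and in dimension 3 the volume of the inscribed regular tetrahedron, 8 sqrt(3) / 27.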

Another useful constant is denoted by $\nu_d$. Miles [70] proves a very general theorem about the expectation and higher moments of the volume of the convex hull of i points chosen from the uniform distribution on the interior of the d-ball plus j points chosen from the uniform distribution on its surface. We require only the special case of the expected volume of the simplex formed by d + 1 points chosen from the surface of the d-ball. This value is

$$\nu_d = \frac{\Gamma((d^2 + 1)/2)\; \Gamma(d/2)^{d+1}}{\sqrt{\pi}\; d!\; \Gamma(d^2/2)\; \Gamma((d + 1)/2)^{d}}. \qquad \text{(A.6)}$$

As special cases, we have $\nu_2 = 3/(2\pi) \approx 0.48$ and $\nu_3 = 4\pi/105 \approx 0.12$.
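The two special cases provide a useful check on (A.6). The sketch below assumes that (A.6) reads as a ratio with Gamma((d^2 + 1)/2) Gamma(d/2)^(d+1) in the numerator and sqrt(pi) d! Gamma(d^2/2) Gamma((d + 1)/2)^d in the denominator; that reading is an assumption on our part, supported by the fact that it reproduces both quoted values. The function name is hypothetical.

```python
import math


def expected_inscribed_simplex_volume(d: int) -> float:
    """Expected volume of the simplex spanned by d + 1 points drawn
    uniformly from the surface of the unit d-ball, per the assumed
    reading of (A.6)."""
    numerator = math.gamma((d * d + 1) / 2) * math.gamma(d / 2) ** (d + 1)
    denominator = (math.sqrt(math.pi) * math.factorial(d)
                   * math.gamma(d * d / 2) * math.gamma((d + 1) / 2) ** d)
    return numerator / denominator


for d in (2, 3, 4):
    print(d, expected_inscribed_simplex_volume(d))
```

For d = 2 this evaluates to 3/(2 pi) and for d = 3 to 4 pi / 105, matching the special cases stated in the text.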

Bibliography

[1] A. Appel and P. M. Will. Determining the three-dimensional convex hull of a polyhedron. IBM J. Res. Devel., 20:590-601, 1976.

[2] D. Avis and B. K. Bhattacharya. Algorithms for computing d-dimensional Voronoi diagrams and their duals. In F. P. Preparata, editor, Advances in Computing Research: Computational Geometry, pages 159-180, JAI Press Inc., Greenwich, Conn., 1983.

[3] I. Bárány and D. G. Larman. Convex bodies, economic cap coverings, random polytopes. Mathematika, to appear, June 1988.

[4] J. L. Bentley, H. T. Kung, M. Schkolnick, and C. D. Thompson. On the average number of maxima in a set of vectors. J. Assoc. Comput. Mach., 25:536-543, 1978.

[5] J. L. Bentley and M. I. Shamos. Divide and conquer for linear expected time. Info. Proc. Letters, 7(2):87-91, February 1978.

[6] J. L. Bentley, B. W. Weide, and A. C. Yao. Optimal expected-time algorithms for closest-point problems. ACM Transactions on Mathematical Software, 6(4):563-580, December 1980.

[7] W. H. Beyer. CRC Standard Mathematical Tables. CRC Press, Boca Raton, Fla., 27th edition 1984.

[8] B. Bhattacharya. Worst-case analysis of a convex hull algorithm. 1982. Simon Fraser U.

[9] B. Bollobás and I. Simon. Repeated random insertions into a priority queue. J. Algo., 6(4):466-477, 1985.

[10] B. N. Boots and D. J. Murdoch. The spatial arrangement of random Voronoi polygons. Computers & Geosciences, 9(3):351-366, 1983.

[11] A. Bowyer. Computing Dirichlet tessellations. Computer J., 24(2):162-166, 1981.

[12] A. Brøndsted. An Introduction to Convex Polytopes. Springer-Verlag, New York, 1983.

[13] K. Q. Brown. Geometric Transforms for Fast Geometric Algorithms. PhD thesis, Carnegie-Mellon U., 1979.

[14] W. Brostow, J.-P. Dussault, and B. L. Fox. Construction of Voronoi polyhedra. J. Computational Physics, 29:81-97, 1978.

[15] C. Buchta. Zufällige Polyeder: eine Übersicht. In E. Hlawka, editor, Zahlentheoretische Analysis, pages 1-13, Springer-Verlag, 1985.

[16] C. Buchta and F. Müller. Random polytopes in a ball. J. Appl. Prob., 21:753-762, 1984.

[17] C. Buchta, F. Müller, and R. F. Tichy. Stochastical approximation of convex bodies. Math. Annal., 271(2):225-235, April 1985.

[18] H. Carnal. Die konvexe Hülle von n rotationssymmetrisch verteilten Punkten. Z. Wahrscheinlichkeitstheorie u. verw. Geb., 15:168-176, 1970.

[19] D. R. Chand and S. S. Kapur. An algorithm for convex polytopes. J. Assoc. Comput. Mach., 17(1):78-86, January 1970.

[20] K. L. Clarkson. Linear programming in $O(n 3^{d^2})$ time. Info. Proc. Letters, 22(1):21-24, 1986.

[21] B. Delaunay. Sur la sphère vide. Bull. Acad. Sc. USSR, Classe Sci. Mat. Nat., 7(1):793-800, May 1934.

[22] L. P. Devroye. How to reduce the average complexity of convex hull finding algorithms. Computers and Mathematics with Applications, 7:299-308, 1981.

[23] L. P. Devroye. Moment inequalities for random variables in computational geometry. Computing, 30:111-119, 1983.

[24] L. P. Devroye. A note on finding convex hulls via maximal vectors. Info. Proc. Letters, 11(1):53-56, August 1980.

[25] L. P. Devroye. On the computer generation of random convex hulls. Computers and Mathematics with Applications, 8(1):1-13, 1982.

[26] L. P. Devroye and G. T. Toussaint. Elimination algorithms for finding convex hulls in linear expected running time with radial densities. 1978.

[27] L. P. Devroye and G. T. Toussaint. A note on linear expected time algorithms for finding convex hulls. Computing, 26:361-366, 1981.

[28] R. A. Dwyer. A faster divide-and-conquer algorithm for constructing Delaunay triangulations. Algorithmica, 2(2):137-151, 1987.

[29] R. A. Dwyer. Higher-Dimensional Voronoi Diagrams in Linear Expected Time. Technical Report CMU-CS-88-100, Carnegie-Mellon University, February 1988.

[30] R. A. Dwyer. On the Convex Hull of Random Points in a Polytope. Technical Report CMU-CS-87-118R, Carnegie-Mellon University, October 1987.

[31] R. A. Dwyer. A simple divide-and-conquer algorithm for constructing Delaunay triangulations in O(n log log n) expected time. In Proc. 2nd Ann. Symp. on Computational Geometry, pages 276-284, ACM, June 1986.

[32] R. A. Dwyer and R. Kannan. Convex hull of randomly chosen points in a polytope. In Proc. Int'l Workshop on Parallel Algorithms, Suhl, E. Germany, 1987.

[33] M. E. Dyer. On a multi-dimensional search technique and its application to the Euclidean one-centre problem. SIAM J. Comput., 15(3):725-738, August 1986.

[34] M. E. Dyer and A. M. Frieze. A randomized algorithm for fixed-dimensional linear programming. 1987. manuscript.

[35] M. E. Dyer and L. G. Proll. An algorithm for determining all extreme points of a . Math. Program., 12(1):97-101, February 1977.

[36] W. F. Eddy. The Convex Hull of a Spherically-Symmetric Sample. Technical Report 182, Dept. of Statistics, Carnegie-Mellon U., August 1980.

[37] W. F. Eddy. The distribution of the convex hull of a Gaussian sample. J. Appl. Prob., 17:686-695, 1980.

[38] W. F. Eddy. A new convex hull algorithm for planar sets. ACM Trans. Math. Softw., 3(4):398-403, 1977.

[39] B. Efron. The convex hull of a random set of points. Biometrika, 52:331-342, 1965.

[40] W. Feller. An Introduction to Probability Theory and Its Applications. Volume I, John Wiley and Sons, Inc., third edition 1968.

[41] W. Feller. An Introduction to Probability Theory and Its Applications. Volume II, John Wiley and Sons, Inc., second edition 1971.

[42] J. L. Finney. A procedure for the construction of Voronoi polyhedra. J. Computational Physics, 32:137-143, 1979.

[43] L. Fisher. Limiting convex hulls of samples: theory and function space examples. Z. Wahrscheinlichkeitstheorie u. verw. Geb., 18:281-197, 1971.

[44] L. Fisher. Limiting sets and convex hulls of samples from product measures. Ann. Math. Stat., 40(5):1824-1832, 1969.

[45] S. Fortune. A sweepline algorithm for Voronoi diagrams. Algorithmica, 2(2):152-174, 1987.

[46] A. M. Frieze. On the random construction of heaps. 1987. manuscript.

[47] E. N. Gilbert. Random subdivisions of space into crystals. Ann. Math. Stat., 33:958-972, 1962.

[48] P. J. Green and R. Sibson. Computing Dirichlet tessellations in the plane. Computer J., 21(2):168-173, 1978.

[49] L. J. Guibas and J. Stolfi. Primitives for the manipulation of general subdivisions and the computation of Voronoi diagrams. ACM Trans. Graphics, 4:74-123, 1985.

[50] R. N. Horspool. Constructing the Voronoi diagram in the plane. Technical Report SOCS-79.12, McGill University, School of Computer Science, July 1979.

[51] F. K. Hwang. An O(nlogn) algorithm for rectilinear minimal spanning trees. J. Assoc. Comput. Mach., 26:177-182, 1979.

[52] G. H. Johansen and C. Gram. A simple algorithm for building the 3-d convex hull. BIT, 23:146-160, 1983.

[53] A. Jozwik. A method for solving the n-dimensional convex hull problem. Patt. Recog. Letters, 2(1):23-25, October 1983.

[54] M. Kallay. Convex hull algorithms in higher dimensions. 1981. Dept. Math., U. Okla.

[55] J. Katajainen. Bucketing and Filtering in Computational Geometry. PhD thesis, Dept. of Mathematical Sciences, U. of Turku, Finland, 1987.

[56] M. G. Kendall. A Course in the Geometry of n Dimensions. Hafner Pub. Co., New York, 1961.

[57] H. T. Kung, F. Lucio, and F. P. Preparata. On finding the maxima of a set of vectors. J. Assoc. Comput. Mach., 22(4):469-476, October 1975.

[58] C. L. Lawson. Software for $C^1$ surface interpolation. In J. R. Rice, editor, Mathematical Software III, page 194, Academic Press, 1977.

[59] D. T. Lee. Two-dimensional Voronoi diagrams in the Lp-metric. J. Assoc. Comput. Mach., 27:604-618, 1980.

[60] D. T. Lee and R. L. Drysdale, III. Generalizations of Voronoi diagrams in the plane. SIAM J. Comput., 10(1):73-87, February 1981.

[61] D. T. Lee and F. P. Preparata. Computational geometry: a survey. IEEE Transactions on Computers, C-33(12):1072-1101, December 1984.

[62] D. T. Lee and B. Schachter. Two algorithms for constructing Delaunay triangulations. Int. J. Comput. Inform. Sci., 9:219-242, 1980.

[63] D. T. Lee and C. K. Wong. Voronoi diagrams in $L_1$ ($L_\infty$) metrics with two-dimensional storage applications. SIAM J. Comput., 9:200-211, 1980.

[64] G. K. Manacher and A. L. Zobrist. Neither the greedy nor the Delaunay triangulation of a planar point set approximates the optimal triangulation. Info. Proc. Letters, 9(1):31-34, July 1979.

[65] A. Maus. Delaunay triangulation and the convex hull of n points in expected linear time. BIT, 24:151-163, 1984.

[66] D. H. McLain. Two-dimensional interpolation from random data. Computer J., 19(2):178-181, May 1976.

[67] N. Megiddo. Linear programming in linear time when the dimension is fixed. J. Assoc. Comput. Mach., 31(1):114-127, January 1984.

[68] K. Mehlhorn. Data Structures and Algorithms 3: Multi-dimensional Searching and Computational Geometry. Volume 3 of EATCS Monographs on Theoretical Computer Science, Springer-Verlag, 1984.

[69] J. L. Meijering. Interface area, edge length, and number of vertices in crystal aggregates with random nucleation. Philips Res. Rep., 8:270-290, 1953.

[70] R. E. Miles. Isotropic random simplices. Adv. Appl. Prob., 3:353-382, 1971.

[71] T. Ohya, M. Iri, and K. Murota. Improvements of the incremental methods for the Voronoi diagram with computational comparison of various algorithms. J. Oper. Res. Soc. Japan, 27(4):306-337, December 1984.

[72] F. P. Preparata and S. J. Hong. Convex hull of finite sets of points in two and three dimensions. Comm. Assoc. Comput. Mach., 20(2):87-93, February 1977.

[73] F. P. Preparata and M. I. Shamos. Computational Geometry: An Introduction. Springer-Verlag, 1985.

[74] H. Raynaud. Sur l'enveloppe convexe des nuages des points aléatoires dans $R^n$, I. J. Appl. Prob., 7:35-48, 1970.

[75] A. Rényi and R. Sulanke. Über die konvexe Hülle von n zufällig gewählten Punkten. Z. Wahrscheinlichkeitstheorie u. verw. Geb., 2:75-84, 1963.

[76] A. Rényi and R. Sulanke. Über die konvexe Hülle von n zufällig gewählten Punkten, II. Z. Wahrscheinlichkeitstheorie u. verw. Geb., 3:138-147, 1964.

[77] A. Rényi and R. Sulanke. Zufällige konvexe Polygone in einem Ringgebiet. Z. Wahrscheinlichkeitstheorie u. verw. Geb., 9:146-157, 1968.

[78] L. A. Santaló. Integral Geometry and Geometric Probability. Volume 1 of Encyclopedia of Mathematics and Its Applications, Addison-Wesley, Reading, MA, 1976.

[79] L. A. Santaló. An Introduction to Integral Geometry. Volume 1198 of Actualités Scientifiques et Industrielles, Hermann et Cie., Paris, 1953.

[80] R. Seidel. The complexity of Voronoi diagrams in higher dimensions. In Proc. 20th Annual Allerton Conference on Communication, Control, and Computing, pages 94-95, University of Illinois at Urbana-Champaign, October 1982.

[81] R. Seidel. Constructing higher-dimensional convex hulls at logarithmic cost per face. In Proc. 18th ACM Symp. on Theory of Computing, pages 404-413, ACM, May 1986.

[82] R. Seidel. A convex hull algorithm optimal for points in even dimensions. Master's thesis, U. British Columbia, 1981.

[83] R. Seidel. On the number of faces in higher-dimensional Voronoi diagrams. In Proc. 3rd Ann. Symp. on Computational Geometry, pages 181-185, ACM, June 1987.

[84] M. I. Shamos. Computational Geometry. PhD thesis, Dept. of Computer Science, Yale U., August 1978.

[85] M. I. Shamos. Geometric complexity. In Proc. 7th Symp. on Theory of Computing, pages 224-233, ACM, May 1975.

[86] M. I. Shamos and D. Hoey. Closest-point problems. In Proc. 16th Symp. on Foundations of Computer Science, pages 151-162, ACM, 1975.

[87] R. Sibson. Locally equiangular triangulations. Computer J., 21(3):243-245, August 1978.

[88] S. W. Sloan and G. T. Houlsby. An implementation of Watson's algorithm for computing 2-dimensional Delaunay triangulations. Advances in Engineering Software, 6:192-197, 1984.

[89] D. McL. Y. Sommerville. An Introduction to the Geometry of n Dimensions. Dover, New York, 1958.

[90] G. Swart. Finding the convex hull facet by facet. J. Algorithms, 6:17-48, 1985.

[91] M. Tanemura, T. Ogawa, and N. Ogita. A new algorithm for three-dimensional Voronoi tessellation. J. Computational Physics, 51(2):191-207, August 1983.

[92] R. E. Tarjan. Data Structures and Network Algorithms. Volume 44 of CBMS-NSF Regional Conference Series in Applied Mathematics, Society of Industrial and Applied Mathematics, Philadelphia, 1983.

[93] A. H. Thiessen. Precipitation averages for large areas. Monthly Weather Review, 39:1082-1084, July 1911.

[94] G. T. Toussaint, S. G. Akl, and L. P. Devroye. Efficient convex hull algorithms for points in two and more dimensions. Technical Report SOCS 78.5, McGill U., 1978.

[95] F. G. Tricomi. Funzioni Ipergeometriche Confluente. Edizioni Cremonese, Rome, 1954.

[96] G. Voronoi. Nouvelles applications des paramètres continus à la théorie des formes quadratiques, deuxième mémoire: recherches sur les parallélloèdres primitifs. J. reine u. angew. Math., 134(1):198-287, May 1908.

[97] D. F. Watson. Computing the n-dimensional Delaunay tessellation with application to Voronoi polytopes. Computer J., 24(2):167-172, 1981.

[98] B. W. Weide. Statistical Methods in Algorithm Design and Analysis. PhD thesis, Carnegie-Mellon U., August 1978.

[99] E. Wigner and F. Seitz. On the constitution of metallic sodium. Physical Reviews, 43:804-810, 1933.
