
An Approach to Conic Sections

Jia. F. Weng 1998

1 Introduction.

The study of conic sections can be traced back to the ancient Greek mathematicians, usually to Apollonius (ca. 220-190 BC) [2]. The name ‘conic section’ comes from the fact that the principal types of conic sections, known as ellipses, hyperbolas and parabolas, are generated by cutting a cone with a plane. However, most modern calculus textbooks depart from this geometric approach. Instead, conic sections are defined as certain loci and studied through analytic geometry. In this paper we present a new approach in which conic sections are defined as the intersections of two right circular cones. The vertices of the two cones then become the inherent foci of the conic section, and a directrix is associated with each inherent focus. All known properties of conic sections still hold for the inherent foci and their associated directrixes in this new approach. Moreover, when a conic section and its foci and directrixes in space are projected to a horizontal plane, they become the ones discussed in planar geometry. This new approach seems simpler and more natural than the classical geometric and analytic approaches for defining conic sections and proving their properties. In the last section we show an application of the new approach to network design in the mining industry. In the Appendix we derive the standard equation of a conic section with respect to the foci that lie on the cutting plane, referred to as the coplanar foci of the conic section. The author does not know of any textbook that gives such a derivation.

2 Intersections of Two Right Circular Cones.

Let $x_P, y_P, z_P$ denote the Cartesian coordinates of a point P. Suppose P and Q are two distinct points. The length of segment PQ is denoted by l(PQ), and the gradient of PQ is denoted by g(PQ), which is defined as

$$ g(PQ) = \frac{|z_Q - z_P|}{\sqrt{(x_Q - x_P)^2 + (y_Q - y_P)^2}}. \tag{1} $$

Let C(P; m) denote a (double-napped) right circular cone whose vertex is P and whose generating lines have gradient m (m > 0). The angle β formed by the axis of C(P; m) and the generating lines is referred to as the generating angle of the cone. Clearly, $\tan\beta = 1/m$.

Now suppose A and B are two distinct points in space. The intersection of cones C(A; m) and C(B; m) is denoted by C(A, B; m). Assume the horizontal distance between A and B is 2u while the vertical distance between A and B is 2h. Then, after a transformation we may assume A = (u, 0, h), B = (−u, 0, −h), u ≥ 0, h ≥ 0 (Fig. 1). Hence the equations describing C(A; m) and C(B; m) are

$$ (x - u)^2 + y^2 = \frac{(z - h)^2}{m^2}, \qquad (x + u)^2 + y^2 = \frac{(z + h)^2}{m^2}. \tag{2} $$

If h = 0, then the intersection C(A, B; m) lies trivially in the YZ-plane. Assume h ≠ 0. Subtracting the first equation from the second, we have

$$ z = \frac{u}{h}\, m^2 x = \frac{m^2}{k}\, x, \tag{3} $$

where k = h/u = g(AB). It follows that C(A, B; m) lies on a plane $\tilde{P}$ which contains the Y-axis and meets the XY-plane at an angle α, referred to as the intersecting angle of the two cones. We call this planar curve C(A, B; m) a conic section, or simply a conic. In particular, if g(AB) ≥ m, C(A, B; m) is a closed curve and is called an ellipse (Fig. 1(1)); if g(AB) ≤ m, C(A, B; m) has two separate branches and is called a hyperbola (Fig. 1(2)). Trivially, in the degenerate case g(AB) = m, an ellipse becomes the line segment AB, and a hyperbola becomes two half-lines that are the extensions of AB. Moreover, there are two special cases:

1. If A, B lie in a vertical line, then g(AB) = ∞ and the ellipse is a circle lying in a horizontal plane.
2. If A, B lie in a horizontal plane, then g(AB) = 0 and the hyperbola lies in a vertical plane.

Figure 1: Intersections of two cones, panels (1) and (2).

Substituting z with the right side of (3), from (2) we obtain

$$ y = \pm\frac{\sqrt{(m^2u^2 - h^2)(m^2x^2 - h^2)}}{mh}. \tag{4} $$

Hence the parametric expression of C(A, B; m) is

$$ \left( x,\ \pm\frac{\sqrt{(m^2u^2 - h^2)(m^2x^2 - h^2)}}{mh},\ \frac{m^2u}{h}\, x \right). \tag{5} $$
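As a quick numerical sanity check of Equations (2) to (5), the following Python sketch evaluates the parametric expression (5) and confirms that the resulting point satisfies both cone equations in (2). The function names `conic_point` and `cone_residual` and the sample values of u, h, m are illustrative assumptions, not part of the original derivation.

```python
import math

def conic_point(x, u, h, m, sign=1.0):
    """Point of C(A, B; m) given by the parametric expression (5)."""
    radicand = (m * m * u * u - h * h) * (m * m * x * x - h * h)
    y = sign * math.sqrt(radicand) / (m * h)   # Equation (4)
    z = m * m * u * x / h                      # Equation (3)
    return (x, y, z)

def cone_residual(p, vertex, m):
    """Left side minus right side of a cone equation in (2)."""
    dx, dy, dz = (p[i] - vertex[i] for i in range(3))
    return dx * dx + dy * dy - dz * dz / (m * m)

# Hypothetical sample values: k = h/u = 2 > m gives an ellipse.
u, h, m = 1.0, 2.0, 1.0
A, B = (u, 0.0, h), (-u, 0.0, -h)
S = conic_point(0.5, u, h, m)
print(cone_residual(S, A, m), cone_residual(S, B, m))  # both close to zero
```

For an ellipse the parameter x must satisfy |x| ≤ h/m, and for a hyperbola |x| ≥ h/m; otherwise the radicand in (4) is negative.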

Let $V_A$ and $V_B$ be the intersections of C(A, B; m) with the XZ-plane that are close to A and B respectively (Fig. 1). Then by Equations (3) and (4)

$$ V_A = \left(\frac{h}{m},\ 0,\ mu\right), \qquad V_B = \left(-\frac{h}{m},\ 0,\ -mu\right). \tag{6} $$

Clearly, C(A, B; m) is symmetric with respect to two orthogonal lines: $V_AV_B$ and the Y-axis. Therefore, O is the center of the conic section C(A, B; m). When C(A, B; m) is an ellipse, let $R_1$ and $R_2$ be the intersections of C(A, B; m) with the positive and negative Y-axis respectively. Then again from Equation (4) we find

$$ R_1 = \left(0,\ \frac{u}{m}\sqrt{|k^2 - m^2|},\ 0\right), \qquad R_2 = \left(0,\ -\frac{u}{m}\sqrt{|k^2 - m^2|},\ 0\right). \tag{7} $$

When C(A, B; m) is a hyperbola, (7) also defines two points on the Y-axis. In the case of an ellipse, $V_AV_B$ is called the major axis while $R_1R_2$ is called the minor axis of the ellipse. In the case of a hyperbola, $V_AV_B$ is called the transverse axis while $R_1R_2$ is called the conjugate axis of the hyperbola.

From Equations (4) and (3), it is easy to derive that

$$ y'_x = \pm\left(\frac{m^2}{k^2} - 1\right)\frac{x}{y}, \qquad z'_x = \frac{m^2}{k}. $$

Therefore, the tangent vector t of the conic section C(A, B; m) is

$$ t = \left(1,\ \pm\left(\frac{m^2}{k^2} - 1\right)\frac{x}{y},\ \frac{m^2}{k}\right). \tag{8} $$

Remark 1 Note t is completely determined by m and k, independent of the coordinates of A and B.

Now suppose C(A, B; m) is a hyperbola. When x goes to infinity, by (4) y/x goes to $\pm\sqrt{m^2/k^2 - 1}$, and by (8) the tangent vector becomes

$$ t_\infty = \left(1,\ \pm\sqrt{\frac{m^2}{k^2} - 1},\ \frac{m^2}{k}\right). $$

The two lines through O in the directions of $t_\infty$ are the asymptotes of the hyperbola, which lie in the plane $\tilde{P}$ where the hyperbola lies.

Conic sections have two important properties: the constant sum/difference property and the reflective property. First we prove a lemma [5].

Lemma 2 Suppose the endpoint S of a line segment SA is perturbed in direction v. Let the angle between $\overrightarrow{SA}$ and v be θ. Then the directional derivative of l(SA) with respect to v is −cos θ.

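Proof: Suppose S moves to $S^*$ in direction v. Let $l_0 = l(SA)$, $l = l(S^*A)$, $\varepsilon = l(SS^*)$. Then

$$ l^2 = l_0^2 + \varepsilon^2 - 2\varepsilon l_0\cos\theta, \qquad 2l\,dl = 2(\varepsilon - l_0\cos\theta)\,d\varepsilon. $$

Note $l \to l_0$ when $\varepsilon \to 0$. Therefore

$$ l'_v = \lim_{\varepsilon\to 0}\frac{dl}{d\varepsilon} = -\cos\theta. $$

The lemma is proved.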
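Lemma 2 is easy to check numerically. The sketch below approximates the directional derivative of l(SA) by a central difference and compares it with −cos θ. The helper names and the sample points are assumptions made for illustration only.

```python
import math

def directional_derivative(S, A, v, eps=1e-6):
    """Central-difference derivative of l(SA) as S moves along the unit vector v."""
    plus = [S[i] + eps * v[i] for i in range(3)]
    minus = [S[i] - eps * v[i] for i in range(3)]
    return (math.dist(plus, A) - math.dist(minus, A)) / (2 * eps)

def cos_angle(p, q):
    """Cosine of the angle between vectors p and q."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.hypot(*p) * math.hypot(*q))

# Hypothetical sample data: S at the origin, A = (3, 4, 0), v along the X-axis.
S, A = (0.0, 0.0, 0.0), (3.0, 4.0, 0.0)
v = (1.0, 0.0, 0.0)
SA = tuple(A[i] - S[i] for i in range(3))
print(directional_derivative(S, A, v), -cos_angle(SA, v))  # both close to -0.6
```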

Theorem 3 (constant sum/difference) For any point S on an ellipse C(A, B; m), the sum of the distances from S to the vertices A and B is constant. For any point S on a hyperbola C(A, B; m), the difference of the distances from S to the vertices A and B is constant.

Proof: Without loss of generality we assume A = (u, 0, h), B = (−u, 0, −h) as before. Because g(AS) = g(BS) = m, when k ≥ m and C(A, B; m) is an ellipse,

$$ l(AS) + l(SB) = \sqrt{1 + m^{-2}}\,\bigl(|z_A - z_S| + |z_S - z_B|\bigr) = 2h\sqrt{1 + m^{-2}}. $$

The argument is similar for the case of C(A, B; m) being a hyperbola.
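The constant in Theorem 3 can be verified numerically. The following sketch, which assumes the normalized coordinates A = (u, 0, h), B = (−u, 0, −h) and uses hypothetical helper names, computes the sum and the absolute difference of the two distances for a point taken from the parametric expression (5) and compares them with $2h\sqrt{1 + m^{-2}}$.

```python
import math

def conic_point(x, u, h, m, sign=1.0):
    """Point of C(A, B; m) from the parametric expression (5)."""
    radicand = (m * m * u * u - h * h) * (m * m * x * x - h * h)
    return (x, sign * math.sqrt(radicand) / (m * h), m * m * u * x / h)

def sum_and_difference(x, u, h, m):
    """Return l(AS) + l(SB) and |l(AS) - l(SB)| for the point S with abscissa x."""
    A, B = (u, 0.0, h), (-u, 0.0, -h)
    S = conic_point(x, u, h, m)
    l_a, l_b = math.dist(S, A), math.dist(S, B)
    return l_a + l_b, abs(l_a - l_b)

def theorem_constant(h, m):
    """The constant 2h * sqrt(1 + m^-2) from the proof of Theorem 3."""
    return 2 * h * math.sqrt(1 + m ** -2)

# Ellipse (k = 2 > m = 1): the sum is constant.
print(sum_and_difference(0.5, 1.0, 2.0, 1.0)[0], theorem_constant(2.0, 1.0))
# Hyperbola (k = 0.5 < m = 1): the difference is constant.
print(sum_and_difference(1.5, 2.0, 1.0, 1.0)[1], theorem_constant(1.0, 1.0))
```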

This property completely characterizes ellipses and hyperbolas; therefore, we can redefine ellipses/hyperbolas to be planar curves that satisfy the constant sum/difference property. That is, an ellipse (or a hyperbola) is a planar curve such that the sum (or difference respectively) of the distances between any point on the curve and two fixed distinct points is constant. This property implies another property which is important in applications of conic sections.

Corollary 4 (reflection) For any point S on an ellipse or a hyperbola, the tangent line at S meets SA and SB at the same acute angle.

Proof: Let the acute angles between t and SA, SB be θ, φ respectively. By Lemma 2, $l'_t(SA) = -\cos\theta$ and $l'_t(SB) = -\cos(\pi - \varphi) = \cos\varphi$. For an ellipse, since l(SA) + l(SB) is constant, $l'_t(SA) + l'_t(SB) = -\cos\theta + \cos\varphi = 0$. Hence θ = φ. The argument is similar for hyperbolas.

Remark 5 This corollary can also be proved by Equation (8).
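In the spirit of Remark 5, the reflective property can be checked numerically from the tangent vector (8). The sketch below is only an illustration under the normalized coordinates of Section 2; the helper names are assumptions, and y is taken with its sign so that the ± in (8) is absorbed.

```python
import math

def conic_point(x, u, h, m, sign=1.0):
    """Point of C(A, B; m) from the parametric expression (5)."""
    radicand = (m * m * u * u - h * h) * (m * m * x * x - h * h)
    return (x, sign * math.sqrt(radicand) / (m * h), m * m * u * x / h)

def tangent(S, u, h, m):
    """Tangent vector (8) at S, with the sign carried by the coordinate y of S."""
    k = h / u
    x, y, _ = S
    return (1.0, (m * m / (k * k) - 1.0) * x / y, m * m / k)

def acute_angle(v, w):
    """Acute angle between the lines spanned by v and w."""
    dot = sum(a * b for a, b in zip(v, w))
    c = abs(dot) / (math.hypot(*v) * math.hypot(*w))
    return math.acos(min(1.0, c))

def reflection_angles(x, u, h, m):
    """Acute angles that the tangent at S makes with SA and SB."""
    A, B = (u, 0.0, h), (-u, 0.0, -h)
    S = conic_point(x, u, h, m)
    t = tangent(S, u, h, m)
    SA = tuple(A[i] - S[i] for i in range(3))
    SB = tuple(B[i] - S[i] for i in range(3))
    return acute_angle(t, SA), acute_angle(t, SB)

print(reflection_angles(0.5, 1.0, 2.0, 1.0))  # ellipse: two equal angles
print(reflection_angles(1.5, 2.0, 1.0, 1.0))  # hyperbola: two equal angles
```

The abscissa x must be chosen so that y ≠ 0, since (8) divides by y.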

Because of the reflective property, A and B are called the inherent foci of C(A, B; m). Let $\bar{A}$, $\bar{B}$ and $\bar{C}(\bar{A}, \bar{B})$ be the projections of A, B and C(A, B; m) on a horizontal plane respectively. Then any point $\bar{S}$ on $\bar{C}(\bar{A}, \bar{B})$ is the projection of a certain point S on C(A, B; m). Because

$$ l(\bar{A}\bar{S}) \pm l(\bar{S}\bar{B}) = \bigl(l(AS) \pm l(SB)\bigr)\sin\beta = \text{const}, $$

$\bar{C}(\bar{A}, \bar{B})$ is also an ellipse or a hyperbola with foci $\bar{A}$ and $\bar{B}$. Since $\bar{A}$ and $\bar{B}$ lie on the same plane where the curve $\bar{C}(\bar{A}, \bar{B})$ lies, we call them coplanar foci of the curve. Moreover, let $\bar{V}_A$, $\bar{V}_B$, $\bar{R}_1$ and $\bar{R}_2$ be the projections of $V_A$, $V_B$, $R_1$ and $R_2$ on the same plane respectively. Then $\bar{V}_A\bar{V}_B$ and $\bar{R}_1\bar{R}_2$ are the major and minor axes (or the transverse and conjugate axes) of the ellipse (or the hyperbola respectively) $\bar{C}(\bar{A}, \bar{B})$. The equation of $\bar{C}(\bar{A}, \bar{B})$ can be easily derived from Equation (4) as follows:

$$ \frac{x^2}{\bar{a}^2} + \frac{y^2}{\bar{b}^2} = 1, \tag{9} $$

where

$$ \bar{a}^2 = \frac{h^2}{m^2}, \qquad \bar{b}^2 = \frac{h^2 - m^2u^2}{m^2} = \frac{(k^2 - m^2)u^2}{m^2}. \tag{10} $$

Equation (9) is called the standard form of an ellipse or a hyperbola, depending on whether $\bar{b}^2 \ge 0$ or $\bar{b}^2 \le 0$, i.e. on k ≥ m or k ≤ m. Note that $2\bar{a} = l(\bar{V}_A\bar{V}_B)$ and $2\sqrt{|\bar{b}^2|} = l(\bar{R}_1\bar{R}_2)$. Let $2\bar{c} = l(\bar{A}\bar{B}) = 2u$. Then it is easy to see from (10) that

$$ \bar{c}^2 = \bar{a}^2 - \bar{b}^2 \ \text{ if } \bar{C}(\bar{A}, \bar{B}) \text{ is an ellipse}, \qquad \bar{c}^2 = \bar{a}^2 + |\bar{b}^2| \ \text{ if } \bar{C}(\bar{A}, \bar{B}) \text{ is a hyperbola}. $$
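The standard form (9) with the coefficients (10) can be checked against Equation (4) directly. In the sketch below the function names are hypothetical, and $\bar{b}^2$ is kept as a signed quantity so that the same expression covers both the ellipse and the hyperbola.

```python
def projected_axes_squared(u, h, m):
    """Return (a_bar^2, b_bar^2) from (10); b_bar^2 is negative for a hyperbola."""
    a2 = h * h / (m * m)
    b2 = (h * h - m * m * u * u) / (m * m)
    return a2, b2

def standard_form_value(x, u, h, m):
    """Left side of (9) for the projected point with abscissa x, using y^2 from (4)."""
    y2 = (m * m * u * u - h * h) * (m * m * x * x - h * h) / (m * m * h * h)
    a2, b2 = projected_axes_squared(u, h, m)
    return x * x / a2 + y2 / b2

print(standard_form_value(0.5, 1.0, 2.0, 1.0))  # ellipse, close to 1
print(standard_form_value(1.5, 2.0, 1.0, 1.0))  # hyperbola, close to 1
```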

Figure 2: Confocal ellipses and hyperbolas.

3 Confocal and Similar Conics, Parabolas.

The shape of a conic section is completely determined by the two parameters m and k. Now we investigate how the curve varies as m or k changes.

(1) If A and B are fixed, i.e. k is fixed, but m changes, we obtain a family of conics that lie on different planes but share A, B as their common foci. These conics are called confocal. As argued above, when these curves are projected to the same horizontal plane, their projections are also confocal with $\bar{A}$, $\bar{B}$ as their common foci (Fig. 2, based on a figure in [3]).

(2) Now we study how C(A, B; m) varies if m is fixed but k changes. Without loss of generality we assume B is fixed and A moves to $A^*$ that lies in the same vertical plane. Let $A^* = (u + \varepsilon, 0, h + \delta)$, ε ≥ 0, δ ≥ 0. Then the equations of cones $C(A^*; m)$ and C(B; m) become

$$ (x - u - \varepsilon)^2 + y^2 = \frac{(z - h - \delta)^2}{m^2}, \qquad (x + u)^2 + y^2 = \frac{(z + h)^2}{m^2}. \tag{11} $$

Of all possible moves of A we discuss the moves in three special directions: along BA, horizontal, and along $V_BA$.

(2.1) A move of A either preserves the gradient between the two cone vertices or does not. If it does, then δ/ε = h/u = k. Solving the system (11) with δ = kε we have

$$ z = \frac{m^2}{k}\, x + \frac{(h^2 - m^2u^2)\,\varepsilon}{2hu}. $$

Therefore, the new conic section $C(A^*, B; m)$ lies in a plane $P^*$ that is parallel to the plane $\tilde{P}$ where C(A, B; m) lies. Besides, since both curves lie on the same cone C(B; m), it follows that the two conics $C(A^*, B; m)$ and C(A, B; m) are similar relative to their common vertex B. When projected on the same horizontal plane, their projections are also similar relative to $\bar{B}$ (Fig. ??). On the other hand, if the move of A does not preserve the gradient between the vertices, then $P^*$ is not parallel to $\tilde{P}$, and $C(A^*, B; m)$ is not similar to C(A, B; m) by the definition of similarity.
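The similarity claim in (2.1) can be illustrated numerically: scaling a point S of C(A, B; m) about B by the ratio of the distances from B to the moved and original vertices should produce a point lying on both the moved cone and C(B; m), and in the shifted plane given above. The sketch assumes δ = kε; the helper names and sample values are hypothetical.

```python
import math

def conic_point(x, u, h, m, sign=1.0):
    """Point of C(A, B; m) from the parametric expression (5)."""
    radicand = (m * m * u * u - h * h) * (m * m * x * x - h * h)
    return (x, sign * math.sqrt(radicand) / (m * h), m * m * u * x / h)

def cone_residual(p, vertex, m):
    """Zero when p lies on the cone C(vertex; m)."""
    dx, dy, dz = (p[i] - vertex[i] for i in range(3))
    return dx * dx + dy * dy - dz * dz / (m * m)

def scale_about(center, p, ratio):
    """Image of p under the scaling with the given center and ratio."""
    return tuple(center[i] + ratio * (p[i] - center[i]) for i in range(3))

u, h, m, eps = 1.0, 2.0, 1.0, 0.7          # hypothetical sample values
k = h / u
B = (-u, 0.0, -h)
A_star = (u + eps, 0.0, h + k * eps)        # gradient-preserving move of A
ratio = (2 * u + eps) / (2 * u)             # l(B A*) / l(B A)
S_star = scale_about(B, conic_point(0.5, u, h, m), ratio)

plane_z = m * m / k * S_star[0] + (h * h - m * m * u * u) * eps / (2 * h * u)
print(cone_residual(S_star, A_star, m), cone_residual(S_star, B, m))  # close to zero
print(S_star[2] - plane_z)                                            # close to zero
```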

Figure 3: A parabola as an extreme ellipse.

Remark 6 A curve $C_1(x_1(t), y_1(t), z_1(t))$ is said to be similar to another curve $C_2(x_2(t), y_2(t), z_2(t))$ relative to a point $(x_0, y_0, z_0)$ if

$$ \frac{x_2(t) - x_0}{x_1(t) - x_0} = \frac{y_2(t) - y_0}{y_1(t) - y_0} = \frac{z_2(t) - z_0}{z_1(t) - z_0}. $$

(2.2) The vertex A moves in the direction of the positive X-axis, that is, δ = 0 and ε increases monotonically. Assume we start with g(AB) = k > m, so that the conic is an ellipse. In this move, the ellipse C(A, B; m) becomes narrower and narrower, and finally becomes a line segment, a degenerate ellipse with $k^* = (h + \delta)/(u + \varepsilon) = m$. If the increase of ε continues, then as stated in Section 2, the line segment first suddenly jumps into two half-lines as a degenerate hyperbola, and then becomes a normal hyperbola with $k^* < m$.

(2.3) As we have discussed in (2.2), in the degenerate case, an ellipse or a hyperbola overlaps the line through A and B. One may ask if there is a non-degenerate common extreme of ellipses and hyperbolas. Look at the move of A along $V_BA$. In such a move δ = mε. Solving the system (11) with δ = mε and then letting ε go to infinity, we have

$$ y = \pm\frac{2\sqrt{(h - mu)(mx + h)}}{m}, \qquad z = m(x - u) + h. \tag{12} $$

Therefore the parametric expression of the resulting curve is

$$ \left( x,\ \pm\frac{2\sqrt{(h - mu)(mx + h)}}{m},\ m(x - u) + h \right). \tag{13} $$
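The limiting process behind (12) can be observed numerically. The sketch below solves the system (11) with δ = mε for a large but finite ε and compares the result with (12). The elimination step (subtracting the two equations of (11) to obtain z) and all names are assumptions made for this illustration, and only the branch y ≥ 0 is computed.

```python
import math

def conic_on_moved_cones(x, u, h, m, eps, delta):
    """Solve system (11) for (y, z) at a given x, taking y >= 0."""
    # Subtracting the two equations of (11) gives a linear equation in z.
    z = (m * m * (2 * x - eps) * (2 * u + eps) / (2 * h + delta) + delta) / 2
    y2 = (z + h) ** 2 / (m * m) - (x + u) ** 2
    return math.sqrt(y2), z

def parabola(x, u, h, m):
    """Equation (12), branch y >= 0."""
    return 2 * math.sqrt((h - m * u) * (m * x + h)) / m, m * (x - u) + h

u, h, m = 1.0, 2.0, 1.0     # hypothetical sample values with h > m*u
eps = 1e6
print(conic_on_moved_cones(1.0, u, h, m, eps, m * eps))
print(parabola(1.0, u, h, m))
```

With these sample values the two printed pairs agree to several decimal places, and the gap shrinks as ε grows.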

If starting with a hyperbola (k = h/u < m), we obtain the same result. Therefore, the resulting planar curve, called a parabola, is the non-degenerate common extreme of ellipses and hyperbolas. The parabola has a vertex $V_B$ and is symmetric about the line $V_BA^*$, the intersection of $\tilde{P}$ and the XZ-plane. Let

$$ L = \left\{ \left( -\frac{2h}{m} + u,\ y,\ -h \right) \right\} $$

be the line that is the intersection of the plane $\tilde{P}$ and the horizontal plane through B (Fig. 3). For any point S on the parabola, let D be the foot of the perpendicular from S to L. Then the gradient of SD is the slope of the plane $\tilde{P}$, which is equal to the gradient of any generating line of C(B; m). Therefore, similarly to the argument for ellipses and hyperbolas, we have l(SB) = l(SD).

Theorem 7 (equidistance) For any point S on the parabola, the distance from S to B equals the distance from S to the line L.

For this reason, B is called the inherent focus of the parabola, and L is called the directrix associated with the inherent focus B. As with ellipses and hyperbolas, this equidistance property completely characterizes parabolas, and we can redefine a parabola to be a planar curve that satisfies the equidistance property. The ratio of the distance from S to the focus to the distance from S to the directrix is called the eccentricity of the parabola. Thus, a parabola can also be defined as a planar curve whose eccentricity is one. Again, the equidistance property implies the reflective property:

Corollary 8 (reflection of parabolas) For any point S on a parabola, the tangent line t at S meets two lines at the same acute angle: the line joining S and the focus and the perpendicular from S to the directrix.
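The equidistance property of Theorem 7 can be checked numerically from the parametric expression (13) and the directrix L = {(u − 2h/m, y, −h)}. The sketch below uses hypothetical helper names and sample values with h > mu.

```python
import math

def parabola_point(x, u, h, m, sign=1.0):
    """Point of the parabola from the parametric expression (13)."""
    y = sign * 2 * math.sqrt((h - m * u) * (m * x + h)) / m
    return (x, y, m * (x - u) + h)

def distance_to_directrix(S, u, h, m):
    """Distance from S to the line L, which is parallel to the Y-axis."""
    x_l, z_l = u - 2 * h / m, -h
    return math.hypot(S[0] - x_l, S[2] - z_l)

u, h, m = 1.0, 2.0, 1.0
B = (-u, 0.0, -h)
for x in (-1.0, 0.0, 1.0, 5.0):
    S = parabola_point(x, u, h, m)
    print(math.dist(S, B), distance_to_directrix(S, u, h, m))  # equal pairs
```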

Now denote the parabola by C(B, L; m). Let $\bar{C}(\bar{B}, \bar{L})$, $\bar{V}_B$, $\bar{B}$, $\bar{S}$, $\bar{L}$ and $\bar{D}$ be the projections of C(B, L; m), $V_B$, B, S, L and D on a horizontal plane respectively. Because SB and SD have the same gradient, l(SB) = l(SD) implies $l(\bar{S}\bar{B}) = l(\bar{S}\bar{D})$. Therefore, the projection $\bar{C}(\bar{B}, \bar{L})$ is also a parabola, whose coplanar focus is $\bar{B}$ and whose directrix associated with $\bar{B}$ is $\bar{L}$. From (12) the equation of $\bar{C}(\bar{B}, \bar{L})$ is

$$ y^2 = 4\bar{p}\,(x + h/m), $$

where $\bar{p} = h/m - u$ is the distance from the vertex $\bar{V}_B$ to the coplanar focus $\bar{B}$ or to the directrix $\bar{L}$. This equation is the standard form of a parabola with vertex at (−h/m, 0). If the origin moves to the vertex $\bar{V}_B$, then the equation is further simplified to $y^2 = 4\bar{p}x$.

4 Discussions

(1) It is easy to see that any ellipse (or hyperbola) obtained by cutting a cone with a plane can be generated by two cones. In fact, we may assume that the cone is C(A; m). Make a copy of C(A; m) and turn the copy 180° around the Y-axis; the copy becomes the required C(B; m).

(2) Like a parabola, an ellipse or a hyperbola has directrixes associated with its inherent foci A and B: they are the intersections of $\tilde{P}$ and the horizontal planes through A and B. Figure 4(1) shows the directrix $L_B$ of an ellipse C(A, B; m) associated with B. For any point S on the ellipse, let D be the foot of the perpendicular from S to $L_B$. The eccentricity of the conic section C(A, B; m) is still defined to be the ratio of l(SB) to l(SD). Let H be the foot of the perpendicular from S to the horizontal plane through B. Since ∠HSB equals the generating angle β, and ∠HDS equals the intersecting angle α (Fig. 4(1)),

$$ \frac{l(SB)}{l(SD)} = \frac{l(SH)/\cos\beta}{l(SH)/\sin\alpha} = \frac{\sin\alpha}{\cos\beta} = \text{const} < 1. $$

When projected to a horizontal plane,

$$ \frac{l(\bar{S}\bar{B})}{l(\bar{S}\bar{D})} = \frac{l(BH)}{l(DH)} = \frac{l(SH)/\cot\beta}{l(SH)/\tan\alpha} = \frac{\tan\alpha}{\cot\beta} = \text{const} < 1. $$

Thus, the third definition of ellipses is that an ellipse is a planar curve whose eccentricity is a constant less than one. This argument applies similarly to hyperbolas. Hence, the third definition of hyperbolas is that a hyperbola is a planar curve whose eccentricity is a constant greater than one.

Figure 4: Coplanar foci and their directrixes, panels (1) and (2).

(3) Finally we should point out that an ellipse or a hyperbola C(A, B; m) also has its own coplanar foci lying on the plane $\tilde{P}$. Take an ellipse as an example. According to the Quetelet-Dandelin construction [2], suppose the sphere inscribed in C(B; m) and tangent to $\tilde{P}$ touches $\tilde{P}$ at a point $F_B$. Then $F_B$ is a coplanar focus of C(A, B; m). The other focus $F_A$ can be found in a similar way. Associated with these coplanar foci, there are two directrixes. For instance, suppose the sphere defined above touches the cone C(B; m) along a horizontal circle $C_B$ (Fig. 4(1)). Then the intersection of $\tilde{P}$ and the plane through $C_B$ is the directrix $\tilde{L}_B$ associated with the focus $F_B$. In a similar way we can find the coplanar focus F and its associated directrix $\tilde{L}$ for a parabola C(B, L; m) (Fig. 4(2)). The coordinates of the coplanar foci and the standard equations of conic sections with respect to their coplanar foci and directrixes are derived in the Appendix.
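The eccentricity formula in item (2) can be checked numerically. The sketch below assumes the normalized coordinates of Section 2, uses tan α = m²/k (the slope of the plane in Equation (3)) and tan β = 1/m, and locates the directrix $L_B$ as the intersection of the plane z = (m²/k)x with the horizontal plane z = −h. The helper names are hypothetical.

```python
import math

def conic_point(x, u, h, m, sign=1.0):
    """Point of C(A, B; m) from the parametric expression (5)."""
    radicand = (m * m * u * u - h * h) * (m * m * x * x - h * h)
    return (x, sign * math.sqrt(radicand) / (m * h), m * m * u * x / h)

def eccentricity_ratio(x, u, h, m):
    """l(SB) / l(SD), with D the foot of the perpendicular from S to L_B."""
    k = h / u
    S = conic_point(x, u, h, m)
    B = (-u, 0.0, -h)
    x_l = -h * k / (m * m)          # the plane z = (m^2/k) x meets z = -h here
    l_sd = math.hypot(S[0] - x_l, S[2] + h)
    return math.dist(S, B) / l_sd

def predicted_eccentricity(u, h, m):
    """sin(alpha) / cos(beta) with tan(alpha) = m^2/k and tan(beta) = 1/m."""
    alpha = math.atan(m * m / (h / u))
    beta = math.atan(1 / m)
    return math.sin(alpha) / math.cos(beta)

print(eccentricity_ratio(0.5, 1.0, 2.0, 1.0), predicted_eccentricity(1.0, 2.0, 1.0))  # ellipse
print(eccentricity_ratio(1.5, 2.0, 1.0, 1.0), predicted_eccentricity(2.0, 1.0, 1.0))  # hyperbola
```

For the sample ellipse the ratio is below one and for the sample hyperbola it is above one, in line with the third definitions stated above.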

5 An application.

Given a point set, the minimum network problem asks for a network of shortest total length interconnecting all given points. To shorten the network, some points not in the given point set may be added. In the literature this minimum network is called a Steiner minimal tree and the additional points are called Steiner points [4]. The Steiner tree problem was first posed by Fermat, who asked how to find the point S in a triangle ABC so that the sum of the distances from S to the vertices A, B, and C is minimal [4].

There are many generalizations of the Steiner tree problem. A recently studied one is the gradient-constrained Steiner tree problem [1]. Figure 5 depicts a simple example of an underground mining network in which the ore in two deposits is extracted through tunnels to a vertical shaft and then hauled up to the ground. In the figure A and B are two prescribed access points in the deposits. In practice, the gradient of a tunnel cannot be very steep. The typical maximum gradient m is about 1:7. In this example we need to find a point S so that, with this gradient constraint, the total length of the network f = l(SA) + l(SB) + l(SC) is minimized, where C is the access point to be determined on the vertical shaft. Clearly, to minimize f, SC should be perpendicular to the vertical shaft. Hence, SC is horizontal and automatically satisfies the gradient constraint. It can be shown that if g(AB) > m, then SA and SB must have the maximal gradient m in order to minimize f [6]. It follows that S lies on the ellipse C(A, B; m), and that ∠ASC = ∠BSC because of the reflective property. These conditions completely determine S and C.

Figure 5: A mining network.
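A rough numerical illustration of this example follows. Since l(SA) + l(SB) is constant on the ellipse by Theorem 3, minimizing f over the ellipse reduces to minimizing the horizontal distance from S to the shaft. The sketch samples the ellipse densely, picks the sample nearest to the shaft, and compares ∠ASC with ∠BSC. It assumes the normalized coordinates of Section 2; the shaft position, the sample values of u and h, the brute-force search and all function names are hypothetical choices for illustration and are not the method of [6].

```python
import math

def ellipse_point(t, u, h, m):
    """Point of the ellipse C(A, B; m), using the semi-axes from (6) and (7)."""
    k = h / u
    x = (h / m) * math.cos(t)
    y = (u / m) * math.sqrt(k * k - m * m) * math.sin(t)
    return (x, y, m * m * u * x / h)

def angle(p, q, r):
    """Angle at q between the segments qp and qr."""
    v = [p[i] - q[i] for i in range(3)]
    w = [r[i] - q[i] for i in range(3)]
    c = sum(a * b for a, b in zip(v, w)) / (math.hypot(*v) * math.hypot(*w))
    return math.acos(max(-1.0, min(1.0, c)))

def best_junction(u, h, m, shaft_xy, samples=100000):
    """Brute-force search for the ellipse point closest to the vertical shaft."""
    best = None
    for i in range(samples):
        S = ellipse_point(2 * math.pi * i / samples, u, h, m)
        d = math.hypot(S[0] - shaft_xy[0], S[1] - shaft_xy[1])
        if best is None or d < best[0]:
            best = (d, S)
    S = best[1]
    C = (shaft_xy[0], shaft_xy[1], S[2])   # SC is horizontal
    return S, C

u, h, m = 100.0, 50.0, 1 / 7               # hypothetical data, g(AB) = 0.5 > m
A, B = (u, 0.0, h), (-u, 0.0, -h)
S, C = best_junction(u, h, m, shaft_xy=(500.0, 400.0))
print(angle(A, S, C), angle(B, S, C))      # approximately equal
```

A brute-force search is used only to keep the sketch short; the two printed angles agree up to the sampling resolution.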

References

[1] M. Brazil, D.A. Thomas and J.F. Weng, Gradient-constrained minimal Steiner trees, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 40 (1998), pp. 23-38.

[2] J.L. Coolidge, A history of the conic sections and quadric surfaces, Oxford University Press, London, 1945.

[3] R. Courant and H. Robbins, What is mathematics? Oxford University Press, London, 1941.

[4] F.K. Hwang, D.S. Richards and P. Winter, The Steiner tree problem, North-Holland, 1992.

[5] J.H. Rubinstein and D.A. Thomas, A variational approach to the Steiner network problem, Ann. Oper. Res., Vol. 33 (1991), pp. 481-499.

[6] J.F. Weng, A note on constrained shortest connections of two points to a vertical line, preprint.

Appendix.

If C(A, B; m) is an ellipse, then let $F_A$ and $F_B$ be a pair of points lying on $V_AV_B$ and symmetric about O such that $0 \le \rho = l(OF_A)/l(OV_A) \le 1$. Similarly, if C(A, B; m) is a hyperbola, then let $F_A$ and $F_B$ be the points lying on the extension of $V_AV_B$ and symmetric about O such that $\rho = l(OF_A)/l(OV_A) > 1$. By the definition,

$$ F_A = \left(\frac{\rho h}{m},\ 0,\ \rho mu\right), \qquad F_B = \left(-\frac{\rho h}{m},\ 0,\ -\rho mu\right). \tag{14} $$

Let S = (x, y, z) be a point on C(A, B; m). Then

$$ l_1 = l(F_AS) = \sqrt{\left(\frac{\rho h}{m} - x\right)^2 + y^2 + (\rho mu - z)^2}, \qquad l_2 = l(F_BS) = \sqrt{\left(\frac{\rho h}{m} + x\right)^2 + y^2 + (\rho mu + z)^2}. $$

Let $l_1 \pm l_2 = 2\tilde{a}$, $\tilde{a} \ge 0$. Then $l_1^2 = (2\tilde{a} \mp l_2)^2$, $l_1^2 - l_2^2 = 4\tilde{a}^2 \mp 4\tilde{a}l_2$, and

$$ \pm\tilde{a}\,l_2 = \tilde{a}^2 - \frac{l_1^2 - l_2^2}{4} = \tilde{a}^2 + \frac{\rho hx}{m} + \rho muz. $$

Squaring both sides, after a simplification we obtain

$$ (h^2\rho^2 - m^2\tilde{a}^2)x^2 + 2m^2hu\rho^2xz + (m^2u^2\rho^2 - \tilde{a}^2)m^2z^2 - m^2\tilde{a}^2y^2 - (m^4u^2 + h^2)\tilde{a}^2\rho^2 + \tilde{a}^4m^2 = 0. $$

Substituting y and z with (4) and (3) respectively, finally we have an equation containing only one variable x:

$$ f_1x^2 + f_2 = 0, \tag{15} $$

where

$$ f_1 = m^4u^2(1 + m^2)\tilde{a}^2 - (h^2 + m^4u^2)^2\rho^2, $$

$$ f_2 = h^2(h^2 - m^2u^2)\tilde{a}^2 - m^2h^2\tilde{a}^4 + h^2(h^2 + m^4u^2)\tilde{a}^2\rho^2. $$

Equation (15) will hold for any x if both $f_1 = 0$ and $f_2 = 0$. Solving this system we obtain

$$ \tilde{a} = \sqrt{\frac{h^2}{m^2} + m^2u^2}, \qquad \rho = \frac{u\sqrt{1 + m^2}}{\tilde{a}}. $$
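The solution for $\tilde{a}$ and ρ can be verified numerically: with the foci placed according to (14), the sum (for an ellipse) or the absolute difference (for a hyperbola) of the distances from a point of (5) to $F_A$ and $F_B$ should equal $2\tilde{a}$. The helper names and sample values in this sketch are hypothetical.

```python
import math

def conic_point(x, u, h, m, sign=1.0):
    """Point of C(A, B; m) from the parametric expression (5)."""
    radicand = (m * m * u * u - h * h) * (m * m * x * x - h * h)
    return (x, sign * math.sqrt(radicand) / (m * h), m * m * u * x / h)

def coplanar_foci(u, h, m):
    """Return (a_tilde, F_A, F_B) from the solved system and Equation (14)."""
    a = math.sqrt(h * h / (m * m) + m * m * u * u)
    rho = u * math.sqrt(1 + m * m) / a
    f_a = (rho * h / m, 0.0, rho * m * u)
    f_b = (-rho * h / m, 0.0, -rho * m * u)
    return a, f_a, f_b

def focal_sum_and_difference(x, u, h, m):
    """Return l_1 + l_2 and |l_1 - l_2| for the point with abscissa x."""
    _, f_a, f_b = coplanar_foci(u, h, m)
    S = conic_point(x, u, h, m)
    l1, l2 = math.dist(S, f_a), math.dist(S, f_b)
    return l1 + l2, abs(l1 - l2)

print(focal_sum_and_difference(0.5, 1.0, 2.0, 1.0)[0], 2 * coplanar_foci(1.0, 2.0, 1.0)[0])
print(focal_sum_and_difference(1.5, 2.0, 1.0, 1.0)[1], 2 * coplanar_foci(2.0, 1.0, 1.0)[0])
```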

Thus, the constant sum/difference of distances and the coordinates of the coplanar foci $F_A$, $F_B$ are both determined. By the definition of the intersecting angle α, we have $\cos\alpha = h/(m\tilde{a})$ and $\sin\alpha = mu/\tilde{a}$. After an anticlockwise rotation of the coordinate system around the Y-axis by α, let the new axes be $O\tilde{X}$, $O\tilde{Y}$ and $O\tilde{Z}$. Then the $\tilde{X}\tilde{Y}$-plane is the plane $\tilde{P}$ where the conic section lies. Since the curve lies on the $\tilde{X}\tilde{Y}$-plane, we have $\tilde{z} = 0$, and the transformation is $x = \tilde{x}\cos\alpha$, $y = \tilde{y}$, $z = \tilde{x}\sin\alpha$. From (4) we can derive the standard form of the conic section on the plane $\tilde{P}$:

$$ \frac{\tilde{x}^2}{\tilde{a}^2} + \frac{\tilde{y}^2}{\tilde{b}^2} = 1, \tag{16} $$

where $\tilde{b}^2 = (h^2/m^2) - u^2$. Note $\tilde{b}^2 \ge 0$ when g(AB) ≥ m and the conic section is an ellipse, and $\tilde{b}^2 \le 0$ when g(AB) ≤ m and the conic section is a hyperbola. It is easy to check that $\tilde{a} = l(OV_A)$ and $\sqrt{|\tilde{b}^2|} = l(OR_1)$. Let $\tilde{c} = l(OF_A) = \rho\tilde{a}$. Then, as in the projection of C(A, B; m), the following equations hold:

$$ \tilde{c}^2 = \tilde{a}^2 - \tilde{b}^2 \ \text{ if C(A, B; m) is an ellipse}, \qquad \tilde{c}^2 = \tilde{a}^2 + |\tilde{b}^2| \ \text{ if C(A, B; m) is a hyperbola}. $$

In the same way we can determine the coplanar focus F and its directrix $\tilde{L}$ for a parabola. As shown in Figure 4(2), let ρ be the horizontal distance between $V_B$ and F. Since the gradient of $V_BF$ is m and since $V_B$ has the same distance to F and to $\tilde{L}$, we have

$$ F = \left(\rho - \frac{h}{m},\ 0,\ m(\rho - u)\right), \qquad \tilde{L} = \left\{\left(-\rho - \frac{h}{m},\ y,\ -m(\rho + u)\right)\right\}. $$

Suppose S = (x, y, z) is a point on the parabola. Then the square of l(SF) equals the square of the distance from S to $\tilde{L}$, that is,

$$ \left(x - \rho + \frac{h}{m}\right)^2 + y^2 + (z - m\rho + mu)^2 = \left(x + \rho + \frac{h}{m}\right)^2 + (z + m\rho + mu)^2. $$

Because y and z satisfy (12), this equation always holds if

$$ \rho = \frac{h - mu}{m(1 + m^2)}. $$

Note that now $\cos\alpha = 1/\sqrt{1 + m^2}$. As argued in the case of ellipses and hyperbolas, after an anticlockwise rotation of the coordinate system by α, from (12) we obtain the standard form of the parabola on the plane $\tilde{P}$:

$$ \tilde{y}^2 = 4\tilde{p}\left(\tilde{x} + \frac{h\sqrt{1 + m^2}}{m}\right), \tag{17} $$

where $\tilde{p} = (h - mu)/(m\sqrt{1 + m^2}) = l(V_BF)$, and the vertex $V_B$ lies at $\tilde{x} = -h\sqrt{1 + m^2}/m$. If the origin moves to $V_B$, then (17) can be further simplified to $\tilde{y}^2 = 4\tilde{p}\tilde{x}$.
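The coplanar focus and directrix of the parabola can be checked in the same numerical way: with ρ = (h − mu)/(m(1 + m²)), a point of (13) should be equidistant from F and from the line $\tilde{L}$. The function names and sample values below are hypothetical, and the sketch assumes h > mu.

```python
import math

def parabola_point(x, u, h, m, sign=1.0):
    """Point of the parabola from the parametric expression (13)."""
    y = sign * 2 * math.sqrt((h - m * u) * (m * x + h)) / m
    return (x, y, m * (x - u) + h)

def coplanar_focus_and_directrix(u, h, m):
    """Return F and the (x, z) position of the directrix, which is parallel to the Y-axis."""
    rho = (h - m * u) / (m * (1 + m * m))
    focus = (rho - h / m, 0.0, m * (rho - u))
    directrix_xz = (-rho - h / m, -m * (rho + u))
    return focus, directrix_xz

u, h, m = 1.0, 2.0, 1.0
F, (x_l, z_l) = coplanar_focus_and_directrix(u, h, m)
for x in (-1.0, 0.0, 1.0, 5.0):
    S = parabola_point(x, u, h, m)
    print(math.dist(S, F), math.hypot(S[0] - x_l, S[2] - z_l))  # equal pairs
```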
