On the Axiomatization of Interval Arithmetic
Svetoslav Markov Institute of Mathematics and Informatics Bulgarian Academy of Sciences “Acad. G. Bonchev” st., block 8, 1113 Sofia, Bulgaria [email protected]
Abstract In our study we follow the algebraically natural approach of completing the set IRn up to the set IRn involving im- We investigate some abstract algebraic properties of the proper intervals. Thus, starting from the above three ba- system of intervals with respect to the arithmetic operations sic operations/relations (1)–(3), we arrive to the system and the relation inclusion and derive certain practical con- (IRn, +, R, ∗, ⊆). In the next Section 2 we briefly recall sequences from these properties. In particular, we discuss the above mentioned algebraic construction and the conse- the use of improper intervals (in addition to proper ones) quent quasivector spaces. It is to be noted that everything and of midpoint-radius presentation of intervals. This work said in this section for intervals is also true for the more gen- is a theoretical introduction to interval arithmetic involving eral case of convex bodies and briefly repeats already pub- improper intervals. We especially stress on the existence of lished materials, cf. [7], [8]. In Section 3 we concentrate special “quasi”-multiplications in interval arithmetic and on the system (IRn, +, R, ∗, ⊆) obtained by algebraic com- their role in relevant symbolic computations. pletion. Using the theoretical foundations given in Section 2 and some specific properties of intervals (distinct from those of general convex bodies) we derive formulae for the 1. Introduction operations/relations involved. We show that the familiar midpoint-radius presentation of intervals is a special case of the presentation of elements in a quasivector space. Section In this work we discuss several algebraic properties of (IR, +, ×, ⊆) the system of intervals with the arithmetic operations ad- 4 is devoted to the system involving multi- dition and multiplication and the relation inclusion. Our plication of one-dimensional intervals. In the Conclusion aim is to point out certain practical advantages of using im- we discuss various topics like computer implementation of proper intervals and midpoint-radius presentation of inter- interval arithmetic, symbolic computations, etc. vals. Denote by IR the set of all compact intervals on the real 2. Quasivector spaces line R and by IRn the set of all n-tuples of intervals. For n A, B ∈ IRn, α, β ∈ Rn, γ ∈ R, one defines addition, The system (IR , +) is a commutative monoid (semi- multiplication by scalars and inclusion, resp., by: group with null) with cancellation law. There is no opposite operator in (IRn, +). The operator multiplication by the A + B = {α + β | α ∈ A, β ∈ B}, (1) scalar −1: ¬A =(−1) ∗ A = {−α | α ∈ A},A∈ IR, γ ∗ B = {γβ | β ∈ B}, (2) briefly called negation (that may be suspected for opposite), A +(¬A)=0 A ⊆ B ⇐⇒ (α ∈ A =⇒ α ∈ B), (3) is not an opposite operator, as is violated for certain A ∈ IRn. Thus IRn is not a group; however it where all operations/relations are understood component- can be embedded in a group. The algebraic construction wise. We thus obtain the system (IRn, +, R, ∗, ⊆) to be that converts an abelian monoid with cancellation law into discussed in the sequel. a group will be further refered as embedding construction. We shall refer to the definitions of the opera- Recall that this approach is used to pass from the monoid of tions/relations (1)–(3) as set-theoretic. These definitions are nonnegative reals (R+, +) to the set of reals (R, +). Thus, not suitable for computations with intervals. Our final aim it is natural instead of the original system (IRn, +, R, ∗, ⊆) is to derive computationally efficient expressions for these to consider the extended system (IRn, +, R, ∗, ⊆) obtained operations based on the intrinsic properties of intervals. by the embedding construction. 2.1. The embedding construction γ ∗ (a + b)=γ ∗ a + γ ∗ b, (7) α ∗ (β ∗ c)=(αβ) ∗ c, (8) Every abelian monoid (M,+) with cancellation law in- 1 ∗ a = a, (9) duces an abelian group (M, +), where M = M 2/ ∼ is (α + β) ∗ c = α ∗ c + β ∗ c, αβ ≥ 0. the difference (quotient) set of M consisting of all pairs if (10) (A, B) factorized by the congruence relation ∼:(A, B) ∼ The only difference between a vector (linear) space and (C, D) iff A + D = B + C, for A, B, C, D ∈ M. a quasivector space is contained in assumption (10), where Addition in M is defined by the relation (α + β) ∗ c = α ∗ c + β ∗ c is required to hold just for αβ ≥ 0, whereas in a vector space the same (A, B)+(C, D)=(A + C, B + D). (4) relation is assumed to hold for all scalars α, β ∈ R (known The neutral (null) element of M is the class (Z, Z), Z ∈ as second distributive law). Thus a vector space is a special M. Due to the existence of null element in M,wehave quasivector space. (Z, Z) ∼ (0, 0). The opposite element to (A, B) ∈ M is Conjugate elements. From opp(a)+a =0we obtain opp(A, B)=(B,A). The mapping ϕ : M −→ M defined ¬opp(a) ¬ a =0, that is ¬opp(a)=opp(¬a). The el- for A ∈ M by ϕ(A)=(A, 0) ∈ M is an embedding of ement ¬opp(a)=opp(¬a) is further denoted by a − and monoids. We embed M in M by identifying A ∈ M with the corresponding operator is called dualization or conju- the equivalence class (A, 0) ∼ (A + X, X), X ∈ M; all gation. We say that a− is the conjugate (or dual) of a.In elements of M admitting the form (A, 0) are called proper the sequel we shall express the opposite element symboli- and the remaining (new) elements are called improper. The cally as: opp(a)=¬a−, minding that a+(¬a−)=0(to be set of all proper elements of M is ϕ(M)={(A, 0) | A ∈ briefly written as a ¬ a− =0). Using conjugate elements ∼ M} = M. the quasistributive law (10) can be written in the form Using the above construction the system (IRn, +) is em- n bedded into the group (IR , +) in a unique way. (α + β) ∗ cσ(α+β) = α ∗ cσ(α) + β ∗ cσ(β), (11) Multiplication by scalars “∗” is extended from R×IRn σ n wherein is the sign functional to R × IR by means of −,α<0; n σ(α)= γ ∗ (A, B)=(γ ∗ A, γ ∗ B),A,B∈ IR ,γ∈ R. (5) +,α≥ 0,α∈ R, ¬(A, B)=(−1) ∗ In particular, negation is extended by and the convention a+ = a has been made (for a proof (A, B)=(¬A, ¬B),A,B∈ IRn . see [8]). Expression (11) is valid for all values of α, β (not In the sequel we shall use lower case roman letters only for equally signed α, β) which allows efficient sym- to denote the elements of IRn, writing e. g. a = n bolic calculations. (A1,A2),A1,A2 ∈ IR . For example, negation is writ- ¬a =(−1) ∗ a a ¬ b a +(¬b) ten: ; below means . 2.3. A decomposition theorem Inclusion “⊆” is extended in IR by means of An element y with the property y ¬ y =0(equivalently (A, B) ⊆ (C, D) ⇐⇒ A + D ⊆ B + C, (6) y = y−) is called linear or distributive; an element z such n that ¬z = z (equivalently z + z− =0) is called centred or wherein A, B, C, D ∈ IR and inclusion of interval n- 0-symmetric. tuples is meant component-wise. As is immediately seen, under this extension the practically important properties Theorem (Decomposition theorem). (Q, +, R, ∗) is a a ⊆ b ⇐⇒ a+c ⊆ b+c for c ∈ IR and a ⊆ b ⇐⇒ γ ∗a ⊆ quasivector space. For every x ∈ Q there exist unique γ ∗b for γ ∈ R are preserved. The system (IRn, +, R, ∗, ⊆) y,z ∈ Q such that: i) x = y + z; ii) y ¬ y =0; iii) involving improper intervals is now completely defined. ¬z = z;iv)y = z =⇒ y = z =0. The proof, see [8], is based on the fact that any x ∈ Q 2.2. Quasivector space: definition can be written in the form: (IRn, +, R, ∗) The interval system is a quasi-vector x = y + z =(1/2) ∗ (x + x−)+(1/2) ∗ (x ¬ x). (12) space in the sense of the following definition [8]: Definition.Aquasi-vector space (over R), denoted Note that the first summand in (12) is linear, whereas the (Q, +, R, ∗), is an abelian group (Q, +) with a mapping second one is centred. The subset of all linear elements of Q (multiplication by scalars) “∗”: R × Q −→ Q, such that is denoted Q = {x ∈ Q | x ¬ x =0} and the subset of all for a, b, c ∈ Q, α,β,γ ∈ R: centred elements of Q is denoted Q = {x ∈ Q | x = ¬x}. Note that negation coincides with opposite in Q , and that the two vector spaces (Q , +, R, ·), (Q , +, R, ·) are negation coincides with identity in Q . m-, resp. n-dimensional and therefore they are isomorphic to Rm, resp. Rn. Hence any a ∈ Q is a direct sum of the Corollary 1. Every quasivector space Q is a direct sum m n form a =(a ; a ), a ∈ R , a ∈ R . As elements of Q of Q = {x ∈ Q | x ¬ x =0 } and Q = {x ∈ Q | x = the elements of Q are of the form (a ;0)and the elements ¬x}, symbolically Q = Q Q . of Q — of the form (0; a ), so that: (Q , +, R, ∗) Q = {x ∈ Q | x ¬ x =0} Clearly, with is a linear space. Indeed, conjugation in Q coincides with a =(a ; a )=(a ;0)+(0;a ). (15) identity. Substituting x = x− in (11), we obtain that the ∗ familiar second distributive law (α + β) ∗ c = α ∗ c + β ∗ c, Because of the presence of the special operation “ ” and holds for all real α, β as required in the definition of a linear its different meaning in the two spaces, we shall con- (Q , +, R, ·) space. tinue to make a distinction between the spaces , (Q , +, R, ·) To characterize Q = {x ∈ Q | x = ¬x}, note that , calling the first one linear, and the second one Q Q is a quasivector space of centred elements, to be briefly centred (quasivector); also the elements of will be called Q called centred quasivector space.Asx = ¬x is equivalent linear, and the elements of — centred. Let us recall that “∗” and “·” coincide in Q , but are distinct in Q ,soQ will to x + x− =0, conjugation coincides with the opposite operator in Q . The centred quasivector space Q can be be always considered together with the two multiplications: (Q , +, R, ·, ∗) converted into a linear space by re-defining multiplication . by scalars as follows: Applying the Decomposition theorem we have that any finite dimensional quasivector space Q is a direct sum (Rm, +, R, ·) α · c = α ∗ cσ(α),c∈ Q . of two spaces — the linear space and the (13) n centred quasivector space (R , +, R, ·, ∗), symbolically: m n Now (Q , +, R, ·) is a linear space. Indeed, substituting (Q, +, R, ·, ∗)=(R , +, R, ·) (R , +, R, ·, ∗). Thus we (13) in (11), we obtain that the second distributive law (α + can write down the operations addition and multiplication β) · c = α · c + β · c is valid for all real α, β; furthermore, by scalars for intervals a =(a ; a ),b=(b ; b ) ∈ Q in properties (7)–(9) remain valid: γ ·(a+b)=γ ·a+γ ·b, α· the form: (β · c)=(αβ) · c, 1 · a = a. a+b =(a ; a )+(b ; b )=(a + b ; a + b ), For a better distinction between the two multiplica- (16) tions by scalars, we call “∗” quasi-vector multiplication by γ ∗ b = γ ∗ (b ; b )=(γ ∗ b ; γ ∗ b )=(γb ; |γ|b ), (17) scalars and “·”—linear multiplication by scalars. Note that m n the linear multiplication by scalars “·” defined by (13) coin- wherein a ,b ∈ R , a ,b ∈ R , γ ∈ R and in the last cides in Q with the quasi-vector one, as in Q relation (13) expression relation (14) is applied. (Rn, +, R, ·, ∗) n =3 becomes α · c = α ∗ c (due to c− = c). Therefore For example, in the space for , the quasi-multiplication of centred elements by scalars looks as (Q, +, R, ·)=(Q , +, R, ·) Corollary 2. The space follows: −2 ∗ (1, 2, 1) = (2, 4, 2); −1 ∗ (−1, 2, −2) = (Q , +, R, ·) · , with “ ” defined by (13), is a linear space. (−1, 2, −2). Multiplication by −1 (negation) coincides Formula (13) gives an expression in Q for the linear with identity. To change the signs of the components of multiplication by scalars in terms of the quasi-vector one. a centred element one should take the opposite (or dual) Conversely, in Q the quasi-vector multiplication by scalars and not negation, e. g. opp(−1, 2, −2) = (−1, 2, −2)− = is expressed by the linear one via (1, −2, 2). For negation, subtraction, opposite and dual in Q we have: α ∗ c = |α|·c, c ∈ Q . (14) ¬a = −1 ∗ (a ; a )=(−a ; a ), (18) Naturally, the quasi-vector multiplication by scalars is in- a ¬ b = a +(−1) ∗ b =(a − b ; a + b ), (19) clusion isotone, that is a ≤ b =⇒ γ ∗ a ≤ γ ∗ b, γ ∈ R, opp (a)=¬a− = ¬(a ; a ) =(−a ; −a ), (20) as a ≤ b =⇒|γ|a ≤|γ|b (which is not true for the linear − (a)=a =(a ; a ) =(a ; −a ). multiplication by scalar). dual − − (21)
2.4. Presentation of elements 3. Presentation of intervals
The spaces involved in Corollary 2 are vector spaces. We see that any element of a finite quasivector space is If we want to efficiently compute within these spaces we represented in the form (15) as a sum of a linear and a cen- should assume them to be finite dimensional so that their tred component. This refers to arbitrary finite quasivector elements have finite presentation. Thus we shall assume spaces, such as quasivector spaces of zonotopes [9]. We next wish to interpret formulae (18)–(15) for the case of in- Let us check if the linear part (23) of a proper element terval vectors. To this end we shall consider the presentation (A, 0) is a proper element. Clearly, (A, ¬A) is a proper of proper intervals. element if there exists X ∈ M such that (A, ¬A)=(X, 0), Intervals (interval vectors) are usually represented by that is A = X ¬ A. As we know this property is satisfied sets of real numbers. Two familiar presentations of inter- by intervals, indeed, we have X =(2a ;0), where a is vals (interval vectors) are those by pairs of real numbers the midpoint of A. Hence, for the value of (23) we obtain (vectors) either in end-point or midpoint-radius form. Up to 1/2 ∗ X =(a ;0). Summarizing we obtain for the proper now we intentionally avoided to represent intervals in any interval (a ; a ), a ≥ 0: (a ; a )=(a , 0)+(0; a ), which form. We now discuss the presentation of intervals by draw- is exactly of the form (15). ing consequences directly from their algebraic properties. Remarks. 1. Presentation (a ; a )=(a , 0) + (0; a ) corresponds (for a > 0) to the symbolic form a = a ± 3.1. The case of proper intervals a often used in engineering sciences. 2. Convex bodies with so-called Minkowski operations [7], [14] also form a IRn Proper intervals as elements of are pairs of the form quasivector space. However, for convex bodies the equation (A, 0) A ∈ IRn (A, 0) , where . Assume first that is a linear A = X ¬ A is not solvable in general and consequently, (A, 0) ¬ (A, 0) = 0 element, that is ; using (4) this means the linear part of a proper convex body may not be proper. A ¬ A =0 . As we know such properties possess exactly For example, in the case of two-dimensional convex bodies, A ∈ IRn Rn the point intervals , that is vectors from . As- if A is a (proper) triangle, then such X does not exist and (A, 0) (A, 0) = ¬ (A, 0) sume now that is centred, that is ; consequently the linear part of A is an improper element. A = ¬ A A ∈ IRn according to (4) this means , that is is a In the case of proper intervals (a ≥ 0)wehave centred interval (symmetric with respect to the origin). Recall that point intervals are represented as [α, α] and A ¬ B = {α − β | α ∈ A, β ∈ B}, (25) proper centred ones — as [−β,β], where β ≥ 0 is the radius and, in particular, ¬B = {−β | β ∈ B}. of the centred interval. Any proper interval is written in end- point form as [α − β, α + β],β≥ 0. MR-presentation of improper intervals. Formula (15) In midpoint-radius form (briefly: MR-form) a proper in- demonstrates the practical meaning of the decomposition terval A ∈ IRn is written as A =(a ; a ), a ≥ 0 and is theorem: there the linear (point) interval of the form (a ;0) interpreted as the set: corresponds to the midpoint of a and the centred interval (0; a ) corresponds to the radius (error bound) of a.A n A =(a ; a )={ξ ∈ R ||ξ − a |≤a }, (22) negative radius a < 0 corresponds to improper interval (a ; a ). In the end-point form [a − a ,a + a ] of an im- wherein the module of a vector is meant component-wise, proper interval we have that the left end-point is greater than |(α , ..., α )| =(|α |, ..., |α |) i. e. 1 n 1 n . Using MR-form a the right one, a − a ≥ a + a . (a ;0) point interval is written as and a proper centred one Let us check how improper one-dimensional elements of (0; a ) a ≥ 0 a as , . The component is called the midpoint the form a =(0,A), A ∈ IR, are decomposed according a and the component — the radius or error (bound). to the Decomposition theorem. a Within the interpretation (22) the component is an For the linear and the centred part (summand) of a = Rn element of the vector space , whereas the component (0,A) we obtain respectively: a ≥ 0 is an element of the monoid (Rn)+. The relation between the MR-presentation (22) and the end-point pre- (1/2) ∗ (a + a−)=(1/2) ∗ (¬A, A), (26) sentation is given by (a ; a )=[a − a ,a + a ], resp. (1/2) ∗ (a ¬ a)=(1/2) ∗ (0,A¬ A). (27) [e−,e+]=((e+ + e−)/2; (e+ − e−)/2]. Let us see how proper elements of the form a =(A, 0), Clearly, the centred part (27) of the improper interval (0,A) A ∈ IR, are decomposed according to the Decomposition is an improper interval. Denoting A =(a ; a ), a ≥ 0,we theorem. For the linear and the centred parts (summands) have A ¬ A =(a ; a )+(−a ; a )=(0;2a ), hence the of a =(A, 0) we obtain respectively, cf. (12): value of (27) is opp(0; a )=(0;−a ). Let us check if the linear part (26) of an improper el- (1/2) ∗ (a + a−)=(1/2) ∗ (A, ¬A), (23) ement (0,A) is an improper element. Clearly, (¬A, A) (1/2) ∗ (a ¬ a)=(1/2) ∗ (A ¬ A, 0). (24) is an improper element if there exists X ∈ M such that (¬A, A)=(0,X), that is A = X ¬ A.Wehave It is immediately seen that the centred part (24) of a proper X =(2a ;0), where a is the midpoint of a. Hence, for the interval is a proper interval. Denoting A =(a ; a ), a ≥ value of (26) we obtain opp(1/2 ∗ X)=(−a ;0). We con- 0, using (19), we have A ¬ A =(a ; a )+(−a ; a )= clude that the linear part of any interval is always a (proper) (0; 2a ), which gives a value of (24) equal to (0; a ). point interval. Summarizing we obtain for the improper interval the interval problem (28) reduces to two linear algebraic (0,A)=opp(a ; a ), a ≥ 0: (−a ; −a )=(−a ;0)+ problems, one for the midpoints and one for the radii: (0; −a ), which is in agreement with (15). Therefore we can consider (15) as a generalization of the MR- Ax = b , (29) presentation of intervals. The difference is that the value |A| x = b . (30) of a in (15) can be negative. We can thus speak of neg- ative radii (or negative errors) corresponding to improper Assuming that the real matrices A and |A| are nonsingular intervals. we obtain for the solution of (29)–(30): x = A−1b , x = −1 −1 In the n-dimensional case a in (15) belongs to Rn and |A| b . We must assume |A| b ≥ 0, so that x ≥ 0. thus the components of a can have negative values corre- We thus obtain the following corollary: sponding to improper one-dimensional intervals. We can again speak of n-dimensional radii (or errors) whose com- Corollary. Given system (28), such that A ∈ Rn×n, ponents are not necessarily nonnegative. b =(b ,b ) ∈ IRn, assume that the real matrices A and |A| are nonsingular and |A|−1b ≥ 0. Then there exists a A, B ∈ IR (A, B) ∈ IR Proposition 1. Let and as unique interval solution x to (28). defined in Section 2.1. If A =(a ; a ),B =(b ; b ), then (A, B)=(a − b ; a − b ) . Sets of solutions. For the determination of solution sets Proof. We have of algebraic problems with uncertain parameters it is of practical significance [2], [3], [15] to be able to find the set (X, 0), if A = B + X, (A, B)= n (0,Y), if A + Y = B, {x ∈ R | A ∗ x ⊆ b}, (31)