

Geometric-Algebra Adaptive Filters

Wilder B. Lopes∗, Member, IEEE, Cassio G. Lopes†, Senior Member, IEEE

Abstract—This paper presents a new class of adaptive filters, namely Geometric-Algebra Adaptive Filters (GAAFs). They are generated by formulating the underlying minimization problem (a deterministic cost function) from the perspective of Geometric Algebra (GA), a comprehensive mathematical language well-suited for the description of geometric transformations. Also, differently from standard adaptive-filtering theory, Geometric Calculus (the extension of GA to differential calculus) allows for applying the same derivation techniques regardless of the type (subalgebra) of the data, i.e., real, complex numbers, quaternions, etc. Relying on those characteristics (among others), a deterministic quadratic cost function is posed, from which the GAAFs are devised, providing a generalization of regular adaptive filters to subalgebras of GA. From the obtained update rule, it is shown how to recover the following least-mean-squares (LMS) adaptive filter variants: real-entries LMS, complex LMS, and quaternions LMS. Mean-square analysis and simulations in a system identification scenario are provided, showing very good agreement for different levels of measurement noise.

Index Terms—Adaptive filtering, geometric algebra, quaternions.

∗R&D Dept. at UCit (ucit.fr) and OpenGA.org, [email protected]. †Dept. of Electronic Systems Engineering, University of Sao Paulo, Brazil, [email protected].

Fig. 1. A polyhedron (3-dimensional polytope) can be completely described by the geometric product of its edges (oriented lines, vectors), which generate the faces and hypersurfaces (in the case of a general n-dimensional polytope).

I. INTRODUCTION

Adaptive filters (AFs), usually described via linear algebra (LA) and standard calculus [1], can be interpreted as instances of geometric estimation, since the weight vector to be estimated represents a directed line in an underlying vector space, i.e., an n-dimensional vector encodes the length and direction of an edge in an n-dimensional polytope (see Fig. 1). However, to estimate the polytope as a whole (edges, faces, and so on), a regular AF designed in terms of LA might not be very helpful.

Indeed, LA has limitations regarding the representation of geometric structures [2, p. 20]. Take for instance the product between two vectors: it always results in either a scalar or a matrix (2-dimensional array of numbers). Thus, one may wonder if it is possible to construct a new kind of product that takes two vectors (directed lines or edges) and returns a hypersurface (the face of an n-dimensional polytope); or takes a vector and a hypersurface and returns another polytope. Similar ideas have been present since the advent of algebra, in an attempt to establish a deep connection with geometry [3], [4].

The geometric product, which is the fundamental product of GA [5], [6], captures the aforementioned idea. It allows one to map a pair of vectors not only onto scalars, but also onto hypersurfaces and n-dimensional polytopes. The use of GA increases the portfolio of geometric shapes and transformations one can represent. Also, its extension to calculus, namely geometric calculus (GC), allows for a clear and compact way to perform calculus with hypercomplex quantities, i.e., elements that generalize complex numbers to higher dimensions [2]–[10].

GA-based AFs were first introduced in [11], [12], where they were successfully employed to estimate the geometric transformation (rotation and translation) that aligns a pair of 3D point clouds (a recurrent problem in computer vision), while keeping a low computational footprint. This work takes a step further: it uses GA and GC to generate a new class of AFs able to naturally estimate hypersurfaces, hypervolumes, and elements of greater dimensions (multivectors), generalizing regular AFs from the literature. Filters like the regular least-mean squares (LMS) – real entries, the complex LMS (CLMS) – complex entries [13], and the quaternion LMS (QLMS) – quaternion entries [14]–[16], are recovered as special cases of the more comprehensive GA-LMS introduced herein.

The text is organized as follows. Section II covers the transition from linear to geometric algebra, presenting the fundamentals of GA and GC. In Section III, the standard quadratic cost function, usually adopted in adaptive filtering, is recovered as a particular case of a more comprehensive quadratic cost function that can only be written using the geometric product. In Section IV the gradient of that cost function is calculated via GC techniques and the GA-LMS is formulated. Section V provides a mean-square analysis (steady state) with the aid of the energy conservation relations [1]. Experiments are shown in Section VI to assess the performance of the GAAFs against the theoretical analysis in a system-identification scenario. Finally, concluding remarks are presented in Section VII.

Remarks about notation: Table I summarizes the notation convention. While those are deterministic quantities, their boldface versions represent random quantities. The name array of multivectors was chosen to avoid confusion with vectors in R^n, which in this text have the usual meaning of a collection of real numbers (row or column). In this sense, an array of multivectors can be interpreted as a "vector" that allows for hypercomplex entries (numbers constructed by adding "imaginary units" to real numbers [17]).
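To make the array-of-multivectors idea concrete, the minimal sketch below (an illustrative assumption for exposition only, not the paper's code; the simulations in Section VI rely on the GAALET C++ library) stores an M-entry array over the 8-dimensional basis of G(R^3) introduced in Section II as an M x 8 table of real coefficients.

```python
BLADES = ["1", "g1", "g2", "g3", "g12", "g23", "g31", "I"]  # basis of G(R^3), see Section II

M = 4
u = [[0.0] * len(BLADES) for _ in range(M)]   # array of M multivectors, all initially zero
u[0][0], u[0][4] = 0.55, 0.71                 # first entry: 0.55 + 0.71*g12 (a rotor-like value)
print(dict(zip(BLADES, u[0])))
```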

TABLE I
SUMMARY OF NOTATION.

a                Arrays of multivectors, vectors, and scalars.
A                General multivector or matrix.
r                Rotor (a type of multivector).
a(i), a_i, A(i)  Time-varying scalar, vector (or array of multivectors), and general multivector (or matrix), respectively.

II. FROM LINEAR TO GEOMETRIC ALGEBRA

To start transitioning from LA to GA, one needs to recall the definition of an algebra over the reals [6], [9], [18]: a vector space V over the field R, equipped with a bilinear product V × V → V denoted by ◦ (the product operation of the algebra), is said to be an algebra over R if the following relations hold ∀{a, b, c} ∈ V and {α, β} ∈ R,

(a + b) ◦ c = a ◦ c + b ◦ c   (left distributivity),
c ◦ (a + b) = c ◦ a + c ◦ b   (right distributivity),
(αa) ◦ (βb) = (αβ)(a ◦ b)    (compatibility with scalars).   (1)

Linear (matrix) algebra, utilized to describe adaptive-filtering theory, is constructed from the definition above. The elements of this algebra are matrices and vectors, which, multiplied among themselves via the matrix product, generate new matrices and vectors. Additionally, to express the notion of vector length and angle between vectors, LA adopts the inner product V × V → R, which returns a real scalar as the result of the multiplication between two vectors in V (one says that V is a normed vector space) [18, p. 180].

Geometric (Clifford) algebra derives from (1), however with a different product operation. Such a product, called the geometric product, is what allows GA to be a mathematical language that unifies different algebraic systems trying to express geometric relations/transformations, e.g., rotation and translation. The following systems are examples of algebras integrated into GA theory: vector/matrix algebra, complex numbers, and quaternions [2], [5], [6], [9]. Such unifying quality is put into use in this work to expand the capabilities of AFs.

The fundamentals of GA are provided in the sequel. For an in-depth discussion, the reader is referred to [2]–[10], [19]–[23].

A. Fundamentals of Geometric Algebra

Consider vectors {a, b} in R^n, i.e., arrays with real entries. The inner product a · b is the standard bilinear form that describes vector length and angle between vectors in linear (matrix) algebra. This way, a · b results in a scalar,

a · b = |a||b| cos θ,   (2)

in which θ is the angle between a and b, and | · | denotes the vector norm (length). Additionally, the inner product is commutative, i.e., a · b = b · a.

The outer product a ∧ b is the usual product of the exterior algebra introduced by Grassmann's Ausdehnungslehre (theory of extension) [2], [3], [9], [24]. It captures the geometric fact that two nonparallel directed segments determine a parallelogram, a notion which cannot be described by the inner product. The multiplication a ∧ b results in an oriented surface or bivector (see Fig. 2a). Such a surface can be interpreted as the parallelogram (hyperplane) generated when vector a is swept along the direction determined by vector b. Alternatively, the outer product can be defined as a function of the angle θ between a and b,

a ∧ b = I_{a,b} |a||b| sin θ,   (3)

where I_{a,b} is the unit bivector¹ that defines the orientation of the hyperplane a ∧ b [2, p. 66]. Note that in the particular case of 3D space, (3) reduces to the cross product a × b = p |a||b| sin θ, where p is the vector normal to the plane containing {a, b}. From Fig. 2a it can be concluded that the outer product is noncommutative, i.e., a ∧ b = −b ∧ a: the orientation of the surface generated by sweeping a along b is opposite to the orientation of the surface generated by sweeping b along a.

¹A unit bivector is the result of the outer product between two unit vectors, i.e., vectors with unitary norm.

Finally, the geometric product² of vectors a and b is denoted by their juxtaposition ab and defined as

ab ≜ a · b + a ∧ b,   (4)

in terms of the inner (·) and outer (∧) products [6, Sec. 2.2]. Note that, due to a ∧ b = −(b ∧ a), the geometric product is noncommutative, i.e., ab ≠ ba in general, and it is associative, a(bc) = (ab)c, {a, b, c} ∈ R^n.

²In this text, from now on, all products are geometric products, unless otherwise noted.

In LA the fundamental elements are matrices/vectors. In a similar way, in GA the fundamental elements are the so-called multivectors (Clifford numbers). The structure of a multivector can be seen as a generalization of complex numbers and quaternions to higher dimensions. For instance, a complex number α + jβ has a scalar part α combined with an imaginary part jβ; quaternions like α + iβ1 + jβ2 + kβ3 expand that by adding two extra imaginary-valued parts. Multivectors generalize this structure, which contains one scalar part and several other parts named grades. Thus, a general multivector A has the form

A = ⟨A⟩_0 + ⟨A⟩_1 + ⟨A⟩_2 + ··· = Σ_g ⟨A⟩_g,   (5)

which is comprised of its g-grades (or g-vectors) ⟨·⟩_g, e.g., g = 0 (scalars), g = 1 (vectors), g = 2 (bivectors, generated via the geometric multiplication of two vectors), g = 3 (trivectors, generated via the geometric multiplication of three vectors), and so on. The grade g = 0 (scalar) is also denoted as ⟨A⟩_0 ≡ ⟨A⟩. Additionally, in R^n, ⟨A⟩_g = 0 for g > n [2].

The ability to sum together scalars, vectors, and hypercomplex quantities in a unique element (multivector) is the foundation on top of which GA theory is built – it allows for "summing apples and oranges" in a well-defined fashion. Section II-B will show how to recover complex numbers and quaternions as special cases of (5).

The multivectors that form the basis of a geometric algebra over the vector space V, denoted G(V), are obtained by multiplying the n vectors that compose the orthonormal basis of V via the geometric product (4). This action generates 2^n multivectors, which implies dim{G(V)} = 2^n [5, p. 19]. Those 2^n multivectors are called blades of G(V). For the special case n = 3 ⇒ V = R^3, with orthonormal basis {γ1, γ2, γ3}, the procedure above yields the following blades

{1, γ1, γ2, γ3, γ12, γ23, γ31, I},   (6)

which together are a basis for G(R^3), with dimension 2^3 = 8. Note that (6) has one scalar, three orthonormal vectors γ_i (basis for R^3), three bivectors (oriented surfaces) γ_ij ≜ γ_i γ_j = γ_i ∧ γ_j, i ≠ j (γ_i · γ_j = 0, i ≠ j), and one trivector (pseudoscalar) I ≜ γ1γ2γ3 = γ123 (Fig. 2b). In general, the pseudoscalar I is defined as the multivector of highest grade in an algebra G(V). It commutes with any multivector in G(V), hence the name pseudoscalar [5, p. 17].

Fig. 2. (a) Visualization of the outer product in R^3. The orientation of the circle defines the orientation of the surface (bivector). (b) The elements of the G(R^3) basis (besides the scalar 1): 3 vectors, 3 bivectors (oriented surfaces) γ_ij, and the trivector I (pseudoscalar/oriented volume). Note that in R^3, ⟨A⟩_g = 0, g > 3 [2, p. 42].

To illustrate the geometric multiplication between elements of G(R^3), take two multivectors C = γ1 and D = 2γ1 + 4γ3. Then, CD = γ1(2γ1 + 4γ3) = γ1 · (2γ1 + 4γ3) + γ1 ∧ (2γ1 + 4γ3) = 2 + 4(γ1 ∧ γ3) = 2 + 4γ13 (a scalar plus a bivector). Another example is provided to highlight the reverse of a multivector A, which is the GA counterpart of complex conjugation in linear algebra, defined as

Ã ≜ Σ_{g=0}^{n} (−1)^{g(g−1)/2} ⟨A⟩_g.   (7)

Thus, given A = ⟨A⟩_0 + ⟨A⟩_1 + ⟨A⟩_2, its reverse is Ã = ⟨Ã⟩_0 + ⟨Ã⟩_1 + ⟨Ã⟩_2 = ⟨A⟩_0 + ⟨A⟩_1 − ⟨A⟩_2. Therefore, the reverse of the CD multiplication above is (CD)~ = 2 − 4γ13. Note that since the 0-grade of a multivector is not affected by reversion, mutually reverse multivectors, say A and Ã, have the same 0-grade, ⟨A⟩_0 = ⟨Ã⟩_0.

It follows that the dimension of the complete GA G(V) can be obtained from its subalgebras, i.e., dim{G(V)} = Σ_{g=0}^{n} dim{G^g(V)} = Σ_{g=0}^{n} (n choose g) = 2^n.

B. Subalgebras and Isomorphisms

The GAAFs are designed to compute with any multivector-valued quantity, regardless of whether it is real, complex, quaternion, etc. Indeed, the real (R), complex (C), and quaternion (H) algebras, commonly used in the adaptive filtering and optimization literature [14]–[16], [25]–[28], are subsets (subalgebras) of the GA created by general multivectors like (5). Thus, to support the generalization of standard AFs achieved by GAAFs (Section IV), this section shows how those subalgebras of interest can be retrieved from G(R^3), the complete GA of R^3, via isomorphisms³.

³For simulation purposes, this paper focuses on the case n = 3, i.e., the subalgebras of G(R^3). However, the GAAFs can work with any subalgebra of G(R^n), ∀n ∈ Z+ \ {0} (Section IV).

The complete geometric algebra of R^3 is obtained by multiplying the elements of (6) via the geometric product. As depicted in Fig. 3, this results in several terms which, linearly combined, represent all the possible multivectors in G(R^3). Also, note that since R^2 ⊂ R^3 ⇒ G(R^2) ⊂ G(R^3), the multiplication table for G(R^2) can be recovered from Fig. 3.

Fig. 3. Multiplication table of G(R^3) via the geometric product. The elements highlighted in magenta constitute the multiplication table of G+(R^2) (even subalgebra isomorphic to the complex numbers). Similarly, the elements enclosed in the green frame form the multiplication table of G+(R^3) (even subalgebra isomorphic to the quaternions).

The even subalgebras [2], [6], [9] of G(R^3) and G(R^2), i.e., those whose elements have only even grades (g = 0, 2, 4, ··· in (5)), denoted G+(R^3) and G+(R^2), are of special interest. One can show that G+(R^2), with basis {1, γ12} (also {1, γ23} and {1, γ31}), is isomorphic to the complex numbers [6]. This is established by identifying the imaginary unit j with the bivector γ12, j = γ12 = γ1γ2 = γ1 ∧ γ2. From Fig. 3 it is known that (γ12)² = −1. Then, j² = (γ12)² = −1, demonstrating the isomorphism. Similarly, G+(R^3), with basis {1, γ12, γ23, γ31}, is shown to be isomorphic to the quaternion algebra via the adoption of the following correspondences: i ↔ −γ12, j ↔ −γ23, k ↔ −γ31, where {i, j, k} are the three imaginary unities of a quaternion. The minus signs are necessary to make the product of two bivectors equal to the third one and not minus the third, e.g., (−γ12)(−γ23) = γ13 = −γ31, just like quaternions, i.e., ij = k, jk = i, and ki = j [6], [29], [30]. Additionally, note that G+(R^2) ⊂ G+(R^3) ⊂ G(R^3).

C. Performing Rotations with Multivectors (Rotors)

The even subalgebra G+(R^n) is also known as the algebra of rotors, i.e., its elements are a special type of multivector (called rotors) able to apply rotations to vectors in R^n [2], [5]. Given a vector x ∈ R^n, it can be rotated by applying the rotation operator r(·)r̃,

x → r x r̃ (rotated),   (8)

where r ∈ G+(R^n) is a rotor, r̃ is its reverse, and r r̃ = 1, i.e., r is a unit rotor. Note that the unity constraint is necessary to prevent the rotation operator from scaling the vector x, i.e., to avoid changing its norm.

A rotor r ∈ G+(R^n) can be generated from the geometric multiplication of two unit vectors in R^n. Given {a, b} ∈ R^n, |a| = |b| = 1, with an angle θ between them, and using Equations (2)–(4), the exponential form of a rotor is [5, p. 107]

r = ab = a · b + a ∧ b = |a||b| cos θ + I_{a,b}|a||b| sin θ = cos θ + I_{a,b} sin θ = e^{I_{a,b} θ}.   (9)

As shown in Fig. 4, x is rotated by an angle of 2θ about the normal of the oriented surface I_{a,b} (rotation axis) [2]. Note that both quantities (θ and I_{a,b}) define the rotor in (9). The rotation estimation enabled by rotors was applied in [11], [12] to devise AFs that estimate the relative rotation between 3D point clouds (PCDs).

Fig. 4. (a) A rotor can be generated from the geometric multiplication of two unit vectors in R^n. (b) Applying the rotation operator: the vector x is rotated by an angle of 2θ about the normal n of the oriented surface I_{a,b} (a hyperplane). In two dimensions (2D), r is a complex number, r̃ its conjugate, and they rotate x about the normal of the complex plane.
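The geometric product (4) extends from vectors to arbitrary basis blades by concatenating indices, anticommuting adjacent swaps, and cancelling repeated indices (γ_i² = 1 in the Euclidean signature assumed throughout). The minimal Python sketch below is an illustration of this rule, not the authors' GAALET-based implementation; it reproduces the worked example CD = 2 + 4γ13 from Section II-A. The same table-free construction works for any G(R^n) with Euclidean signature.

```python
def blade_product(e, f):
    # Geometric product of two basis blades of G(R^n), Euclidean signature
    # (gamma_i gamma_i = 1).  A blade is a sorted tuple of vector indices:
    # () = 1, (1,) = gamma_1, (1, 2) = gamma_12, (1, 2, 3) = I.
    s = list(e) + list(f)
    sign = 1
    for i in range(len(s)):                 # bubble sort; every swap of two distinct
        for j in range(len(s) - 1 - i):     # indices flips the sign, since
            if s[j] > s[j + 1]:             # gamma_i gamma_j = -gamma_j gamma_i
                s[j], s[j + 1] = s[j + 1], s[j]
                sign = -sign
    out = []
    for idx in s:                           # cancel repeated indices in pairs
        if out and out[-1] == idx:          # (gamma_i gamma_i = 1)
            out.pop()
        else:
            out.append(idx)
    return sign, tuple(out)

def gp(A, B):
    # Geometric product of two multivectors stored as {blade: coefficient}.
    C = {}
    for ea, ca in A.items():
        for eb, cb in B.items():
            sign, blade = blade_product(ea, eb)
            C[blade] = C.get(blade, 0.0) + sign * ca * cb
    return C

# Worked example from the text: C = gamma_1, D = 2*gamma_1 + 4*gamma_3.
print(gp({(1,): 1.0}, {(1,): 2.0, (3,): 4.0}))   # {(): 2.0, (1, 3): 4.0} = 2 + 4*gamma_13
```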

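As a follow-up, the sketch below (reusing gp() and the blade dictionaries from the previous sketch; reverse() is an illustrative helper implementing (7)) applies the rotor sandwich of (8)–(9): a rotor built from two unit vectors separated by θ = 45° rotates γ1 by 2θ = 90° within the γ1γ2 plane, onto −γ2 under this orientation convention.

```python
import math

def reverse(A):
    # Reverse of a multivector, Eq. (7): grade g is scaled by (-1)^(g(g-1)/2).
    return {blade: (-1) ** (len(blade) * (len(blade) - 1) // 2) * c
            for blade, c in A.items()}

a = {(1,): 1.0}                                      # unit vector gamma_1
b = {(1,): 1 / math.sqrt(2), (2,): 1 / math.sqrt(2)} # unit vector at theta = 45 degrees
r = gp(a, b)                                         # rotor r = ab = cos(45) + gamma_12 sin(45)
x = {(1,): 1.0}                                      # vector to be rotated
y = gp(gp(r, x), reverse(r))                         # sandwich r x r~ of Eq. (8)
print({k: round(v, 12) for k, v in y.items() if abs(v) > 1e-9})
# -> {(2,): -1.0}: gamma_1 rotated by 2*theta = 90 degrees in the gamma_1-gamma_2
#    plane (the sense of rotation is set by the orientation of I_{a,b}).
```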
III. LINEAR ESTIMATION IN GA

This section introduces an instantaneous quadratic cost function with multivector entries. This is key to expand the estimation capabilities of AFs, generating the GAAFs further ahead in Section IV. Moreover, it is shown that the standard LA counterpart can be recovered as a special case.

A. Definitions

The scalar product between two multivectors is A ∗ B ≜ ⟨AB⟩, i.e., it is the scalar part (0-grade) of the geometric multiplication between A and B (for the special case of vectors, a ∗ b = ⟨ab⟩ = a · b). Its commutativity originates the cyclic reordering property [5]: A ∗ B = ⟨AB⟩ = ⟨BA⟩ = B ∗ A ⇒ ⟨AB···C⟩ = ⟨B···CA⟩.

An array of multivectors is a collection (row or column) of general multivectors. Given M multivectors {U1, U2, ···, UM} in G(R^3), the M × 1 array u collects them as follows

u = col{U1, U2, ···, UM}, with Uj = u(j,0) + u(j,1)γ1 + ··· + u(j,6)γ31 + u(j,7)I,   (10)

i.e., a column whose j-th entry is the multivector Uj. The array is denoted using lower-case letters, the same as scalars and vectors (1-vectors); the meaning of the symbol will be evident from the context. Additionally, this work adopts the notion of a matrix of multivectors as well: a matrix whose elements are multivector-valued.

Given two M × 1 arrays of multivectors, u and w, the array product between them is defined as

u^T w = U1W1 + U2W2 + ··· + UMWM = Σ_{j=1}^{M} Uj Wj,   (11)

which results in a general multivector. The underlying product in each of the terms UjWj, j = {1, ···, M}, is the geometric product. Observe that due to the noncommutativity of the geometric product, u^T w ≠ w^T u in general. Similarly, products involving matrices of multivectors follow the general rules of matrix algebra, however with the geometric product as the underlying product, just like in (11).

The reverse array operation extends the reverse of multivectors to arrays of multivectors. Given the array u in (10), its reverse version, denoted by u∗, is

u∗ = [Ũ1 Ũ2 ··· ŨM].   (12)

Note the similarity with the Hermitian conjugate, its counterpart in LA. From (11) and (12) it follows that

u∗w = Σ_{j=1}^{M} Ũj Wj,   (13)

which results in a general multivector.

The array product between u∗ and u is represented by the notation ||u||² ≜ u∗u. Note that the same notation is employed in LA to represent the squared norm of a vector in R^n. However, from (13) it is known that ||u||² = u∗u is a general multivector, i.e., it is not the pure scalar value which in LA provides a measure of distance. In GA, the distance metric is given by the magnitude of a multivector, defined in terms of the scalar product, e.g., |A| ≜ √(A ∗ Ã) = √(Σ_g |⟨A⟩_g|²), which is indeed a scalar value. Thus, for an array u and a multivector U,

||u||² = u∗u : a multivector,
|U|² = U ∗ Ũ = Ũ ∗ U = Σ_g |⟨U⟩_g|² : a scalar.   (14)

The product between a multivector U and an array w, namely Uw, is defined as the geometric multiplication of U by each entry of w (a procedure similar to multiplying a scalar by a vector in LA). Due to the noncommutativity of the geometric product, Uw ≠ wU in general.

B. General Cost Function in GA

Following the guidelines in [5, p. 64 and p. 121], one can formulate a minimization problem in GA by defining the following cost function

J(D, Ak, X, Bk) = |D − Σ_{k=1}^{M} Ak X Bk|²,   (15)

where D, Ak, X, Bk are general multivectors. The term Σ_{k=1}^{M} Ak X Bk (the sum of M multiplications) represents the canonical form of a multilinear transformation applied to the multivector X [5, p. 64 and p. 121]. The goal is to optimize {Ak, Bk} in order to minimize (15).

Special cases of (15) are obtained depending on the values chosen for D, Ak, X, Bk and M. For instance, making M = 1, D = y (1-vector), X = x (1-vector), Ak = r (rotor), and Bk = r̃ yields J(r) = |y − r x r̃|², the instantaneous version of the cost function minimized in [11], [12] (subject to r r̃ = r̃ r = 1) to estimate the relative rotation between 3D PCDs.

In this paper the case X = Ũk, Ak = 1, Bk = Wk (general multivectors) is studied, so that

J(w) = |D − Σ_{k=1}^{M} Ũk Wk|² = |D − u∗w|² = |E|²,   (16)

where M now also represents the system order (the number of taps in the filter), E = D − u∗w is the estimation error, and the definition of the array product (13) was employed to make Σ_{k=1}^{M} Ũk Wk = u∗w. Notice that (16) is the GA counterpart of the standard LA cost function [1, p. 477]. And similarly to its LA counterpart, D (a multivector) is estimated as a linear combination of the multivector entries of u.
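To tie the definitions above together, the short sketch below (same {blade: coefficient} representation and gp()/reverse() helpers as the Section II sketches; all numerical values are illustrative) evaluates the array product (13) and the scalar cost (16) for a toy M = 2 array.

```python
def sq_magnitude(A):
    # |A|^2 = A * A~ : with an orthonormal blade basis this is the sum of the
    # squared blade coefficients, a pure scalar (cf. Eq. (14)).
    return sum(c * c for c in A.values())

u = [{(1,): 1.0, (1, 2): 0.5},           # U_1 = g1 + 0.5*g12
     {(): 2.0, (3,): -1.0}]              # U_2 = 2 - g3
w = [{(): 1.0}, {(2,): 3.0}]             # W_1 = 1, W_2 = 3*g2

uw = {}                                  # u* w = U~_1 W_1 + U~_2 W_2, Eq. (13)
for Uj, Wj in zip(u, w):
    for blade, c in gp(reverse(Uj), Wj).items():
        uw[blade] = uw.get(blade, 0.0) + c

D = {(): 1.0, (1,): -2.0}                # desired multivector
E = {b: D.get(b, 0.0) - uw.get(b, 0.0) for b in set(D) | set(uw)}
print(sq_magnitude(E))                   # cost J = |E|^2 of Eq. (16)  ->  55.25
```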

IV. GEOMETRIC-ALGEBRA ADAPTIVE FILTERS

In this section, the GAAFs are motivated by deriving the GA-LMS to minimize the cost function (16) in an adaptive manner. This is done by writing (16) at instant i, yielding an instantaneous cost function J(i) [31], [32], as shown in the sequel.

At each iteration i, two multivector-valued signals D(i) and U(i) are observed. Samples of U(i) are collected into an M × 1 array ui such that ui∗ = [Ũ1(i) Ũ2(i) ··· ŨM(i)]. The array product D̂(i) = ui∗ wi−1 is used as an estimate of D(i), generating the a priori output estimation error

E(i) = D(i) − D̂(i) = D(i) − ui∗ wi−1.   (17)

This way, from (16) and (17),

J(wi−1) = |D(i) − ui∗ wi−1|² = |E(i)|².   (18)

Given E(i) and wi−1, the GAAFs update the estimate for the array of multivectors w via a recursive rule of the form

wi = wi−1 + µ h,   (19)

where µ (a scalar) is the AF step size (learning rate), and h is an array of multivectors related to the estimation error E(i). This work adopts an instantaneous steepest-descent rule [1], [31], [32], in which the AF is designed to follow the opposite direction of the instantaneous gradient of (18), namely ∂wJ(wi−1). This way, h ≜ −B ∂wJ(wi−1), yielding the general form of a GA-based adaptive rule

wi = wi−1 − µ B [∂wJ(wi−1)],   (20)

in which B is a matrix with multivector entries. Choosing B appropriately is a requirement to define the type of adaptive algorithm [1].

A. A Note on the Multivector Derivative

Equation (20) requires the calculation of a multivector derivative. In GA, the differential operator ∂w has the algebraic properties of a multivector in G(R^n) [33]. Put differently, the gradient ∂wJ(wi−1) can be interpreted as the geometric multiplication between the multivector-valued quantities ∂w and J(wi−1).

This way, it follows from (5) that ∂w can be decomposed into its basis blades. In fact, it is known that any multivector A ∈ G(R^n) can be decomposed into blades via

A = Σ_{k=0}^{2^n−1} λk (λ^k ∗ A) = Σ_{k=0}^{2^n−1} λk ⟨λ^k A⟩ = Σ_{k=0}^{2^n−1} λk Ak,   (21)

in which Ak is scalar-valued, and {λ^k} and {λk}, k = 0, ···, 2^n−1, are the so-called reciprocal (dual) bases of G(R^n)⁴. The concept of reciprocal (dual) bases is a useful analytical tool in GA to convert from nonorthogonal to orthogonal vectors and vice versa – it simplifies the analytical procedure ahead since mutually orthogonal elements cancel out. Details are provided in [5, Section 1-3] and [6]. For the purpose of this paper, it suffices to know that the following holds for dual bases: λ^k ∗ λj = ⟨λ^k λj⟩ = δ_{k,j}, where δ_{k,j} = 1 for k = j (Kronecker delta). It is easy to show that the basis {λk} for G(R^n), k = 0, ···, 2^n−1, and its reversed version {λ̃k} comply with the relation above (i.e., they are dual bases). Therefore, they are utilized from now on to decompose multivectors into blades. In particular, applying (21) to ∂w results in

∂w ≜ Σ_{`=0}^{2^n−1} λ` ⟨λ̃` ∂w⟩ = Σ_{`=0}^{2^n−1} λ` ∂_{w,`},   (22)

where each term ∂_{w,`} in the sum is the usual partial derivative from standard calculus, which affects only blade `. Form (22) provides some analytical advantages (see [34]) and is employed next to calculate the gradient ∂wJ(wi−1).

⁴Two symbols are used to refer to GA bases, each with a different purpose. The symbol γij (with two indexes) is adopted when dealing with specific algebras, e.g., G(R^2) or G(R^3). For general algebras G(R^n) and analytical derivation, the symbol λk (with only one index) is more appropriate.

B. Calculating the Multivector-valued Gradient

Noticing that (18) can be written in terms of the scalar product

J(wi−1) = |E(i)|² = E(i) ∗ Ẽ(i),   (23)

one can decompose it in terms of its blades via (21),

J(wi−1) = (Σ_{p=0}^{2^n−1} λp Ep) ∗ (Σ_{p=0}^{2^n−1} Ep λ̃p) = Σ_{p=0}^{2^n−1} Ep² (λp ∗ λ̃p) = Σ_{p=0}^{2^n−1} Ep² ⟨λp λ̃p⟩ = Σ_{p=0}^{2^n−1} Ep².   (24)

The gradient ∂wJ(wi−1) is obtained by multiplying (22) and (24),

∂wJ(wi−1) = (Σ_{`=0}^{2^n−1} λ` ∂_{w,`})(Σ_{p=0}^{2^n−1} Ep²) = Σ_{p,`=0}^{2^n−1} λ` ∂_{w,`} Ep².   (25)

From (17) one can notice that Ep ≜ Dp − D̂p, where D̂p = ⟨D̂(i)⟩p = ⟨ui∗ wi−1⟩p (similarly for Dp). Thus

∂_{w,`} Ep² = 2Ep(∂_{w,`} Ep) = 2Ep ∂_{w,`}(Dp − D̂p) = −2Ep(∂_{w,`} D̂p),   (26)

where ∂_{w,`} Dp = 0 since Dp does not depend on the weight array w. Plugging (26) into (25) results in

∂wJ(wi−1) = −2 Σ_{p,`=0}^{2^n−1} λ` Ep (∂_{w,`} D̂p).   (27)

The term D̂p is obtained by decomposing D̂(i) into its blades,

D̂(i) = ui∗ wi−1 = Σ_{p=0}^{2^n−1} λp ⟨λ̃p (ui∗ wi−1)⟩,   (28)

which requires the decomposition of ui∗ and wi−1 (arrays of multivectors). Indeed, arrays of multivectors can be decomposed into blades as well. For instance,

[1 + γ12 ; 2 + γ1 + γ3] = [1 ; 2] + [0 ; 1]γ1 + [0 ; 1]γ3 + [1 ; 0]γ12,   (29)

in which [· ; ·] denotes a 2 × 1 array. Thus, employing (21) once again, ui and wi−1 can be written in terms of their 2^n blades⁵

ui∗ = Σ_{q=0}^{2^n−1} u_q^T λ̃q and wi−1 = Σ_{t=0}^{2^n−1} λt w_t,   (30)

where u_q^T and w_t are respectively 1 × M and M × 1 arrays with real entries. Plugging (30) back into (28),

D̂(i) = ui∗wi−1 = Σ_{p=0}^{2^n−1} λp ⟨λ̃p (Σ_{q=0}^{2^n−1} u_q^T λ̃q Σ_{t=0}^{2^n−1} λt w_t)⟩
     = Σ_{p=0}^{2^n−1} λp Σ_{q,t=0}^{2^n−1} ⟨λ̃p λ̃q λt⟩ (u_q · w_t)
     = Σ_{p=0}^{2^n−1} λp D̂p,   (31)

in which

D̂p = Σ_{q,t=0}^{2^n−1} ⟨λ̃p λ̃q λt⟩ (u_q · w_t), p = 0, ···, 2^n−1,   (32)

is the expression of D̂p as a function of the blades of ui∗ wi−1.

⁵From now on, the iteration subscripts i and i−1 are omitted from u_{i,q} and w_{i−1,t} for clarity purposes.

From (32), the term ∂_{w,`} D̂p of (27) becomes

∂_{w,`} D̂p = ∂_{w,`}[Σ_{q,t=0}^{2^n−1} ⟨λ̃p λ̃q λt⟩ (u_q · w_t)] = Σ_{q,t=0}^{2^n−1} ⟨λ̃p λ̃q λt⟩ ∂_{w,`}(u_q · w_t).   (33)

It is important to notice that ∂_{w,`}(u_q · w_t) is different from zero only when ` = t, i.e., when ∂_{w,`} and w_t refer to the same blade. This is the case since ∂_{w,`} is the partial derivative with respect to the blade ` of w only. Therefore, if ` ≠ t the partial derivation yields zero, i.e., ∂_{w,`} w_t = 0 ⇒ ∂_{w,`}(u_q · w_t) = 0 (note that u_q does not depend on w). Thus, adopting the Kronecker delta function [34], ∂_{w,`}(u_q · w_t) = δ_{t,`} u_q^T, and (33) becomes

∂_{w,`} D̂p = Σ_{q,t=0}^{2^n−1} ⟨λ̃p λ̃q λt⟩ δ_{t,`} u_q^T.   (34)

Finally, substituting (34) into (27), the stochastic gradient is obtained,

∂wJ(wi−1) = −2 Σ_{p,`=0}^{2^n−1} λ` Ep Σ_{q,t=0}^{2^n−1} ⟨λ̃p λ̃q λt⟩ δ_{t,`} u_q^T
          = −2 Σ_{p,`=0}^{2^n−1} Ep Σ_{q=0}^{2^n−1} λ` ⟨λ̃p λ̃q λ`⟩ u_q^T
          = −2 Σ_{p,`=0}^{2^n−1} Ep λ` ⟨λ̃p ui∗ λ`⟩
          = −2 Σ_{p,`=0}^{2^n−1} Ep λ` ⟨λ̃` ui λp⟩
          = −2 Σ_{p=0}^{2^n−1} Ep ui λp = −2 ui E(i).   (35)

In the AF literature, setting B equal to the identity matrix in (20) results in the stochastic-gradient update rule [1]. This is adopted here as well in order to devise the GA version of the LMS filter – however, GA allows for selecting B with multivector entries, opening up the possibility to generate other types of GA-based adaptive filters. Substituting (35) into (20) and setting B equal to the identity matrix yields the GA-LMS update rule

wi = wi−1 + µ ui E(i),   (36)

where the factor 2 in (35) was absorbed by the scalar step size µ.

Note that the GA-LMS (36) has the same format as standard LMS AFs [31], namely the real-valued LMS (u and w have real-valued entries) and the complex-valued LMS (u and w have complex-valued entries). However, this work puts no constraints on the entries of the arrays u and w – they can be any kind of multivector. This way, the update rule (36) is valid for any u and w whose entries are general multivectors in G(R^n). In other words, (36) generalizes the standard LMS AF for several types of u and w entries: general multivectors, rotors, quaternions, complex numbers, real numbers – any subalgebra of G(R^n).

This is a very interesting result, accomplished due to the comprehensive analytic tools provided by Geometric Calculus. Recall that, in adaptive filtering theory, the transition from real-valued AFs to complex-valued AFs requires one to abide by the rules of differentiation with respect to complex-valued variables, represented by the Cauchy-Riemann conditions (see [1, p. 25]). Similarly, quaternion-valued AFs require further differentiation rules that are captured by the Hamilton-real (HR) calculus [14]–[16] and its generalized version (GHR) [28]. Although those approaches are successful, each time the underlying algebra is changed, the analytic tools need an update as well. This is not the case if one resorts to GA and GC to address the minimization problem: the calculations are always performed in the same way.
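For concreteness, a minimal sketch of one GA-LMS iteration, combining (17) and (36), is given below. It reuses the dict-based helpers gp() and reverse() from the Section II sketches; scale(), add(), and the function names are illustrative choices, not the authors' GAALET/C++ implementation.

```python
def scale(A, s):
    # Multiply every blade coefficient of a multivector by the real scalar s.
    return {blade: s * c for blade, c in A.items()}

def add(A, B):
    # Sum of two multivectors in the {blade: coefficient} representation.
    C = dict(A)
    for blade, c in B.items():
        C[blade] = C.get(blade, 0.0) + c
    return C

def array_product_reverse(u, w):
    # u* w = sum_j reverse(U_j) W_j, Eq. (13): in general a full multivector.
    out = {}
    for Uj, Wj in zip(u, w):
        out = add(out, gp(reverse(Uj), Wj))
    return out

def ga_lms_step(w, u, D, mu):
    # One GA-LMS iteration:  E(i) = D(i) - u_i* w_{i-1}   (17)
    #                        w_i  = w_{i-1} + mu u_i E(i) (36),
    # where u_i E(i) multiplies each entry U_j(i) by E(i) (order matters in GA).
    E = add(D, scale(array_product_reverse(u, w), -1.0))
    w_new = [add(Wj, scale(gp(Uj, E), mu)) for Uj, Wj in zip(u, w)]
    return w_new, E
```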

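A hypothetical usage loop in a system-identification setup similar to Section VI follows; the data model D(i) = ui∗ w^o + V(i) is the stationary model assumed later in Section V, while the specific constants and helper names are illustrative only.

```python
import random

def rand_mv(std, blades):
    # Random multivector: each blade coefficient drawn i.i.d. from N(0, std^2).
    return {b: random.gauss(0.0, std) for b in blades}

BLADES = [(), (1,), (2,), (3,), (1, 2), (2, 3), (1, 3), (1, 2, 3)]
M, mu = 10, 0.005
noise_std = 1e-3 ** 0.5          # sigma_v^2 = 1e-3, as in the Fig. 6 setup

# Rotor-valued taps reminiscent of Section VI-B: 0.55 + 0.71*g12 + 1.3*g23 + 4.5*g31
# (g31 = -g13, hence the negative coefficient stored on the sorted blade (1, 3)).
w_o = [{(): 0.55, (1, 2): 0.71, (2, 3): 1.3, (1, 3): -4.5} for _ in range(M)]
w = [{} for _ in range(M)]       # initial guess w_{-1} = 0

for i in range(2000):
    u = [rand_mv(1.0, BLADES) for _ in range(M)]                  # regressor array u_i
    D = add(array_product_reverse(u, w_o), rand_mv(noise_std, BLADES))
    w, E = ga_lms_step(w, u, D, mu)                               # GA-LMS recursion (36)
```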
V. MEAN-SQUARE ANALYSIS (STEADY STATE)

The goal of the analysis is to derive an expression for the steady-state mean-square error (MSE) of GAAFs via energy conservation relations (ECR) [1]. To achieve that, first some quantities and metrics are recast into GA.

A. Preliminaries

A random multivector is one whose blade coefficients are random variables. For instance, a random multivector in G(R^3) is

A = a(0) + a(1)γ1 + a(2)γ2 + a(3)γ3 + a(4)γ12 + ··· + a(7)I,   (37)

where the terms a(0), ···, a(7) are independent and identically distributed (i.i.d.) real-valued random variables. By extension, random arrays are arrays of random multivectors.

The ECR technique is an energy balance in terms of the following (random) error quantities

∆wi−1 ≜ w^o − wi−1   (weight-error array),
Ea(i) = ui∗ ∆wi−1   (a priori estimation error),
Ep(i) = ui∗ ∆wi   (a posteriori estimation error),   (38)

together with the AF's recursion.

The stationary data model is captured by the following set of assumptions:

(1) There exists an array of multivectors w^o such that D(i) = ui∗ w^o + V(i);
(2) The noise sequence {V(i)} is i.i.d. with constant variance E⟨Ṽ(i)V(i)⟩;
(3) The noise sequence {V(i)} is independent of uj for all i, j, and of all other data;
(4) The initial condition w−1 is independent of all {D(i), ui, V(i)};
(5) The regressor covariance matrix is Ru = E ui ui∗ > 0;
(6) The random quantities {D(i), ui, V(i)} are zero mean.   (39)

As in linear algebra, the steady-state MSE in GA must be scalar-valued. To this end, the MSE is defined as

MSE = ξ ≜ lim_{i→∞} E⟨||E(i)||²⟩ = lim_{i→∞} E⟨Ẽ(i)E(i)⟩,   (40)

where ||·||², defined in (14), is applied to compactly denote the geometric product Ẽ(i)E(i).

From the stationary linear data model (39),

E(i) = D(i) − ui∗ wi−1 = ui∗(w^o − wi−1) + V(i) = Ea(i) + V(i).   (41)

The term Ea(i) is the a priori error, from which the steady-state excess mean-square error (EMSE) is defined,

EMSE = ζ ≜ lim_{i→∞} E⟨||Ea(i)||²⟩ = lim_{i→∞} E⟨Ẽa(i)Ea(i)⟩.   (42)

From (41), E⟨Ẽ(i)E(i)⟩ = E⟨Ẽa(i)Ea(i)⟩ + E⟨Ṽ(i)V(i)⟩, since V(i) is independent of any other random quantity and its samples are assumed to be drawn from a zero-mean white Gaussian process. Therefore, applying (40) and (42), MSE = EMSE + E⟨Ṽ(i)V(i)⟩, analogous to the LA case.

B. Steady-State Analysis

The ECR technique performs an interplay between the energies of the weight-error array ∆w and the error E at two successive time instants, i − 1 (a priori) and i (a posteriori). As a result, an expression for the variance relation is obtained, which is then particularized for each AF of interest. For details on the ECR procedure, please refer to [1, p. 228].

Consider a GAAF whose update rule has the following general shape

wi = wi−1 + µ ui f(E(i)),   (43)

where f(·) is a multivector-valued function of the estimation error E(i). Depending on the type of the GAAF (LMS, NLMS, etc.), f(·) assumes a specific value.

Subtracting (43) from the optimal weight array w^o yields

∆wi = ∆wi−1 − µ ui f(E(i)),   (44)

in which ∆wi = w^o − wi. Multiplying from the left by ui∗ (array product) results in

ui∗ ∆wi = ui∗ [∆wi−1 − µ ui f(E(i))]
Ep(i) = Ea(i) − µ ||ui||² f(E(i)),   (45)

where Ep(i) = ui∗ ∆wi is the a posteriori error, Ea(i) = ui∗ ∆wi−1 is the a priori error (see (38)), and in the last equation ||ui||² = ui∗ ui (see (14)).

The multivector ||ui||² is assumed to be factorable into a product of invertible vectors⁶ [5, p. 14], which guarantees the existence of a multiplicative inverse Γ(i) ≜ (||ui||²)^{−1}. This allows for solving (45) for f(E(i)),

f(E(i)) = µ^{−1} Γ(i)[Ea(i) − Ep(i)],   (46)

which substituted into (44) results in

∆wi + ui Γ(i) Ea(i) = ∆wi−1 + ui Γ(i) Ep(i).   (47)

⁶This assumes that ||ui||² has only one type of grade: vector, or bivector, or trivector, and so on. In practice, ||ui||² can be a general multivector composed of different grades. However, such an assumption allows for a clearer analysis procedure, and ultimately does not compromise the accuracy of the EMSE expression, as shown in the simulations.

Taking the squared magnitude of both sides of (47),

|∆wi + ui Γ(i) Ea(i)|² = |∆wi−1 + ui Γ(i) Ep(i)|².   (48)

The left-hand side (LHS) is expanded as

LHS = (∆wi + ui Γ(i) Ea(i)) ∗ (∆wi + ui Γ(i) Ea(i))~,   (49)

in which ∗ is the GA scalar product and ~ denotes the reverse. Further expansion gives

LHS = |∆wi|² + |ui Γ(i) Ea(i)|² + ∆wi ∗ (Ẽa(i) Γ(i) ui∗) + (ui Γ(i) Ea(i)) ∗ ∆w̃i,   (50)

in which Γ(i) = Γ̃(i) since (||ui||²)~ = ||ui||² holds (see the discussion after (14)). Applying the definition of the GA scalar product and observing that the third and fourth terms of (50) are each other's reverse (i.e., their 0-grades are the same), their sum can be written as 2⟨Ẽa(i) Γ(i) ui∗ ∆wi⟩, where the cyclic reordering property was used. Note that the term ui∗ ∆wi is the definition of the a posteriori error Ep(i) (see (45)). This way, (49) assumes the form

LHS = |∆wi|² + 2⟨Ẽa(i) Γ(i) Ep(i)⟩ + |ui Γ(i) Ea(i)|².   (51)

A similar procedure allows for expanding the right-hand side (RHS) of (48) as

RHS = |∆wi−1|² + 2⟨Ẽp(i) Γ(i) Ea(i)⟩ + |ui Γ(i) Ep(i)|².   (52)

Substituting (51) and (52) into (48), and noting that the terms enclosed by the 0-grade operator are each other's reverse (leading to mutual cancellation of their 0-grades),

|∆wi|² + |ui Γ(i) Ea(i)|² = |∆wi−1|² + |ui Γ(i) Ep(i)|²,   (53)

which is an energy relation balancing a priori and a posteriori terms. Taking the expectation of the terms of (53) with respect to the random quantities D(i) and ui results in

E|∆wi|² + E|ui Γ(i) Ea(i)|² = E|∆wi−1|² + E|ui Γ(i) Ep(i)|².   (54)

Taking the limit of (54) as i → ∞ gives

E|ui Γ(i) Ea(i)|² = E|ui Γ(i) Ep(i)|², i → ∞,   (55)

in which the steady-state condition E|∆wi|² = E|∆wi−1|² = constant as i → ∞ was employed. Plugging (45) into (55) results in

E|ui Γ(i) Ea(i)|² = E|ui Γ(i)(Ea(i) − µ ||ui||² f)|².   (56)

The right-hand side of (56) is expanded as

E|ui Γ(i) Ea(i)|² − 2µ E⟨ui Γ(i) Ea(i) f̃ ui∗⟩ + µ² E|ui f|².   (57)

Plugging (57) back into (56) and cancelling out the term E|ui Γ(i) Ea(i)|² on both sides results in

2µ E⟨ui Γ(i) Ea(i) f̃ ui∗⟩ = µ² E|ui f|².   (58)

Finally, applying the cyclic reordering property on the left-hand side of (58) to make ui∗ ui Γ(i) = 1 yields the variance relation

2 E⟨Ea(i) f̃⟩ = µ E|ui f|².   (59)

For the GA-LMS, f(E(i)) = E(i) = Ea(i) + V(i) (see (41)). Substituting into (59),

2 E⟨Ea(i)(Ẽa(i) + Ṽ(i))⟩ = µ E|ui(Ea(i) + V(i))|².   (60)

The left-hand side of (60) becomes

LHS (60) = 2 E⟨Ea(i)Ẽa(i)⟩ + 2 E⟨Ea(i)Ṽ(i)⟩ = 2 E|Ea(i)|² + 2 E Ea(i) ∗ E Ṽ(i) = 2 E|Ea(i)|²,   (61)

where once more the independence of V(i) was utilized, and E Ṽ(i) = E V(i) = 0 (its entries are drawn from a zero-mean white Gaussian process).

The right-hand side of (60) is expanded as

RHS (60) = µ E⟨ui(Ea(i) + V(i)) ∗ (Ẽa(i) + Ṽ(i))ui∗⟩ = µ E ||ui||²||Ea(i)||² + µ E ||ui||²||V(i)||².   (62)

Substituting (61) and (62) into (60) yields

2 E|Ea(i)|² = µ E ||ui||²||Ea(i)||² + µ E ||ui||²||V(i)||².   (63)

Observing that E|Ea(i)|² = E||Ea(i)||², (63) is rewritten as

E (2 − µ||ui||²) ||Ea(i)||² = µ E ||ui||²||V(i)||².   (64)

Adopting the separation principle (see [1, p. 245]), i.e., in steady state ||ui||² is independent of E(i) (and consequently of Ea(i)), (64) becomes

(2 − µ E||ui||²) E||Ea(i)||² = µ E||ui||² E||V(i)||².   (65)

From the Appendix it is known that the quantities E||ui||² and E||V(i)||² depend on the underlying (sub)algebra. For G(R^n), E||V(i)||² and E||ui||² are obtained via (69) and (71) respectively, which substituted into (65) yields

(2 − µM(2^n σu²)) E||Ea(i)||² = µM(2^n σu²)(2^n σv²),   (66)

where M is the regressor length (number of filter taps), and σu², σv² are the per-coefficient variances of the regressor and noise blades, respectively. It is important to notice that, since (71) is obtained considering that the coefficients of the regressor entries are drawn from a circular Gaussian process (see the Appendix), the present analysis holds only for that kind of input.

Finally, the expression for the GA-LMS steady-state EMSE using the complete algebra G(R^n) is given by

ζ = µM 4^n σu² σv² / (2 − µM 2^n σu²), i → ∞.   (67)

Table II summarizes the EMSE for the even subalgebras of interest. Notice that for G+(R) the EMSE of the LMS with real-valued entries is recovered (compare to Eq. 16.10 in [1, p. 246] for white Gaussian inputs). To obtain the respective MSE, one should add E⟨Ṽ(i)V(i)⟩ = E||V(i)||² to the EMSE value, as aforementioned.

TABLE II
STEADY-STATE EMSE FOR THE GA-LMS USING EVEN ALGEBRAS.

G+(R^n) (even algebras):  ζ = µM [Σ_k C(n,2k)]² σu² σv² / (2 − µM σu² Σ_k C(n,2k)),  for k ∈ N
G+(R^3) (quaternions):    ζ = 16 µM σu² σv² / (2 − 4 µM σu²)
G+(R^2) (complex):        ζ = 4 µM σu² σv² / (2 − 2 µM σu²)
G+(R)  (real):            ζ = µM σu² σv² / (2 − µM σu²)

VI. SIMULATIONS

This section shows the performance of the computational implementation of the GAAFs in a system identification task⁷. The optimal weight array w^o to be estimated has M multivector-valued entries (number of taps), namely w^o = [W1^o W2^o ··· WM^o]^T. Each case studied in the sequel (multivector, rotor, complex, and real entries) adopts a different value for Wj^o (highlighted ahead). As aforementioned, the measurement noise multivector V has each of its coefficients drawn from a white Gaussian stationary process with variance σV².

⁷All the AFs were implemented using the Geometric Algebra Expression Templates (GAALET) [35], a C++ library for the evaluation of GA expressions, and OpenGA [36]. The reader is encouraged to follow the instructions on openga.org/ieeetsp.html in order to reproduce the simulations.

A. Multivector Entries

The underlying geometric algebra in this case is G(R^n), with n = 3, i.e., the one whose multivectors are described by basis (6). The derivation of the GAAFs puts no restriction on the values the vector-space dimension n can assume. However, the case n = 3 (generating a GA with dimension 8) is an example that captures the core idea of this work: the GAAFs can estimate hypercomplex quantities which generalize real, complex, and quaternion entries.

In this case all the multivector entries of w^o are the same, namely Wj^o = {0.55 + 0γ1 + 1γ2 + 2γ3 + 0.71γ12 + 1.3γ23 + 4.5γ31 + 3I} for j = 1, ···, M. Those values were selected arbitrarily. Note that the coefficient of γ1 is zero; it is nonetheless displayed to emphasize the structure of the G(R^3) basis.

Fig. 5 (top) shows several learning curves (MSE and EMSE) for the GA-LMS estimating the weight array w^o with M = 10. The step size value is µ = 0.005 for all simulated curves. Notice the close agreement between the theoretical error levels (obtained from (67) with n = 3) and the simulated steady-state error. Those experiments show that the GA-LMS is indeed capable of estimating multivector-valued quantities, supporting what was previously devised in Section IV. Fig. 5 (bottom) depicts the steady-state error as a function of the system order (number of taps) M for σV² = {10⁻², 10⁻³, 10⁻⁵}. Theory and experiment agree throughout the entire tested range M = [1, 40].

Fig. 5. GA-LMS for multivector entries. (Top) EMSE learning curves for µ = 0.01, M = 10, and σV² = {10⁻², 10⁻³, 10⁻⁵}, averaged over 100 experiments. (Middle) Steady-state EMSE as a function of the step size µ for M = 10. Note the AF stands on the brink of divergence for µ > 0.02. (Bottom) Steady-state EMSE as a function of the number of taps M for µ = 0.01. For M > 23 the AF starts diverging. The simulated steady-state value is obtained by averaging the last 200 points of the ensemble-average learning curve for each µ and M.

B. Rotor, Complex, and Real Entries

Fig. 6 depicts the EMSE curves for three types of GA-LMS: G+(R^3) (isomorphic to quaternions), G+(R^2) (isomorphic to complex numbers), and G+(R) (isomorphic to reals). All filters have M = 10, µ = 0.005, and σV² = 10⁻³. However, each AF has different entries for the optimal weight array w^o: Wj^o = {0.55 + 0.71γ12 + 1.3γ23 + 4.5γ31} for rotors; Wj^o = {0.55 + 0.71γ12} for complex; and Wj^o = {0.55} for reals, with j = 1, ···, M.

The AFs are shown to be capable of estimating their respective optimal weight arrays with very good agreement with the theoretical values (Table II). Due to the aforementioned isomorphisms, those filters become alternatives to their LA counterparts, namely the quaternion-LMS (QLMS) [14]–[16], [28], the complex-LMS (CLMS) [13], and the real-LMS (LMS) [1], [31].

Fig. 6. EMSE learning curves for M = 10, µ = 0.005, and σV² = 10⁻³ for rotor, complex, and real entries (100 experiments). Notice how the theoretical and experimental values agree.

VII. CONCLUSION

The formulation of GA-based adaptive techniques is still in its infancy. The majority of AF algorithms available in the literature resorts to specific subalgebras of GA (real numbers, complex numbers, and quaternions). Each of them requires a specific set of tools in order to pose the problem and perform calculus. In this sense, the development of the GAAFs is an attempt to unify those different adaptive-filtering approaches under the same mathematical language. Additionally, as shown throughout the text, GAAFs have improved estimation capabilities since they are not limited to 1-vector estimation (like LA-based AFs); instead, they can naturally estimate any kind of multivector. Also, for the GA-LMS, the shape of its update rule is invariant with respect to the multivector subalgebra. This is only possible due to the use of GA and GC.

On top of the theoretical contributions, the experimental validation provided in Section VI shows that the GAAFs are successful in a system identification task. Nevertheless, it is expected that any estimation problem posed in terms of hypercomplex quantities will benefit from this work. For instance, GAAFs may be useful in data fusion, where different signals are supposed to be integrated in the same "package" and then processed. The multivector (and, by extension, the array of multivectors) can be interpreted as a fundamental information package that aggregates scalar, vector, bivector, and higher-grade quantities.

New types of GAAFs are currently under study, particularly the NLMS and RLS variants. Together with the transient analysis of the GA-LMS and the introduction of noncircularity conditions [25], [26], they figure as subjects of future publications. Also, given the connection between exterior and tensor algebras [37], it would be interesting to investigate how GAAFs and tensor-product AFs [38], [39] are related.

APPENDIX

Calculating the expectation E||V(i)||²: Take a random multivector V ∈ G(R^3) (see (37)), V = v(0) + v(1)γ1 + ··· + v(6)γ31 + v(7)I, where v(k), k = 0, ···, 7, are i.i.d. real-valued random variables drawn from a zero-mean stationary white Gaussian process. Performing the geometric product ṼV = ||V||² and calculating its expectation results in

E ṼV = E v²(0) + E v²(1) + E v²(2) + E v²(3) + ··· + E v²(7),   (68)

in which the expectations of the cross-terms are zero due to the i.i.d. assumption above. Each term E v²(k) is the variance of v(k), denoted E v²(k) ≜ σv². This way, (68) becomes E ṼV = 8σv², V ∈ G(R^3). Note that in general E ṼV = dim{G^g(R^n)} σv² for V ∈ G^g(R^n), in which G^g(R^n) can be any subspace of G(R^n). When the complete geometric algebra G(R^n) is used,

E||V(i)||² = E ṼV = 2^n σv², V ∈ G(R^n).   (69)

Calculating the expectation E||ui||²: The regressor array ui is a collection of M random multivectors Uj ∈ G(R^n), j = 1, ···, M. Analogously to the LA case, the regressor covariance matrix is calculated as Ru = E ui ui∗. Its trace is Tr(Ru) = E ui∗ ui = E||ui||², a multivector-valued quantity obtained via (13), E u∗u = E Ũ1U1 + E Ũ2U2 + ··· + E ŨM UM.

For the special case Uj ∈ G(R^3), the geometric product ŨjUj is

ŨjUj = u²(j,0) + u(j,0)u(j,1)γ1 + ··· + u(j,0)u(j,7)I
     + u²(j,1) + u(j,0)u(j,1)γ1 + ··· + u(j,1)u(j,5)I
     + ···
     + u²(j,7) + u(j,7)u(j,5)γ1 + ··· + u(j,7)u(j,0)I,   (70)

where each real coefficient u(j,k), k = 0, ···, 7, is an i.i.d. random variable drawn from a zero-mean stationary white Gaussian process. Thus, E ŨjUj = E u²(j,0) + E u²(j,1) + E u²(j,2) + ··· + E u²(j,7), since the expectations of the cross-terms in (70) are zero. Each term E u²(j,k) is the variance of u(j,k), denoted E u²(j,k) ≜ σu² (regressors are assumed to have shift structure). Note that this result is also obtained if Uj, j = 1, ···, M, is a circular Gaussian random multivector (the circularity condition [1, p. 8] for complex-valued random variables is extended here to encompass random multivectors). Such a case considers that the coefficients of a random multivector are independent Gaussian random variables. This way, E ŨjUj = 8σu², which yields E u∗u = M(8σu²), Uj ∈ G(R^3). Note that in general E u∗u = M(dim{G^g(R^n)} σu²) for Uj belonging to G^g(R^n) (any subspace of G(R^n)). When Uj ∈ G(R^n),

E||u||² = E u∗u = M(2^n σu²), Uj ∈ G(R^n).   (71)

REFERENCES

[1] A. H. Sayed, Adaptive Filters, Wiley-IEEE Press, 2008.
[2] D. Hestenes, New Foundations for Classical Mechanics, Fundamental Theories of Physics, Springer, 1999.
[3] M. J. Crowe, A History of Vector Analysis: The Evolution of the Idea of a Vectorial System, Dover Books on Mathematics Series, Dover, 1967.
[4] I. Kleiner, A History of Abstract Algebra, Birkhäuser Boston, 2007.
[5] D. Hestenes and G. Sobczyk, Clifford Algebra to Geometric Calculus: A Unified Language for Mathematics and Physics, Fundamental Theories of Physics, Springer Netherlands, 1987.
[6] E. Hitzer, "Introduction to Clifford's Geometric Algebra," Journal of the Society of Instrument and Control Engineers, vol. 51, no. 4, pp. 338–350, 2012.
[7] C. J. L. Doran, Geometric Algebra and Its Application to Mathematical Physics, Ph.D. thesis, University of Cambridge, 1994.
[8] C. J. L. Doran and A. N. Lasenby, Geometric Algebra for Physicists, Cambridge University Press, 2003.
[9] J. Vaz Jr. and R. da Rocha Jr., An Introduction to Clifford Algebras and Spinors, OUP Oxford, 2016.
[10] L. Dorst, D. Fontijne, and S. Mann, Geometric Algebra for Computer Science: An Object-Oriented Approach to Geometry (The Morgan Kaufmann Series in Computer Graphics), Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2007.
[11] W. B. Lopes, A. Al-Nuaimi, and C. G. Lopes, "Geometric-algebra LMS adaptive filter and its application to rotation estimation," IEEE Signal Processing Letters, vol. 23, no. 6, pp. 858–862, June 2016.
[12] A. Al-Nuaimi, W. B. Lopes, E. Steinbach, and C. G. Lopes, "6DOF point cloud alignment using geometric algebra-based adaptive filtering," in 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), March 2016, pp. 1–9.
[13] B. Widrow, J. McCool, and M. Ball, "The complex LMS algorithm," Proceedings of the IEEE, vol. 63, no. 4, pp. 719–720, April 1975.
[14] C. C. Took and D. P. Mandic, "The quaternion LMS algorithm for adaptive filtering of hypercomplex processes," IEEE Transactions on Signal Processing, vol. 57, no. 4, pp. 1316–1327, April 2009.
[15] D. P. Mandic, C. Jahanchahi, and C. C. Took, "A quaternion gradient operator and its applications," IEEE Signal Processing Letters, vol. 18, no. 1, pp. 47–50, Jan 2011.
[16] C. Jahanchahi, C. C. Took, and D. P. Mandic, "On gradient calculation in quaternion adaptive filtering," in 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2012, pp. 3773–3776.
[17] I. L. Kantor and A. S. Solodovnikov, Hypercomplex Numbers: An Elementary Introduction to Algebras, Springer New York, 2011.
[18] E. B. Vinberg, A Course in Algebra, Graduate Studies in Mathematics, American Mathematical Society, 2003.
[19] J. Lasenby, W. J. Fitzgerald, A. N. Lasenby, and C. J. L. Doran, "New geometric methods for computer vision: An application to structure and motion estimation," International Journal of Computer Vision, vol. 26, no. 3, pp. 191–213, Feb. 1998.
[20] C. Perwass, Geometric Algebra with Applications in Engineering, Geometry and Computing, Springer Berlin Heidelberg, 2009.
[21] D. Hildenbrand, Foundations of Geometric Algebra Computing, Geometry and Computing, Springer Berlin Heidelberg, 2012.
[22] L. Dorst, C. Doran, and J. Lasenby, Applications of Geometric Algebra in Computer Science and Engineering, Birkhäuser Boston, 2012.
[23] G. Sommer, Geometric Computing with Clifford Algebras: Theoretical Foundations and Applications in Computer Vision and Robotics, Springer Berlin Heidelberg, 2013.
[24] H. Grassmann and London Mathematical Society, Ausdehnungslehre, American Mathematical Society, 2000.
[25] D. P. Mandic and V. S. L. Goh, Complex Valued Nonlinear Adaptive Filters: Noncircularity, Widely Linear and Neural Models, John Wiley & Sons, 2009.
[26] F. G. A. Neto and V. H. Nascimento, "A novel reduced-complexity widely linear QLMS algorithm," in 2011 IEEE Statistical Signal Processing Workshop (SSP), June 2011, pp. 81–84.
[27] M. Jiang, W. Liu, and Y. Li, "A general quaternion-valued gradient operator and its applications to computational fluid dynamics and adaptive beamforming," in 2014 19th International Conference on Digital Signal Processing (DSP), Aug 2014, pp. 821–826.
[28] D. Xu, Y. Xia, and D. P. Mandic, "Optimization in quaternion dynamic systems: Gradient, Hessian, and learning algorithms," IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no. 2, pp. 249–261, Feb 2016.
[29] E. B. Dam, M. Koch, and M. Lillholm, "Quaternions, interpolation and animation – DIKU-TR-98/5," Tech. Rep., Department of Computer Science, University of Copenhagen, 1998. Available: http://web.mit.edu/2.998/www/QuaternionReport1.pdf.
[30] P. R. Girard, Quaternions, Clifford Algebras and Relativistic Physics, Birkhäuser Basel, 2007.
[31] P. S. R. Diniz, Adaptive Filtering: Algorithms and Practical Implementation, Springer US, 4th edition, 2013.
[32] S. Haykin and B. Widrow, Least-Mean-Square Adaptive Filters, John Wiley and Sons, Inc., 2003.
[33] E. Hitzer, "Multivector differential calculus," Advances in Applied Clifford Algebras, vol. 12, no. 2, pp. 135–182, 2002.
[34] E. Hitzer, "Algebraic foundations of split hypercomplex nonlinear adaptive filtering," Mathematical Methods in the Applied Sciences, vol. 36, no. 9, pp. 1042–1055, 2013.
[35] F. Seybold and U. Wössner, "Gaalet – a C++ expression template library for implementing geometric algebra," in 6th High-End Visualization Workshop, 2010.
[36] W. B. Lopes, "OpenGA – Open Source Geometric Algebra," https://openga.org, accessed: 2018-03-03.
[37] T. Yokonuma, Tensor Spaces and Exterior Algebra, Translations of Mathematical Monographs, American Mathematical Society, 1992.
[38] M. Rupp and S. Schwarz, "A tensor LMS algorithm," in International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015, pp. 3347–3351.
[39] M. Rupp and S. Schwarz, "Gradient-based approaches to learn tensor products," in European Signal Processing Conference (EUSIPCO), 2015, pp. 2486–2490.