
Lattice Algebra: Theory and Applications

Prof. Gerhard Ritter, CISE Department, University of Florida

Overview

Part I: Theory
• Pertinent algebraic structures
• Lattice algebra with focus on ℓ-vector spaces
• Concluding remarks and questions

Part II: Applications
• LNNs
• Matrix-based LAMs
• Dendritic LAMs
• Concluding remarks and questions

History

• Lattice theory in image processing and AI
• Image algebra, minimax algebra, and HPC

A pertinent question: Why is (−1) · (−1) = 1?

Algebraic Structures

Some basic background. Let G be a set with a binary operation ◦. Then:
1. (G, ◦) is a groupoid;
2. if x ◦ (y ◦ z) = (x ◦ y) ◦ z ∀ x, y, z ∈ G, then (G, ◦) is a semigroup;
3. if G is a semigroup and G has an identity element, then G is a monoid;
4. if G is a monoid and every element of G has an inverse, then G is a group;
5. if G is a group and x ◦ y = y ◦ x ∀ x, y ∈ G, then G is an abelian group.

Algebraic Structures

Why are groups important?

Theorem. If (X, ·) is a group and a, b ∈ X, then the linear equations a · x = b and y · a = b have unique solutions in X.

Remark: Note that the solutions x = a⁻¹ · b and y = b · a⁻¹ need not be the same unless X is abelian.

Algebraic Structures

Sets with Multiple Operations

Suppose that X is a set with two binary operations ⋆ and ◦. The operation ◦ is said to be left distributive with respect to ⋆ if

x ◦ (y ⋆ z) = (x ◦ y) ⋆ (x ◦ z) ∀ x, y, z ∈ X   (1)

and right distributive if

(y ⋆ z) ◦ x = (y ◦ x) ⋆ (z ◦ x) ∀ x, y, z ∈ X.   (2)

For example, division on R⁺ is right distributive but not left distributive over addition: (y + z)/x = (y/x) + (z/x), but x/(y + z) ≠ (x/y) + (x/z). When both equations hold, we simply say that ◦ is distributive with respect to ⋆.

Algebraic Structures

Definition: A ring (R, +, ·) is a set R together with two binary operations + and · of addition and multiplication, respectively, defined on R such that the following are satisfied:
1. (R, +) is an abelian group.
2. (R, ·) is a semigroup.
3. ∀ a, b, c ∈ R, a · (b + c) = (a · b) + (a · c) and (a + b) · c = (a · c) + (b · c).

If condition 1 in this definition is weakened to "(R, +) is a commutative semigroup," then R is called a semiring.

Algebraic Structures

If (R, +, ·) is a ring, we let 0 denote the additive identity and 1 the multiplicative identity (if it exists). If R satisfies the property
• For every nonzero a ∈ R there is an element in R, denoted by a⁻¹, such that a · a⁻¹ = a⁻¹ · a = 1 (i.e., (R \ {0}, ·) is a group),
then R is called a division ring. A commutative division ring is called a field.

You should now be able to prove that (−1) · (−1) = 1.
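A possible derivation, using only the ring axioms just given (my sketch; the speaker may have a different proof in mind):

```latex
% Sketch: (-1)(-1) = 1 holds in any ring with multiplicative identity 1.
% First note a.0 = 0: a.0 = a.(0+0) = a.0 + a.0, then cancel a.0.
\begin{align*}
0 &= (-1)\cdot 0                    && \text{since } a \cdot 0 = 0 \\
  &= (-1)\cdot\bigl(1 + (-1)\bigr)  && \text{since } 1 + (-1) = 0 \\
  &= (-1)\cdot 1 + (-1)\cdot(-1)    && \text{left distributivity} \\
  &= -1 + (-1)\cdot(-1)             && \text{since } a \cdot 1 = a
\end{align*}
% Adding 1 to both sides yields (-1)\cdot(-1) = 1.
```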

Partially Ordered Sets

Definition: A relation ≼ on a set X is called a partial order on X if and only if for every x, y, z ∈ X the following three conditions are satisfied:
1. x ≼ x (reflexive)
2. x ≼ y and y ≼ x ⇒ x = y (antisymmetric)
3. x ≼ y and y ≼ z ⇒ x ≼ z (transitive)

The inverse relation of ≼, denoted by ≽, is also a partial order on X.

Definition: The dual of a partially ordered set X is the partially ordered set X∗ defined by the inverse partial order relation on the same elements. Since (X∗)∗ = X, this terminology is legitimate.

Lattices

Definition: A lattice is a partially ordered set L such that for any two elements x, y ∈ L, glb{x, y} and lub{x, y} exist. If L is a lattice, then we define x ∧ y = glb{x, y} and x ∨ y = lub{x, y}.

• A sublattice of a lattice L is a subset X of L such that for each pair x, y ∈ X, we have that x ∧ y ∈ X and x ∨ y ∈ X.
• A lattice L is said to be complete if and only if for each of its subsets X, inf X and sup X exist. We define the symbols ⋀X = inf X and ⋁X = sup X.
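As a concrete illustration (mine, not from the slides), the power set of a finite set ordered by inclusion is a complete lattice, with glb = intersection and lub = union:

```python
# Illustration (not from the slides): the power set of {1, 2, 3}, ordered by
# inclusion, is a complete lattice with x ^ y = intersection, x v y = union.
from itertools import chain, combinations

def power_set(s):
    """All subsets of s, as frozensets."""
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

L = power_set({1, 2, 3})
x, y = frozenset({1, 2}), frozenset({2, 3})

meet, join = x & y, x | y               # glb{x, y} and lub{x, y}
assert meet in L and join in L          # L is closed under meet and join
assert meet <= x and meet <= y          # the meet is a lower bound
assert x <= join and y <= join          # the join is an upper bound
print(sorted(meet), sorted(join))       # [2] [1, 2, 3]
```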

sℓ- and ℓ-Groups

Suppose (R, ◦) is a semigroup or group and R is a lattice (R, ∨, ∧) or a semilattice (R, ∨).

Definition: A group translation ψ is a function ψ : R → R of the form ψ(x) = a ◦ x ◦ b, where a, b are constants. The translation ψ is said to be isotone if and only if x ≼ y ⇒ ψ(x) ≼ ψ(y).

Note that a group translation is a unary operation.

sℓ-Semigroups and ℓ-Groups

Definition: An ℓ-group (ℓ-semigroup) is of the form (R, ∨, ∧, +), where (R, +) is a group (semigroup), (R, ∨, ∧) is a lattice, and every group translation is isotone. If R is just a semilattice, i.e., (R, ∨) or (R, ∧), in the definition, then (R, ∨, +) (or (R, ∧, +)) is called an sℓ-group if (R, +) is a group and an sℓ-semigroup if (R, +) is a semigroup.

sℓ-Vector Spaces and ℓ-Vector Spaces

Definition: An sℓ-vector space V over the sℓ-group (or sℓ-monoid) (R, ∨, +), denoted by V(R), is a semilattice (V, ∨) together with an operation called scalar addition of each element of V by an element of R on the left, such that ∀ α, β ∈ R and v, w ∈ V, the following conditions are satisfied:
1. α + v ∈ V
2. α + (β + v) = (α + β) + v
3. (α ∨ β) + v = (α + v) ∨ (β + v)
4. α + (v ∨ w) = (α + v) ∨ (α + w)
5. 0 + v = v

sℓ-Vector Spaces and ℓ-Vector Spaces

The sℓ-vector space is also called a max vector space, denoted by ∨-vector space. Using the duals (R, ∧, +) and (V, ∧), and replacing conditions (3) and (4) by
3'. (α ∧ β) + v = (α + v) ∧ (β + v)
4'. α + (v ∧ w) = (α + v) ∧ (α + w),
we obtain the min vector space, denoted by ∧-vector space.

Note also that replacing ∨ (or ∧) by + and + by ·, we obtain the usual axioms defining a vector space.

sℓ-Vector Spaces and ℓ-Vector Spaces

Definition: If we replace the semilattice V by a lattice (V, ∨, ∧), replace the sℓ-group (or sℓ-semigroup) R by an ℓ-group (or ℓ-semigroup) (R, ∨, ∧, +), and conditions 1 through 5 as well as 3' and 4' are all satisfied, then V(R) is called an ℓ-vector space.

Remark: The lattice vector space definitions given above are drastically different from the vector lattices postulated by Birkhoff and others! A vector lattice is simply a partially ordered real vector space satisfying the isotone property.

Lattice Algebra and Linear Algebra

The theory of ℓ-groups, sℓ-groups, sℓ-semigroups, ℓ-vector spaces, etc. provides an extremely rich setting in which many concepts from linear algebra can be transferred to the lattice domain via analogies. ℓ-vector spaces are a good example of such an analogy. The next slides will present further examples of such analogies.

Lattice Algebra and Linear Algebra

Ring: (R, +, ·)
• a · 0 = 0 · a = 0
• a + 0 = 0 + a = a
• a · 1 = 1 · a = a
• a · (b + c) = (a · b) + (a · c)

Semi-Ring or sℓ-Group: (R_{−∞}, ∨, +)
• a + (−∞) = (−∞) + a = −∞
• a ∨ (−∞) = (−∞) ∨ a = a
• a + 0 = 0 + a = a
• a + (b ∨ c) = (a + b) ∨ (a + c)
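A quick numeric check of the analogy (my sketch, not the speaker's): max plays the role of ring addition, + plays the role of ring multiplication, and −∞ acts as the null element:

```python
# Sketch (not from the slides): the semiring (R ∪ {-inf}, max, +) in Python.
NEG_INF = float('-inf')
a, b, c = 3.0, -1.0, 7.0

assert a + NEG_INF == NEG_INF              # a + (-inf) = -inf  (like a * 0 = 0)
assert max(a, NEG_INF) == a                # a v (-inf) = a     (like a + 0 = a)
assert a + 0 == a                          # a + 0 = a          (like a * 1 = a)
assert a + max(b, c) == max(a + b, a + c)  # + distributes over v
```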

Lattice Algebra and Linear Algebra

• Since (R_{−∞}, ∨, +)∗ = (R_{∞}, ∧, +∗), the structure (R_{∞}, ∧, +∗) is also an sℓ-semigroup (with +∗ = +) isomorphic to (R_{−∞}, ∨, +).
• Defining a +∗ b = a + b ∀ a, b ∈ R_{−∞} and

−∞ + ∞ = ∞ + (−∞) = −∞,
−∞ +∗ ∞ = ∞ +∗ (−∞) = ∞,

we can combine (R_{−∞}, ∨, +) and (R_{∞}, ∧, +∗) into one well-defined structure (R_{±∞}, ∨, ∧, +, +∗).

Lattice Algebra and Linear Algebra

• The structure (R, ∨, ∧, +) is an ℓ-group.

• The structures (R_{−∞}, ∨, ∧, +) and (R_{∞}, ∨, ∧, +) are ℓ-semigroups.

• The structure (R_{±∞}, ∨, ∧) is a bounded lattice.
• The structure (R_{±∞}, ∨, ∧, +, +∗) is called a bounded lattice ordered group, or blog, since the underlying structure (R, +) is a group.

Matrix Addition and Multiplication

Suppose A = (a_{ij})_{m×n} and B = (b_{ij})_{m×n} with entries in R_{±∞}. Then
• C = A ∨ B is defined by setting c_{ij} = a_{ij} ∨ b_{ij}, and
• C = A ∧ B is defined by setting c_{ij} = a_{ij} ∧ b_{ij}.

If A = (a_{ij})_{m×p} and B = (b_{ij})_{p×n}, then
• C = A ⩒ B is defined by setting c_{ij} = ⋁_{k=1}^{p} (a_{ik} + b_{kj}), and
• C = A ⩑ B is defined by setting c_{ij} = ⋀_{k=1}^{p} (a_{ik} +∗ b_{kj}).
• ⩒ and ⩑ are called the max and min products, respectively.
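A minimal Python sketch (not the speaker's code) of the two products. It assumes finite entries, sidestepping the −∞ + ∞ cases that + and +∗ resolve differently:

```python
# Sketch: max product and min product of matrices over the (max, +) and
# (min, +) semirings; entries are assumed finite here.
def max_product(A, B):
    """c_ij = max_k (a_ik + b_kj)."""
    return [[max(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def min_product(A, B):
    """c_ij = min_k (a_ik + b_kj)."""
    return [[min(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[0.0, 3.0], [-1.0, 2.0]]
B = [[1.0, 0.0], [5.0, -2.0]]
print(max_product(A, B))   # [[8.0, 1.0], [7.0, 0.0]]
print(min_product(A, B))   # [[1.0, 0.0], [0.0, -1.0]]
```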

Zero and Identity Matrices

For the semiring (M_{n×n}(R_{−∞}), ∨, ⩒), the null matrix is the n×n matrix with every entry equal to −∞:

Φ = [ −∞  ⋯  −∞ ]
    [  ⋮   ⋱   ⋮ ]
    [ −∞  ⋯  −∞ ]

Zero and Identity Matrices

For the semiring (M_{n×n}(R_{−∞}), ∨, ⩒), the identity matrix has 0 along the diagonal and −∞ elsewhere:

I = [  0   −∞  ⋯  −∞ ]
    [ −∞    0  ⋯  −∞ ]
    [  ⋮    ⋮   ⋱   ⋮ ]
    [ −∞  −∞  ⋯    0 ]

Matrix Properties

We have, ∀ A, B, C ∈ M_{n×n}(R_{−∞}):

A ⩒ (B ∨ C) = (A ⩒ B) ∨ (A ⩒ C)
I ⩒ A = A ⩒ I = A
A ∨ Φ = Φ ∨ A = A
A ⩒ Φ = Φ ⩒ A = Φ

Analogous laws hold for the semiring (M_{n×n}(R_{∞}), ∧, ⩑).

Conjugation

If r ∈ R±∞, then the additive conjugate of r is the unique element r∗ defined by

r∗ = ⎧ −r   if r ∈ R
     ⎨ −∞   if r = ∞
     ⎩  ∞   if r = −∞

• (r∗)∗ = r and r ∧ s = (r∗ ∨ s∗)∗
• It follows that r ∧ s = −(−r ∨ −s) and r ∨ s = −(−r ∧ −s) for r, s ∈ R.
• A ∧ B = (A∗ ∨ B∗)∗ and A ⩑ B = (B∗ ⩒ A∗)∗, where A = (a_{ij}) and A∗ = (a∗_{ji}).
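A small Python check (mine) of the conjugation rules, including the infinite cases:

```python
# Sketch: the additive conjugate r* on R ∪ {±inf} and the duality
# r ^ s = (r* v s*)*.
INF, NEG_INF = float('inf'), float('-inf')

def conj(r):
    """-r for finite r; conjugation swaps the two infinities."""
    if r == INF:
        return NEG_INF
    if r == NEG_INF:
        return INF
    return -r

for r in (3.0, NEG_INF, INF):
    assert conj(conj(r)) == r                             # (r*)* = r
    for s in (-2.0, 5.0, INF, NEG_INF):
        assert min(r, s) == conj(max(conj(r), conj(s)))   # r ^ s = (r* v s*)*
```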

sℓ-Sums

Definition: If X = {x^1,..., x^k} ⊂ R^n_{−∞} (or X ⊂ R^n_{∞}), then x ∈ R^n_{−∞} (or x ∈ R^n_{∞}) is said to be a linear max (min) combination of X if x can be written as

x = ⋁_{ξ=1}^{k} (α_ξ + x^ξ)   (or x = ⋀_{ξ=1}^{k} (α_ξ + x^ξ)),

where α_ξ ∈ R_{−∞} (or α_ξ ∈ R_{∞}) and x^ξ ∈ R^n_{−∞} (or x^ξ ∈ R^n_{∞}).

The expressions ⋁_{ξ=1}^{k} (α_ξ + x^ξ) and ⋀_{ξ=1}^{k} (α_ξ + x^ξ) are called a linear max sum and a linear min sum, respectively.

sℓ-Independence

Definition: Given the sℓ-vector space (R^n_{−∞}, ∨) over (R_{−∞}, ∨, +), X = {x^1,..., x^k} ⊂ R^n_{−∞}, and x ∈ R^n_{−∞}, then x is said to be max dependent or sℓ-dependent on X ⇔ x = ⋁_{ξ=1}^{k} (α_ξ + x^ξ) for some linear max sum of vectors from X. If x is not max dependent on X, then x is said to be max independent of X. The set X is sℓ-independent or max independent ⇔ ∀ ξ ∈ {1,...,k}, x^ξ is sℓ-independent of X \ {x^ξ}.
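A Python sketch of testing max dependence. It relies on a standard fact of minimax algebra (assumed here, not stated on the slides): if x is max dependent on X at all, then the principal coefficients α_ξ = ⋀_i (x_i − x^ξ_i) already witness it. Finite entries are assumed:

```python
# Sketch: is x a linear max combination of the vectors in X?
def is_max_dependent(x, X):
    n = len(x)
    # principal coefficient for each prototype: min_i (x_i - x^xi_i)
    alphas = [min(x[i] - v[i] for i in range(n)) for v in X]
    # reconstruct x as the max sum with these coefficients
    recon = [max(a + v[i] for a, v in zip(alphas, X)) for i in range(n)]
    return recon == x

x1, x2 = [0.0, 2.0], [1.0, 0.0]
print(is_max_dependent([3.0, 5.0], [x1, x2]))   # True: (3,5) = 3 + x1
print(is_max_dependent([0.0, 0.0], [x1]))       # False: no shift of x1 works
```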

sℓ-Subspaces and Spans

Definition: If X ⊂ R^n_{−∞}, then (X, ∨) is an sℓ-subspace of (R^n_{−∞}, ∨) ⇔ the following are satisfied:
1. if x, y ∈ X, then x ∨ y ∈ X
2. α + x ∈ X ∀ α ∈ R_{−∞} and x ∈ X.

Definition: If X ⊂ R^n_{−∞}, then the sℓ-span of X is the set S(X) = {x ∈ R^n_{−∞} : x is max dependent on X}.

sℓ-Spans and Bases

Remark: If x ∈ S(X), then α + x ∈ S(X), and x ∨ y ∈ S(X) ∀ x, y ∈ S(X). Thus S(X) is an sℓ-vector subspace of R^n_{−∞}.

If S(X) = R^n_{−∞}, then we say that X spans R^n_{−∞} and X is called a set of generators for R^n_{−∞}.

Definition: A basis for an sℓ-vector space (V, ∨) (or (V, ∧)) is a set of sℓ-independent vectors which spans V.

sℓ-Independence

Example: The set X = {(0, −∞), (−∞, 0)} spans R^2_{−∞} and is sℓ-independent. Thus X is a basis for R^2_{−∞}.

Question: What is a basis for R²?
Question: If a ∈ R, what is the span of X = {(0, a), (−∞, 0)} in R^2_{−∞}?
Question: What is the span of X = {(1, 0), (0, 1)} in R^2_{−∞}?

ℓ-Vector Spaces

Most of what we have said for sℓ-vector spaces also holds for ℓ-vector spaces with the appropriate changes. Thus, for (R^n_{±∞}, ∨, ∧) we have:
• If {x^1,..., x^k} ⊂ R^n_{±∞}, then a linear minimax combination of vectors from the set {x^1,..., x^k} is any vector x ∈ R^n_{±∞} of the form

x = S(x^1,..., x^k) = ⋁_{j∈J} ⋀_{ξ=1}^{k} (a_{ξj} + x^ξ),   (3)

where J is a finite set of indices and a_{ξj} ∈ R_{±∞} ∀ j ∈ J and ∀ ξ = 1,...,k.

ℓ-Vector Spaces

• The expression S(x^1,..., x^k) = ⋁_{j∈J} ⋀_{ξ=1}^{k} (a_{ξj} + x^ξ) is called a linear minimax sum or an ℓ-sum.
• Similarly, we can combine the structures (M_{n×n}(R_{±∞}), ∨, ⩒) and (M_{n×n}(R_{±∞}), ∧, ⩑) to obtain the blog (M_{n×n}(R_{±∞}), ∨, ∧, ⩒, ⩑) in order to obtain a coherent minimax theory for matrices.
• Many of the concepts found in the corresponding linear domains can then be realized in these lattice structures via appropriate analogies.

ℓ-Transforms

Definition: A linear max transform or sℓ-transform of an sℓ-vector space V(R) into an sℓ-vector space W(R) is a function L : V → W which satisfies the condition

L((α + v) ∨ (β + u)) = (α + L(v)) ∨ (β + L(u))

for all scalars α, β ∈ R and all v, u ∈ V. A linear min transform obeys

L((α + v) ∧ (β + u)) = (α + L(v)) ∧ (β + L(u)),

and a linear minimax transform of an ℓ-vector space V(R) into an ℓ-vector space W(R) obeys both of the equations.

sℓ-Transforms and Polynomials

• Just as in linear algebra, it is easy to prove that any m × n matrix M with entries from R_{−∞} (or R_{∞}) corresponds to a linear max (or min) transform from R^n_{−∞} into R^m_{−∞} (or R^n_{∞} into R^m_{∞}). Simply define

L_M(x) = M ⩒ x ∀ x ∈ R^n_{−∞}.

• The subject of ℓ- and sℓ-polynomials also bears many resemblances to the theory of polynomials and awaits further exploration.

sℓ-Polynomials

Definition: A max polynomial of degree n with coefficients in the appropriate semiring R in the indeterminate x is of the form

p(x) = ⋁_{i=0}^{∞} (a_i + i·x),

where a_i = −∞ for all but a finite number of i.
• If a_i ≠ −∞ for some i > 0, then the largest such i is called the degree of p(x). If no such i > 0 exists, then the degree of p(x) is zero.
• For min polynomials, simply replace ⋁ by ⋀. Combining the two notions results in minimax polynomials.
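A Python sketch (mine) of evaluating a max polynomial; the result is a convex, piecewise-linear function of x:

```python
# Sketch: p(x) = max over i of (a_i + i*x), with a_i = -inf for all but
# finitely many i.
NEG_INF = float('-inf')

def max_poly(coeffs):
    """coeffs[i] = a_i; returns the function x -> max_i (a_i + i*x)."""
    def p(x):
        return max(a + i * x for i, a in enumerate(coeffs) if a != NEG_INF)
    return p

# p(x) = (1 + 0x) v (0 + 1x) v (-2 + 2x): a max polynomial of degree 2
p = max_poly([1.0, 0.0, -2.0])
print(p(0.0), p(1.0), p(3.0))   # 1.0 1.0 4.0
```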

Discussion and Questions

1. Many items have not been discussed, e.g., eigenvalues and eigenvectors.
2. Applications have not been discussed. We will discuss some in the second talk.
3. Questions?

Thank you!

Associative Memories (AMs)

Suppose X = {x^1,..., x^k} ⊂ R^n and Y = {y^1,..., y^k} ⊂ R^m.
• A function M : R^n → R^m with the property that M(x^ξ) = y^ξ ∀ ξ = 1,...,k is called an associative memory that identifies X with Y.
• If X = Y, then M is called an auto-associative memory, and if X ≠ Y, then M is called a hetero-associative memory.
• M is said to be robust in the presence of noise if M(x̃^ξ) = y^ξ for every corrupted version x̃^ξ of the prototype input patterns x^ξ.

Robustness in the Presence of Noise

• We say that M is robust in the presence of noise bounded by n = (n_1, n_2,..., n_n)′ if and only if whenever x represents a distorted version of x^ξ with the property that |x − x^ξ| ≤ n, then M(x) = y^ξ.

Remark: In this theory, it may be possible to have n_i = ∞ for some i if that is desirable.

• The concept of the noise bound can be generalized to be bounded by the set {n^1, n^2,..., n^k}, with n being replaced by n^ξ in the above inequality, so that |x − x^ξ| ≤ n^ξ.

Matrix-Based AMs

• The Steinbuch Lernmatrix (1961): auto- and hetero-associative memories.
• The classical Hopfield net is an example of an auto-associative memory.
• The Kohonen correlation matrix memory is an example of a hetero-associative memory.
• The lattice-based correlation matrix memories W_{XY} and M_{XY}.

Lattice-based Associative Memories

• For a pair (X, Y) of pattern associations, the two canonical lattice memories W_{XY} and M_{XY} are defined by:

w_{ij} = ⋀_{ξ=1}^{k} (y^ξ_i − x^ξ_j)   and   m_{ij} = ⋁_{ξ=1}^{k} (y^ξ_i − x^ξ_j).

• Fact. If X = Y, then

W_{XX} ⩒ x^ξ = x^ξ = M_{XX} ⩑ x^ξ   ∀ ξ = 1,...,k.
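A Python sketch (not the speaker's code) verifying this perfect-recall fact for W_{XX} on a small example:

```python
# Sketch: the auto-associative memory W_XX with w_ij = min_xi (x^xi_i - x^xi_j)
# and its perfect-recall property W_XX (max product) x^xi = x^xi.
def build_W(X):
    n = len(X[0])
    return [[min(x[i] - x[j] for x in X) for j in range(n)] for i in range(n)]

def max_prod_vec(W, v):
    """(W max-product v)_i = max_j (w_ij + v_j)."""
    return [max(W[i][j] + v[j] for j in range(len(v))) for i in range(len(W))]

X = [[1.0, 4.0, 2.0],
     [0.0, 1.0, 3.0]]
W = build_W(X)
for x in X:
    assert max_prod_vec(W, x) == x   # every stored pattern is a fixed point
print("perfect recall verified")
```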

Lattice-based Associative Memories

We have:
1. W_{XY} = Y ⩑ X∗ and M_{XY} = Y ⩒ X∗.
2. W_{XY} = (X ⩒ Y∗)∗ = (M_{YX})∗ and M_{XY} = (X ⩑ Y∗)∗ = (W_{YX})∗.
3. x^ξ → {W_{XY} | M_{XY}} → y^ξ → {M_{YX} | W_{YX}} → x^ξ.
4. This provides for a bidirectional lattice-based associative memory (LBAM).

Behavior of W_{XX} in Presence of Random Noise

Top row to bottom row patterns: Original; Noisy; Recalled. The output of W_{XX} appears shifted towards white pixel values.

Behavior of M_{XX} in Presence of Random Noise

Top row to bottom row patterns: Original; Noisy; Recalled. The output of M_{XX} appears shifted towards black pixel values.

Lattice Theory & Applications – p. 42/87 2 Behavior of WXX and MXX in R 1 2 2 The orbits of WXX and MXX for X = x , x ⊂ R : 

F(X) = set of fixed points of W_{XX}.

The data polyhedron B(v, u) ∩ F(X)

• Let v^ℓ denote the ℓth column of W_{XX} and u^ℓ the ℓth column of M_{XX}.
• Set u = ⋁_{ξ=1}^{k} x^ξ and v = ⋀_{ξ=1}^{k} x^ξ.
• Set w^j = u_j + v^j and m^j = v_j + u^j.
• Define W = {w^1,..., w^n} and M = {m^1,..., m^n}.
• W is affinely independent whenever w^ℓ ≠ w^j ∀ ℓ ≠ j. Similarly for M.

The data polyhedron B(v, u) ∩ F(X)

• Let B(v, u) denote the hyperbox determined by {v, u}.
• We obtain X ⊂ C(X) ⊂ B(v, u) ∩ F(X).
• The vertices of the polyhedron B(v, u) ∩ F(X) are the elements of W ∪ M ∪ {v, u}.

The data polyhedron B(v, u) ∩ F(X)

[Figure: the prototype points x^1,..., x^6 in R², the hyperbox corners v and u, and the vertices w^1, w^2, m^1, m^2 of the polyhedron.]

The fixed point set F(X) is the infinite strip bounded by the two lines of slope 1.

Rationale for Dendritic Computing

• The number of synapses on a single neuron in the cerebral cortex ranges between 500 and 200,000.

• A neuron in the cortex typically sends messages to approximately 10⁴ other neurons.

• Dendrites make up the largest component in both surface area and volume of the brain.

• Dendrites of cortical neurons make up > 50% of the neuron’s membrane.

Rationale for Dendritic Computing

• Recent research results demonstrate that the dynamic interaction of inputs in dendrites containing voltage-sensitive ion channels makes them capable of realizing nonlinear interactions, logical operations, and possibly other local domain computations (Poggio, Koch, Shepherd, Rall, Segev, Perkel, et al.).
• Based on their experiments, these researchers make the case that it is the dendrites, and not the neural cell bodies, that are the basic computational units of the brain.

Our LNNs Are Based On Biological Neurons

Figure 1: Simplified sketch of the processes of a biological neuron.

Dendritic Computation: Graphical Model

w^ℓ_{ijk} = synaptic weight from the neuron N_i to the kth dendrite of M_j; ℓ = 0 for inhibition and ℓ = 1 for excitation.

SLLP (with Dendritic Structures)

Graphical representation of a single-layer lattice-based perceptron with dendritic structure.

Dendritic LNN Model

• In the dendritic ANN model, a neuron M_j has K_j dendrites. A given dendrite D_{jk} (k ∈ {1,...,K_j}) of M_j receives inputs from axonal fibers of neurons N_1,...,N_n and computes a value τ^j_k.
• The neuron M_j computes a value τ^j which corresponds to the maximum (or minimum) of the values τ^j_1,...,τ^j_{K_j} received from its dendrites.

Dendritic Computation: Mathematical Model

The computation performed by the kth dendrite for input x = (x_1,...,x_n)′ ∈ R^n is given by

τ^j_k(x) = p_{jk} ⋀_{i∈I(k)} ⋀_{ℓ∈L(i)} (−1)^{1−ℓ} (x_i + w^ℓ_{ijk}),

where
• x_i – value of neuron N_i;
• I(k) ⊆ {1,...,n} – set of all input neurons with terminal fibers that synapse on dendrite D_{jk};

• L(i) ⊆ {0, 1} – set of terminal fibers of N_i that synapse on dendrite D_{jk};

• p_{jk} ∈ {−1, 1} – IPSC/EPSC of D_{jk}.
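A Python sketch (mine, with hypothetical weights) of a single dendrite's computation. With one excitatory and one inhibitory fiber per input, the response is non-negative exactly on a box; here, an interval [a, b] in one input:

```python
# Sketch: tau_k(x) = p * min over i in I(k), l in L(i) of
#   (-1)**(1 - l) * (x_i + w[i][l]),  l = 1 excitatory, l = 0 inhibitory.
def dendrite_response(x, synapses, p=1):
    """synapses: dict i -> {l: weight w^l_i}; p in {-1, +1}."""
    terms = [(-1) ** (1 - l) * (x[i] + w)
             for i, fibers in synapses.items()
             for l, w in fibers.items()]
    return p * min(terms)

# Recognize the interval [a, b] = [2, 5] on input x_0: excitatory weight
# w^1 = -a, inhibitory weight w^0 = -b (as in the fuzzy SLLP slides later).
syn = {0: {1: -2.0, 0: -5.0}}
for x0 in (1.0, 3.0, 6.0):
    print(x0, dendrite_response([x0], syn))   # >= 0 only for x0 in [2, 5]
```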

Dendritic Computation: Mathematical Model

• The value τ^j_k(x) is passed to the cell body, and the state of M_j is a function of the input received from all its dendritic postsynaptic results. The total value received by M_j is given by

τ^j(x) = p_j ⋀_{k=1}^{K_j} τ^j_k(x).

The Capabilities of an SLLP

• An SLLP can distinguish between any given number of pattern classes to within any desired degree of accuracy ε > 0.

• More precisely, suppose X_1, X_2,..., X_m denotes a collection of disjoint compact subsets of R^n.
• For each p ∈ {1,...,m}, define
Y_p = ⋃_{j=1, j≠p}^{m} X_j,
ε_p = d(X_p, Y_p) > 0,
ε_0 = (1/2) min{ε_1,...,ε_m}.
• As the following theorem shows, a given pattern x ∈ R^n will be recognised correctly as belonging to class C_p whenever x ∈ X_p.

The Capabilities of an SLLP

• Theorem. If {X_1, X_2,...,X_m} is a collection of disjoint compact subsets of R^n and ε a positive number with ε < ε_0, then there exists a single-layer lattice-based perceptron that assigns each point x ∈ R^n to class C_j whenever x ∈ X_j and j ∈ {1,...,m}, and to class C_0 = ¬⋃_{j=1}^{m} C_j whenever d(x, X_i) > ε ∀ i = 1,...,m. Furthermore, no point x ∈ R^n is assigned to more than one class.

Illustration of the Theorem in R²

Any point in the set X_j is identified with class C_j; points within the ε-bands may or may not be classified as belonging to C_j; points outside the ε-bands will not be associated with a class C_j ∀ j.

Learning in LNNs

• Early training methods were based on the proofs of the preceding theorems.
• All training algorithms involve the growth of axonal branches, computation of branch weights, and the creation of dendrites and synapses.
• The first training algorithm developed was based on elimination of foreign patterns from a given training set (min or intersection method).
• The second training algorithm was based on small region merging (max or union method).

Example of the two methods in R²

The two methods partition the pattern space R² in terms of intersection (a) and union (b), respectively.

SLLP Using Elimination vs. MLP

(a) SLLP: 3 dendrites, 9 axonal branches. (b) MLP: 13 hidden neurons and 2000 epochs.

SLLP Using Merging

During training, the SLLP grows 20 dendrites, 19 excitatory and 1 inhibitory (dashed).

Another Merging Example


Learning in LNNs

• L. Iancu developed a hybrid method using both merging and elimination. The method is reminiscent of the expansion-contraction method for hyperboxes developed by P. K. Simpson for training min-max neural networks, but it is distinctly different.
• L. Iancu also extended this learning method to Ritter's fuzzy SLLP.

Fuzzy LNNs

The Problem:

• Classify all points in the interval [a, b] ⊂ R as belonging to class C_1, and every point outside the interval [a − α, b + α] as having no relation to class C_1, where α > 0 is a specified fuzzy boundary parameter.
• For a point x ∈ [a − α, a] or x ∈ [b, b + α], we would like the output y(x) to be close to 1 when x is close to a or b, and y(x) to be close to 0 whenever x is close to a − α or b + α.

Fuzzy LNNs

Solution:

• Change the weights w^0_1 = −b and w^1_1 = −a found by one of the previous algorithms to v^0_1 = w^0_1/α − 1 and v^1_1 = w^1_1/α + 1, and use the input x/α instead of x.
• Use the activation function

f(z) = ⎧ 1  if z ≥ 1
       ⎨ z  if 0 ≤ z ≤ 1
       ⎩ 0  if z ≤ 0.
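A Python sketch (mine) of the resulting fuzzy membership for an interval [a, b] with boundary parameter α:

```python
# Sketch: fuzzy membership in [a, b] via rescaled weights and the ramp f.
def ramp(z):
    return max(0.0, min(1.0, z))

def fuzzy_membership(x, a, b, alpha):
    v1 = -a / alpha + 1            # v^1_1 = w^1_1 / alpha + 1
    v0 = -b / alpha - 1            # v^0_1 = w^0_1 / alpha - 1
    tau = min(x / alpha + v1, -(x / alpha + v0))   # dendritic min of the fibers
    return ramp(tau)

a, b, alpha = 2.0, 5.0, 1.0
for x in (0.5, 1.5, 3.0, 5.5, 6.5):
    print(x, fuzzy_membership(x, a, b, alpha))
# 0.5 -> 0.0, 1.5 -> 0.5, 3.0 -> 1.0, 5.5 -> 0.5, 6.5 -> 0.0
```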

Fuzzy LNNs

Computing fuzzy output values with an SLLP using the ramp activation function.

Learning in LNNs

Classifier                   Recognition
SLLP (elimination)           98.0%
Backpropagation              96%
Resilient Backpropagation    96.2%
Bayesian Classifier          96.8%
Fuzzy LNN                    100%

UC Irvine Ionosphere data set (a 2-class problem in R^34 with training set = 65% of the data set).

Learning in LNNs

Classifier                        Recognition
Fuzzy SLLP (merge/elimination)    98.7%
Backpropagation                   95.2%
Fuzzy Min-Max NN                  97.3%
Bayesian Classifier               97.3%
Fisher Ratios Discrimination      96.0%
Ho-Kashyap                        97.3%

Fisher's Iris data set (a 3-class problem in R^4 with training set = 50% of the data set).

Learning in LNNs

• A. Barmpoutis extended the elimination method to arbitrary orthonormal basis settings.
• A dynamic backpropagation method is currently under development.

Learning in LNNs

In Barmpoutis’s approach, the equation

τ^j_k(x) = p_{jk} ⋀_{i∈I(k)} ⋀_{ℓ∈L(i)} (−1)^{1−ℓ} (x_i + w^ℓ_{ijk})

is replaced by

τ^j_k(x) = p_{jk} ⋀_{i∈I(k)} ⋀_{ℓ∈L(i)} (−1)^{1−ℓ} ((Rx)_i + w^ℓ_{ijk}),

where R is a rotation matrix obtained in the learning process.

LNNs Employing an Orthonormal Basis

Left: Maximal hyperbox for elimination in the standard basis for Rn. Right: Maximal hyperbox for elimination in another orthonormal basis for Rn.

Dendritic Model of an Associative Memory

• X = {x^1,..., x^k} ⊂ R^n.

• n input neurons N_1,...,N_n accepting input x = (x_1,...,x_n)′ ∈ R^n, where x_i → N_i.
• One hidden layer containing k neurons H_1,...,H_k.

• Each neuron H_j has exactly one dendrite, which contains the synaptic sites of exactly two terminal axonal fibers of N_i for i = 1,...,n.

• The weights of the two terminal fibers of N_i making contact with the dendrite of H_j are denoted by w^ℓ_{ij}, with ℓ = 0, 1.

Input and Hidden Layer Neural Connection

Every input neuron connects to the dendrite of each hidden neuron with two axonal fibers, one excitatory and the other inhibitory.

Computation at the Hidden Layer

• For input x ∈ R^n, the dendrite of H_j computes

τ^j(x) = ⋀_{i=1}^{n} ⋀_{ℓ=0}^{1} (−1)^{1−ℓ} (x_i + w^ℓ_{ij}).

• The state of neuron H_j is determined by the hard-limiter activation function

f(z) = ⎧  0  if z ≥ 0
       ⎩ −∞  if z < 0.

• The output of H_j is f(τ^j(x)).

• The output flows along the axon of H_j and its axonal fibers to the m output neurons M_1,..., M_m.

Computation at the Output Layer

• Each output neuron M_h, h = 1,...,m, has exactly one dendrite.

• Each hidden neuron H_j (j = 1,...,k) has exactly one excitatory axonal fiber terminating on the dendrite of M_h.
• The synaptic weight of the excitatory axonal fiber of H_j terminating on the dendrite of M_h is preset as v_{jh} = y^j_h for j = 1,...,k; h = 1,...,m.
• The computation performed by M_h is τ^h(q) = ⋁_{j=1}^{k} (q_j + v_{jh}), where q_j = f(τ^j(x)) denotes the output of H_j.
• The activation function for each output neuron M_h is the identity function g(z) = z.

Dendritic Model of an Associative Memory

Topology of the dendritic associative memory based on the dendritic model. The network is fully connected.

Computation of the Weights w^ℓ_{ij}

• Compute d(x^ξ, x^γ) = max{|x^ξ_i − x^γ_i| : i = 1,...,n}.
• Choose a noise parameter α > 0 such that α < (1/2) min{d(x^ξ, x^γ) : ξ < γ, ξ, γ ∈ {1,...,k}}.
• Set

w^ℓ_{ij} = ⎧ −x^j_i − α  if ℓ = 1
           ⎩ −x^j_i + α  if ℓ = 0.

• Under these conditions, given input x ∈ R^n, the output y = (y_1,...,y_m)′ from the output neurons will be y = (y^j_1,...,y^j_m)′ = y^j ⇔ x ∈ B^j, where B^j = {(x_1,...,x_n)′ ∈ R^n : x^j_i − α ≤ x_i ≤ x^j_i + α, i = 1,...,n}.
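A Python sketch (not the speaker's code) of the end-to-end recall. For clarity it implements the hidden layer's box condition q_j = 0 ⇔ x ∈ B^j directly, rather than the fiber-level arithmetic:

```python
# Sketch: dendritic hetero-associative recall. Hidden neuron H_j fires
# (q_j = 0) iff x lies within alpha of x^j in every coordinate; the output
# layer computes tau_h = max_j (q_j + v_jh) with v_jh = y^j_h.
NEG_INF = float('-inf')

def recall(x, X, Y, alpha):
    q = [0.0 if max(abs(xi - pi) for xi, pi in zip(x, p)) <= alpha else NEG_INF
         for p in X]
    return [max(qj + y[h] for qj, y in zip(q, Y)) for h in range(len(Y[0]))]

X = [[0.0, 0.0], [10.0, 10.0]]    # stored input patterns x^1, x^2
Y = [[1.0, 2.0], [3.0, 4.0]]      # associated output patterns y^1, y^2
print(recall([0.5, -0.5], X, Y, alpha=1.0))  # [1.0, 2.0]: noisy x^1 -> y^1
print(recall([5.0, 5.0], X, Y, alpha=1.0))   # [-inf, -inf]: input rejected
```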

Patterns that will be correctly associated

Any pattern residing in the box with center x^ξ will be identified as pattern x^ξ. The pattern x̃ will not be associated with any prototype pattern.

Patterns to Store

Top row represents the patterns x^1, x^2, and x^3, while the bottom row depicts the associated patterns y^1, y^2, and y^3. Here n = 2500 and m = 1500.

Recall of Corrupted Patterns

Distorting every vector component of x^j with random noise within the range [−α, α], with α = 75.2, results in perfect recall association.

Recall Failure when Noise Exceeds α

The memory rejects the patterns if they are corrupted with random noise exceeding α = 75.2.

Lattice Theory & Applications – p. 81/87 Increasing the Noise Tolerance • For each ξ = 1,...,k compute an allowable noise parameter αξ by setting 1 ξ γ αξ < 2 min{d(x , x ) : γ ∈ K(ξ)}, where K(ξ)= {1,...,k}\{ξ}. • Reset the weights by

j ℓ − xi − αj if ℓ = 1 , wij = j  −xi + αj  if ℓ = 0 ,  • Each output neuron Hj will have a value qj = 0 if and only if x is an element of the hypercube j Rn j j B = x ∈ : xi − αj ≤ xi ≤ xi + αj and n j qj = −∞ whenever x ∈ R \ B .

Successful Recall of the Refined Model

The top row shows the same input patterns as in the last figure. This time recall association is perfect.

Recall of AAM based on the Dendritic Model

• Top row: patterns distorted with random noise within noise parameter α.
• Bottom row: perfect recall of the auto-associative memory based on the dendritic model.

A New LNN Model

• In this model, the synapses on spines of dendrites are used. A presynaptic neuron is either excitatory or inhibitory, but not both.

• N = {N_i : i = 1,...,n} denotes the set of presynaptic (input) neurons.

• σ(j, k) = the jth spine on dendrite D_k.
• N(j, k) = the set of presynaptic neurons with synapses on σ(j, k). Thus, N(j, k) ⊂ N.

• j_k = the number of spines on D_k.

A New LNN Model

• The kth dendrite D_k now computes

τ_k = p_k ⋀_{j=1}^{j_k} [w_{kj} + Σ_{i∈N(j,k)} (−1)^{1−ℓ(i)} s_i x_i],

where ℓ(i) = 0 if N_i is inhibitory and ℓ(i) = 1 if N_i is excitatory.

• s_i = the number of spikes in the spike train produced by N_i in an interval [s − t, t].
• Note that ⋃_{j=1}^{j_k} N(j, k) corresponds to the set of input neurons with terminal axonal fibers on D_k.

Questions and Comments

• Thank you for your attention.
• Any questions or comments?
