Introduction to Machine Learning Lecture 9


Mehryar Mohri
Courant Institute and Google Research
[email protected]

Kernel Methods

Motivation
• Non-linear decision boundaries.
• Efficient computation of inner products in high dimension.
• Flexible selection of more complex features.

This Lecture
• Definitions
• SVMs with kernels
• Closure properties
• Sequence kernels

Non-Linear Separation
Linear separation is impossible in most problems. Instead, use a non-linear mapping from the input space to a high-dimensional feature space, Φ: X → F. The generalization ability of the resulting classifier is independent of dim(F); it depends only on the margin ρ and the sample size m.

Kernel Methods
Idea: define a function K: X × X → R, called a kernel, such that
    Φ(x)·Φ(y) = K(x, y).
K is often interpreted as a similarity measure.
Benefits:
• Efficiency: K is often more efficient to compute than Φ and the dot product.
• Flexibility: K can be chosen arbitrarily, so long as the existence of Φ is guaranteed (symmetry and positive definiteness condition).

PDS Condition
Definition: a kernel K: X × X → R is positive definite symmetric (PDS) if for any {x_1, ..., x_m} ⊆ X, the matrix K = [K(x_i, x_j)]_{ij} ∈ R^{m×m} is symmetric positive semi-definite (SPSD).
K is SPSD if it is symmetric and satisfies one of the two equivalent conditions:
• its eigenvalues are non-negative;
• for any c ∈ R^{m×1},
    c⊤Kc = Σ_{i,j=1}^m c_i c_j K(x_i, x_j) ≥ 0.
Terminology: PDS for kernels, SPSD for kernel matrices (see (Berg et al., 1984)).

Example — Polynomial Kernels
Definition: for all x, y ∈ R^N,
    K(x, y) = (x·y + c)^d,  c > 0.
Example: for N = 2 and d = 2,
    K(x, y) = (x_1 y_1 + x_2 y_2 + c)^2
            = (x_1², x_2², √2 x_1 x_2, √(2c) x_1, √(2c) x_2, c) · (y_1², y_2², √2 y_1 y_2, √(2c) y_1, √(2c) y_2, c).

XOR Problem
Use the second-degree polynomial kernel with c = 1, under which the points map to
    (1, 1)   → (1, 1, +√2, +√2, +√2, 1)
    (-1, -1) → (1, 1, +√2, -√2, -√2, 1)
    (-1, 1)  → (1, 1, -√2, -√2, +√2, 1)
    (1, -1)  → (1, 1, -√2, +√2, -√2, 1).
The four points are linearly non-separable in the input space but linearly separable in the feature space by the hyperplane x_1 x_2 = 0, since the third coordinate √2 x_1 x_2 alone determines the class.

Other Standard PDS Kernels
Gaussian kernels:
    K(x, y) = exp(−‖x − y‖² / (2σ²)),  σ ≠ 0.
Sigmoid kernels:
    K(x, y) = tanh(a(x·y) + b),  a, b ≥ 0.

Reproducing Kernel Hilbert Space (Aronszajn, 1950)
Theorem: let K: X × X → R be a PDS kernel. Then there exists a Hilbert space H and a mapping Φ from X to H such that
    ∀x, y ∈ X, K(x, y) = ⟨Φ(x), Φ(y)⟩.
Furthermore, the following reproducing property holds:
    ∀f ∈ H, ∀x ∈ X, f(x) = ⟨f, Φ(x)⟩ = ⟨f, K(x, ·)⟩.
Notes:
• H is called the reproducing kernel Hilbert space (RKHS) associated to K.
• A Hilbert space H such that there exists Φ: X → H with K(x, y) = ⟨Φ(x), Φ(y)⟩ for all x, y ∈ X is also called a feature space associated to K; Φ is called a feature mapping.
• Feature spaces associated to K are in general not unique.
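As a quick sanity check of these definitions, here is a minimal sketch (Python with NumPy; the helper names and sample points are illustrative, not from the lecture) verifying that the degree-2 polynomial kernel coincides with the inner product under the explicit six-dimensional feature map above, that the third feature separates the XOR points, and that the resulting kernel matrix is SPSD:

```python
import numpy as np

def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel K(x, y) = (x . y + c)^d."""
    return (np.dot(x, y) + c) ** d

def phi(x, c=1.0):
    """Explicit feature map for N = 2, d = 2: six coordinates."""
    x1, x2 = x
    return np.array([x1 ** 2, x2 ** 2, np.sqrt(2.0) * x1 * x2,
                     np.sqrt(2 * c) * x1, np.sqrt(2 * c) * x2, c])

rng = np.random.default_rng(0)

# K(x, y) agrees with the inner product under the explicit map.
for _ in range(5):
    x, y = rng.normal(size=2), rng.normal(size=2)
    assert np.isclose(poly_kernel(x, y), phi(x) @ phi(y))

# XOR points (classes +1, +1, -1, -1): not linearly separable in the
# input space, but the third feature sqrt(2) x1 x2 separates them.
X = np.array([[1.0, 1.0], [-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0]])
print([np.sign(phi(x)[2]) for x in X])   # [1.0, 1.0, -1.0, -1.0]

# The kernel matrix on these points is SPSD: non-negative eigenvalues.
K = np.array([[poly_kernel(a, b) for b in X] for a in X])
print(np.linalg.eigvalsh(K))             # all >= 0 (up to rounding)
```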
Consequence: SVMs with PDS Kernels (Boser, Guyon, and Vapnik, 1992)
Constrained optimization (with K(x_i, x_j) in place of Φ(x_i)·Φ(x_j)):
    max_α Σ_{i=1}^m α_i − (1/2) Σ_{i,j=1}^m α_i α_j y_i y_j K(x_i, x_j)
    subject to: 0 ≤ α_i ≤ C ∧ Σ_{i=1}^m α_i y_i = 0, i ∈ [1, m].
Solution:
    h(x) = sgn(Σ_{i=1}^m α_i y_i K(x_i, x) + b),
with b = y_i − Σ_{j=1}^m α_j y_j K(x_j, x_i) for any x_i with 0 < α_i < C.

SVMs with PDS Kernels — Matrix Form
Constrained optimization (∘ denotes the Hadamard product):
    max_α 2·1⊤α − (α ∘ y)⊤ K (α ∘ y)
    subject to: 0 ≤ α ≤ C ∧ α⊤y = 0.
Solution:
    h = sgn(Σ_{i=1}^m α_i y_i K(x_i, ·) + b),
with b = y_i − (α ∘ y)⊤ K e_i for any x_i with 0 < α_i < C.

Generalization: Representer Theorem (Kimeldorf and Wahba, 1971)
Theorem: let K: X × X → R be a PDS kernel and H its corresponding RKHS. Then, for any non-decreasing function G: R → R and any L: R^m → R ∪ {+∞}, the optimization problem
    argmin_{h ∈ H} F(h) = argmin_{h ∈ H} G(‖h‖²_H) + L(h(x_1), ..., h(x_m))
admits a solution of the form h* = Σ_{i=1}^m α_i K(x_i, ·). If G is further assumed to be increasing, then any solution has this form.
Proof:
• Let H_1 = span({K(x_i, ·): i ∈ [1, m]}). Any h ∈ H admits the decomposition h = h_1 + h⊥ according to H = H_1 ⊕ H_1⊥.
• Since G is non-decreasing,
    G(‖h_1‖²) ≤ G(‖h_1‖² + ‖h⊥‖²) = G(‖h‖²).
• By the reproducing property, for all i ∈ [1, m],
    h(x_i) = ⟨h, K(x_i, ·)⟩ = ⟨h_1, K(x_i, ·)⟩ = h_1(x_i).
• Thus, L(h(x_1), ..., h(x_m)) = L(h_1(x_1), ..., h_1(x_m)) and F(h_1) ≤ F(h).
• If G is increasing, then F(h_1) < F(h) whenever h⊥ ≠ 0, so any solution of the optimization problem must be in H_1.

Kernel-Based Algorithms
PDS kernels are used to extend a variety of algorithms in classification and other areas:
• regression,
• ranking,
• dimensionality reduction,
• clustering.
But how do we define PDS kernels?

Closure Properties of PDS Kernels
Theorem: positive definite symmetric (PDS) kernels are closed under:
• sum,
• product,
• tensor product,
• pointwise limit,
• composition with a power series.
Proof:
• Closure under sum:
    c⊤Kc ≥ 0 ∧ c⊤K′c ≥ 0 ⇒ c⊤(K + K′)c ≥ 0.
• Closure under product: writing K = MM⊤,
    Σ_{i,j=1}^m c_i c_j (K_{ij} K′_{ij}) = Σ_{i,j=1}^m c_i c_j (Σ_{k=1}^m M_{ik} M_{jk}) K′_{ij}
        = Σ_{k=1}^m [Σ_{i,j=1}^m (c_i M_{ik})(c_j M_{jk}) K′_{ij}] = Σ_{k=1}^m z_k⊤ K′ z_k ≥ 0,
  with z_k = (c_1 M_{1k}, ..., c_m M_{mk})⊤.
• Closure under tensor product: by definition, for all x_1, x_2, y_1, y_2 ∈ X,
    (K_1 ⊗ K_2)(x_1, y_1, x_2, y_2) = K_1(x_1, x_2) K_2(y_1, y_2);
  thus K_1 ⊗ K_2 is PDS as the product of the two PDS kernels (x_1, y_1, x_2, y_2) ↦ K_1(x_1, x_2) and (x_1, y_1, x_2, y_2) ↦ K_2(y_1, y_2).
• Closure under pointwise limit: if for all x, y ∈ X, lim_{n→∞} K_n(x, y) = K(x, y), then
    (∀n, c⊤K_n c ≥ 0) ⇒ lim_{n→∞} c⊤K_n c = c⊤Kc ≥ 0.
• Closure under composition with a power series: assume K is a PDS kernel with |K(x, y)| < ρ for all x, y ∈ X, and f(x) = Σ_{n=0}^∞ a_n x^n, a_n ≥ 0, is a power series with radius of convergence ρ. Then f ∘ K is a PDS kernel: each K^n is PDS by closure under product, each partial sum Σ_{n=0}^N a_n K^n is PDS by closure under sum, and f ∘ K is PDS by closure under pointwise limit.
Example: for any PDS kernel K, exp(K) is PDS.
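These closure properties are easy to observe numerically on kernel matrices. The sketch below (Python with NumPy; the matrix sizes and helper names are illustrative) checks that the sum and the entrywise (Hadamard) product of two SPSD matrices, as well as the entrywise exponential exp(K), still have non-negative eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_spsd(m=6, k=4):
    """Random SPSD matrix M M^T, as in the product-closure proof."""
    M = rng.normal(size=(m, k))
    return M @ M.T

def min_eig(K):
    """Smallest eigenvalue of a symmetric matrix."""
    return np.linalg.eigvalsh(K).min()

K1, K2 = random_spsd(), random_spsd()

print(min_eig(K1 + K2) >= -1e-9)     # closure under sum
print(min_eig(K1 * K2) >= -1e-9)     # entrywise (Hadamard) product
# Composition with the power series of exp, taken entrywise (not the
# matrix exponential): pointwise limit of the SPSD partial sums
# sum_{n<=N} K1**n / n! with entrywise powers.
print(min_eig(np.exp(K1)) >= -1e-9)  # exp(K1) is SPSD
```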
Sequence Kernels
Definition: kernels defined over pairs of strings.
• Motivation: computational biology, text and speech classification.
• Idea: two sequences are related when they share some common substrings or subsequences.
• Example: the sum over common substrings of the product of their counts (see the sketch at the end of this section).

Weighted Transducers
(Figure: a weighted finite-state transducer with states 0, 1, 2 and final state 3/0.1, and transitions labeled input:output/weight — b:a/0.6, b:a/0.2, a:b/0.1, a:a/0.4, b:a/0.3, a:b/0.5.)
T(x, y) = sum of the weights of all accepting paths with input x and output y. For the transducer in the figure,
    T(abb, baa) = .1 × .2 × .3 × .1 + .5 × .3 × .6 × .1.

Rational Kernels over Strings (Cortes et al., 2004)
Definition: a kernel K: Σ* × Σ* → R is rational if K = T for some weighted transducer T.
Definition: let T_1: Σ* × Δ* → R and T_2: Δ* × Ω* → R be two weighted transducers. Then the composition of T_1 and T_2 is defined for all x ∈ Σ*, y ∈ Ω* by
    (T_1 ∘ T_2)(x, y) = Σ_{z ∈ Δ*} T_1(x, z) T_2(z, y).
Definition: the inverse of a transducer T: Σ* × Δ* → R is the transducer T⁻¹: Δ* × Σ* → R obtained from T by swapping input and output labels.

Composition
Theorem: the composition of two weighted transducers is also a weighted transducer.
Proof: constructive, based on the composition algorithm:
• states are identified with pairs of states of T_1 and T_2;
• in the ε-free case, transitions are defined by
    E = { ((q_1, q_1′), a, c, w·w′, (q_2, q_2′)) : (q_1, a, b, w, q_2) ∈ E_1, (q_1′, b, c, w′, q_2′) ∈ E_2 }.
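To make the sequence-kernel example concrete, here is a minimal sketch (plain Python; the function names are illustrative) of the kernel mentioned above: the sum over all substrings u of the product of the counts of u in x and in y. This is a naive brute-force version — the point of the transducer machinery is to compute such kernels far more efficiently — but it is PDS, being the inner product of the two (finitely supported) count vectors:

```python
from collections import Counter

def substring_counts(s):
    """Counts of all non-empty contiguous substrings of s."""
    return Counter(s[i:j]
                   for i in range(len(s))
                   for j in range(i + 1, len(s) + 1))

def substring_kernel(x, y):
    """K(x, y) = sum over substrings u of count_x(u) * count_y(u).
    PDS, since it is the inner product of the count vectors."""
    cx, cy = substring_counts(x), substring_counts(y)
    return sum(n * cy[u] for u, n in cx.items())

# 'abb' and 'baa' share the substrings 'a' (counts 1 and 2) and
# 'b' (counts 2 and 1), so K = 1*2 + 2*1 = 4.
print(substring_kernel("abb", "baa"))  # 4
```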