
Graphical Calculus for Products and Convolutions

Filippo M. Miatto
Institut Polytechnique de Paris and Télécom ParisTech, LTCI, 46 Rue Barrault, 75013 Paris, France

arXiv:1903.01366v1 [quant-ph] 4 Mar 2019

Graphical calculus is an intuitive visual notation for manipulating tensors and index contractions. Using graphical calculus leads to simple and memorable derivations, and with a bit of practice one can learn to prove complex identities without even the need for pen and paper. This manuscript is meant as a demonstration of the power and flexibility of graphical notation, and we advocate exploring the use of graphical calculus in undergraduate courses. In the first part we define the following matrix products in graphical language: dot, tensor, Kronecker, Hadamard, Khatri-Rao and Tracy-Singh. We then use our definitions to prove several known identities in an entirely graphical way. Although ordinary proofs consist of several lines of quite involved mathematical expressions, graphical calculus is so expressive that after writing an identity in graphical form we can realise by visual inspection that it is in fact true. As an example of the intuitiveness of graphical proofs, we derive two new identities. In the second part we develop a graphical description of convolutions, which are a central ingredient of convolutional neural networks and signal processing. Our single definition includes as special cases the circular discrete convolution and the cross-correlation. We illustrate how convolution can be seen as another type of product, and we derive a generalised convolution theorem. We conclude with a quick guide on implementing tensor contractions in python.

I. INTRODUCTION

When we manipulate mathematical expressions (either in writing or mentally), the behaviour of the typographical symbols can acquire its own physicality. For example, we might feel that as soon as we wrap a product of various terms in a logarithm, the "force" that is binding them together comes loose, as log(xyz) = log(x) + log(y) + log(z). Even the simple operation of bringing the denominator of one side of an equation to the numerator of the opposite side (x/2 = y → x = 2y) is a physical interpretation of the actual operation of multiplying both sides by the same quantity. Such physical familiarity with symbolic manipulation comes with time and practice, and arguably the more a notion is described by symbols that behave somewhat physically, the easier it is to have such an experience. Graphical calculus is an extreme example of such notation: in graphical calculus all the operations consist in connecting "bendable" and "stretchable" wires, and one can perform complicated tensor manipulations entirely within this framework.

Graphical calculus offers a representation of concepts that complements the standard mathematical notation: it is by providing alternative representations of the same concept that we help the recognition networks in the brain of a learner, as recommended in the principles of universal design for learning [1]. It is with this student-centered spirit that we present this paper.

In the first part of this work we define six types of matrix product (dot, tensor, Hadamard, Kronecker, Khatri-Rao, Tracy-Singh) in graphical calculus notation. Our aim is to make them available in a form that is easy to remember, easy to work with and easy to explain to others. To stress the power of graphical calculus, we show two new (to the best of the author's knowledge) identities. In the second part we describe a generalized discrete convolution which includes the standard convolution and the cross-correlation as special cases. These operations are extremely common in signal processing and in the field of machine learning, in particular in convolutional neural networks [2].

Graphical calculus was introduced in the seventies by Penrose [3] in the context of general relativity; it was then slowly picked up and/or rediscovered by various authors including Lafont [4], Coecke and Abramsky [5, 6], Griffiths [7], Selinger [8], Baez [9], Wood [10], Biamonte [11], Jaffe [12] and others [13, 14]. Because of such diversity of scopes and the scarcity of mainstream adoption, the notation is not yet standardised. Some authors proceed vertically, others horizontally; some right to left, others left to right. Obviously, there is no actual difference, but different notations can make it easier or harder to transition between regular and graphical notations. In our case, we opted for a horizontal notation (as it is more similar to the way in which we write) and right to left (as that is how we compose successive matrix multiplications).

A few words about our conventions. When we write tensors with indices (usually roman literals), we use the convention that repeated indices are implicitly summed over, to avoid an excessive use of the summation symbol. When we contract a pair of indices we assume that their dimensions match. In graphical notation, a tensor of rank R is drawn as a shape with R wires, where each wire represents an index. When we connect two wires it means we are summing over those indices, like so: A_ij and B_jk both have two indices (i.e. we can think of A and B as matrices), which means that in graphical notation they both have two wires. If we join the two wires corresponding to the j indices (assuming they have the same dimension), we obtain the tensor A_ij B_jk, which has two leftover indices because we sum over the repeated index: ∑_j A_ij B_jk (see Fig. 8). Technically it doesn't matter how we orient the drawings of the tensors and their wires, as long as we keep track of which index they correspond to. However, in order to help the transition to regular notation, we orient row wires toward the left and column wires toward the right.

FIG. 2. We can use multiple copies of the rank-1 Kronecker tensor to construct constant tensors whose entries all have value 1. In this figure we have a constant vector with all ones (on top) and a constant matrix with all ones (on the bottom).

II. TWO MEDIATING TENSORS

In order to compute the various products that we are going to describe, we will need the help of the Kronecker and the vectorization tensors, which are two "mediating" tensors that enable such products. In the following subsections we will see them separately and get a feeling for what they do, both at the level of indexed notation and at the level of graphical notation.

A. The Kronecker tensor

The D-dimensional, rank-N Kronecker tensor is a generalization of the familiar Kronecker delta δ_ij. It is defined as:

δ_{i_1 i_2 ... i_N} = 1 if i_1 = i_2 = ... = i_N, and 0 otherwise.

Contracting Kronecker tensors (of matching dimensions) yields other Kronecker tensors, e.g. δ_{i_1 i_2 i_3 i_4} δ_{i_2 i_3} δ_{i_4} = δ_{i_1}. The Kronecker tensor can mediate a dot product: A_ij B_kl is a tensor with four indices, but A_ij B_kl δ_jk is equivalent to the dot product of the two matrices A and B. We can also generalise it to act on more than two matrices (or tensors), such as A_ij B_kl C_mn δ_ikm. The Kronecker tensor can extract the diagonal of a matrix: diag(A)_k = A_ij δ_ijk, create a diagonal matrix from a vector: diag(a)_jk = a_i δ_ijk, or erase all of the off-diagonal elements of a matrix: A_ij δ_ijkl. The Kronecker tensor can also be used to compute the trace of a matrix: Tr(A) = A_ij δ_ij, or the partial trace of a higher-order tensor: Tr_1(A)_kl = A_ijkl δ_ij, which is especially useful in quantum information.

[Figure: graphical representations of diag(A) and Tr(A)]
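The wire-joining convention above maps directly onto einsum-style notation, which the paper's concluding python guide alludes to. As a minimal sketch (NumPy and all variable names here are our own illustration, not code from the paper), joining the j wires of A_ij and B_jk is a sum over the repeated index j:

```python
import numpy as np

# A_ij and B_jk each have two wires; the shared j dimension must match.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))   # A_ij: i has dimension 3, j dimension 4
B = rng.standard_normal((4, 5))   # B_jk: j has dimension 4, k dimension 5

# Joining the j wires = summing over the repeated index: sum_j A_ij B_jk
C = np.einsum('ij,jk->ik', A, B)

# The two leftover wires (i and k) form an ordinary matrix product.
assert np.allclose(C, A @ B)
```

The repeated letter 'j' in the subscript string plays exactly the role of a connected wire: it appears in both operands and not in the output, so it is summed over.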
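The identities listed for the Kronecker tensor can be checked numerically. The sketch below (our own assumption of how one might implement it in NumPy; the function name `kronecker` and the test values are hypothetical) builds the D-dimensional, rank-N Kronecker tensor and verifies the diagonal-extraction, diagonal-creation, trace, partial-trace and mediated dot-product identities:

```python
import numpy as np

def kronecker(D, N):
    """D-dimensional, rank-N Kronecker tensor: 1 where all N indices agree, else 0."""
    delta = np.zeros((D,) * N)
    for i in range(D):
        delta[(i,) * N] = 1.0
    return delta

D = 4
A = np.arange(D * D, dtype=float).reshape(D, D)
a = np.arange(D, dtype=float)
d2, d3 = kronecker(D, 2), kronecker(D, 3)   # d2 is the ordinary Kronecker delta

# diag(A)_k = A_ij d_ijk : extract the diagonal of a matrix
assert np.allclose(np.einsum('ij,ijk->k', A, d3), np.diag(A))

# diag(a)_jk = a_i d_ijk : build a diagonal matrix from a vector
assert np.allclose(np.einsum('i,ijk->jk', a, d3), np.diag(a))

# Tr(A) = A_ij d_ij : trace of a matrix
assert np.isclose(np.einsum('ij,ij->', A, d2), np.trace(A))

# A_ij B_kl d_jk : the Kronecker tensor mediates the dot product
B = np.arange(D * D, dtype=float).reshape(D, D) + 1.0
assert np.allclose(np.einsum('ij,kl,jk->il', A, B, d2), A @ B)

# Tr_1(A)_kl = A_ijkl d_ij : partial trace over the first pair of indices
T = np.arange(D ** 4, dtype=float).reshape(D, D, D, D)
assert np.allclose(np.einsum('ijkl,ij->kl', T, d2), np.trace(T, axis1=0, axis2=1))
```

Each assertion mirrors one indexed identity from the text, with the Kronecker tensor supplied explicitly as an einsum operand rather than absorbed into the subscript string.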