Appendices

A. Derivations and Additional Methodology

A.1. Generalized PointConv Trick

The notation becomes very cumbersome when manipulating these higher-order n-dimensional arrays, so we will instead use index notation, with Latin indices $i, j, k$ indexing points, Greek indices $\alpha, \beta, \gamma$ indexing feature channels, and $c$ indexing the coordinate dimensions, of which there are $d = 3$ for PointConv and $d = \dim(G) + 2\dim(Q)$ for LieConv. (Here $\dim(Q)$ is the dimension of the space into which $Q$, the orbit identifiers, are embedded.) As the objects are not geometric tensors but simply n-dimensional arrays, we make no distinction between upper and lower indices. After expanding into indices, all values should be understood as scalars, and any free index ranges over all of its values.

Let $k^{\alpha\beta}_{ij}$ be the output of the MLP $k_\theta$, which takes $\{a^c_{ij}\}$ as input and acts independently over the locations $i, j$. For PointConv the input is $a^c_{ij} = x^c_i - x^c_j$, and for LieConv the input is $a^c_{ij} = \mathrm{Concat}([\log(v_j^{-1}u_i), q_i, q_j])^c$.

We wish to compute

$$h^\alpha_i = \sum_{j,\beta} k^{\alpha\beta}_{ij} f^\beta_j. \qquad (12)$$

In Wu et al. (2019) it was observed that since $k^{\alpha\beta}_{ij}$ is the output of an MLP, $k^{\alpha\beta}_{ij} = \sum_\gamma W^{\alpha\beta}_\gamma s^\gamma_{ij}$ for some final weight matrix $W$ and penultimate activations $s^\gamma_{ij}$ ($s^\gamma_{ij}$ is simply the output of the MLP after the last nonlinearity). With this in mind, we can rewrite (12):

$$h^\alpha_i = \sum_{j,\beta}\Big(\sum_\gamma W^{\alpha\beta}_\gamma s^\gamma_{ij}\Big) f^\beta_j \qquad (13)$$

$$\phantom{h^\alpha_i} = \sum_{\beta,\gamma} W^{\alpha\beta}_\gamma \Big(\sum_j s^\gamma_{ij} f^\beta_j\Big). \qquad (14)$$

In practice the intermediate number of channels is much smaller than the product of $c_{\mathrm{in}}$ and $c_{\mathrm{out}}$, $|\gamma| < |\alpha||\beta|$, so this reordering of the computation leads to a massive reduction in both memory and compute.

Furthermore, $b^{\gamma\beta}_i = \sum_j s^\gamma_{ij} f^\beta_j$ can be implemented with regular matrix multiplication, and so can $h^\alpha_i = \sum_{\beta,\gamma} W^{\alpha\beta}_\gamma b^{\gamma\beta}_i$ by flattening $(\beta, \gamma)$ into a single axis $\varepsilon$: $h^\alpha_i = \sum_\varepsilon W^{\alpha\varepsilon} b^\varepsilon_i$.

The sum over index $j$ can be restricted to a subset $j(i)$ (such as a chosen neighborhood) by computing $f(\cdot)$ at each of the required indices, padding to the size of the maximum subset with zeros, and computing $b^{\gamma\beta}_i = \sum_j s^\gamma_{i,j(i)} f^\beta_{j(i)}$ using dense matrix multiplication. Masking out the values at indices $i$ and $j$ is also necessary when different numbers of points per minibatch are batched together using zero padding. The generalized PointConv trick can thus be applied in batch mode with a varied number of points per example and a varied number of points per neighborhood.
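To make the reordering in (13)-(14) concrete, here is a minimal PyTorch sketch (our illustration, not the paper's reference implementation; tensor names and shapes are assumptions) that computes both sides with einsum and checks that they agree:

```python
import torch

N, cin, cout, cmid = 5, 8, 16, 4          # points, input/output/penultimate channels
s = torch.randn(N, N, cmid)               # penultimate MLP activations s^gamma_{ij}
W = torch.randn(cmid, cout, cin)          # final linear layer W^{alpha,beta}_gamma
f = torch.randn(N, cin)                   # input features f^beta_j

# Direct evaluation of (12): materialize the full kernel k^{alpha,beta}_{ij}.
k = torch.einsum('ijg,gab->ijab', s, W)   # (N, N, cout, cin)
h_direct = torch.einsum('ijab,jb->ia', k, f)

# Reordered evaluation of (13)-(14): contract over j first, then apply W.
b = torch.einsum('ijg,jb->igb', s, f)     # b^{gamma,beta}_i, shape (N, cmid, cin)
h_fast = torch.einsum('gab,igb->ia', W, b)

assert torch.allclose(h_direct, h_fast, atol=1e-5)
```

The reordered version never materializes the (N, N, cout, cin) kernel, which is where the memory savings come from; restricting j to a neighborhood and masking padded entries acts on the b tensor in the same way.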
A.2. Abelian G and Coordinate Transforms

For Abelian groups that cover $\mathcal{X}$ in a single orbit, the computation is very similar to ordinary Euclidean convolution. Defining $a_i = \log(u_i)$ and $b_j = \log(v_j)$, and using the fact that $e^{-b_j}e^{a_i} = e^{a_i - b_j}$, we have $\log(v_j^{-1}u_i) = (\log \circ \exp)(a_i - b_j)$. Defining $\tilde{f} = f \circ \exp$ and $\tilde{h} = h \circ \exp$, we get

$$\tilde{h}(a_i) = \frac{1}{n}\sum_{j \in \mathrm{nbhd}(i)} (\tilde{k}_\theta \circ \mathrm{proj})(a_i - b_j)\, \tilde{f}(b_j), \qquad (15)$$

where $\mathrm{proj} = \log \circ \exp$ projects onto the image of the logarithm map. Apart from a projection and a change to logarithmic coordinates, this is equivalent to Euclidean convolution in a vector space with the dimensionality of the group. When the group is Abelian and $\mathcal{X}$ is a homogeneous space, the dimension of the group equals the dimension of the input. In these cases we have a trivial stabilizer group $H$ and a single origin $o$, so we can view $f$ and $h$ as acting on the input $x_i = u_i o$.

This directly generalizes some of the existing coordinate-transform methods for achieving equivariance from the literature, such as log-polar coordinates for rotation and scaling equivariance (Esteves et al., 2017), and hyperbolic coordinates for squeeze and scaling equivariance.

Log-Polar Coordinates: Consider the Abelian Lie group of positive scalings and rotations, $G = \mathbb{R}^* \times SO(2)$, acting on $\mathbb{R}^2$. Elements of the group $M \in G$ can be expressed as a $2 \times 2$ matrix

$$M(r, \theta) = \begin{pmatrix} r\cos\theta & -r\sin\theta \\ r\sin\theta & r\cos\theta \end{pmatrix}$$

for $r \in \mathbb{R}^+$ and $\theta \in \mathbb{R}$. The matrix logarithm is

$$\log \begin{pmatrix} r\cos\theta & -r\sin\theta \\ r\sin\theta & r\cos\theta \end{pmatrix} = \begin{pmatrix} \log r & -(\theta \bmod 2\pi) \\ \theta \bmod 2\pi & \log r \end{pmatrix},$$

or more compactly $\log(M(r,\theta)) = \log(r)\, I + (\theta \bmod 2\pi)\, J$, which is $[\log r,\ \theta \bmod 2\pi]$ in the basis $\mathcal{B} = [I, J]$ for the Lie algebra. (Here $\theta \bmod 2\pi$ is defined to mean $\theta + 2\pi n$ for the integer $n$ such that the value lies in $(-\pi, \pi)$, consistent with the principal matrix logarithm; $(\theta + \pi)\,\%\,2\pi - \pi$ in programming notation.) It is clear that $\mathrm{proj} = \log \circ \exp$ is simply mod $2\pi$ on the $J$ component.

As $\mathbb{R}^2$ is a homogeneous space of $G$, one can choose the global origin $o = [1, 0] \in \mathbb{R}^2$. A little algebra shows that lifting to the group yields the transformation $u_i = M(r_i, \theta_i)$ for each point $p_i = u_i o$, where $r = \sqrt{x^2 + y^2}$ and $\theta = \mathrm{atan2}(y, x)$ are the polar coordinates of the point $p_i$. Observe that the logarithm of $v_j^{-1} u_i$ has a simple expression highlighting the fact that it is invariant to scale and rotational transformations of the elements:

$$\log(v_j^{-1} u_i) = \log\big(M(r_j, \theta_j)^{-1} M(r_i, \theta_i)\big) = \log(r_i/r_j)\, I + (\theta_i - \theta_j \bmod 2\pi)\, J.$$

Now writing out our Monte Carlo estimate of the integral,

$$h(p_i) = \frac{1}{n}\sum_j \tilde{k}_\theta\big(\log(r_i/r_j),\ \theta_i - \theta_j \bmod 2\pi\big) f(p_j),$$

which is a discretization of the log-polar convolution from Esteves et al. (2017). This can be trivially extended to encompass cylindrical coordinates with the group $T(1) \times \mathbb{R}^* \times SO(2)$.

For another nontrivial example, consider the group of scalings and squeezes $G = \mathbb{R}^* \times SQ$ acting on the positive orthant $\mathcal{X} = \{(x, y) \in \mathbb{R}^2 : x > 0,\ y > 0\}$. Elements of the group can be expressed as the product of a squeeze mapping and a scaling

$$M(r, s) = \begin{pmatrix} s & 0 \\ 0 & 1/s \end{pmatrix}\begin{pmatrix} r & 0 \\ 0 & r \end{pmatrix} = \begin{pmatrix} rs & 0 \\ 0 & r/s \end{pmatrix}$$

for any $r, s \in \mathbb{R}^+$. As the group is Abelian, the logarithm splits nicely in terms of the two generators $I$ and $A$:

$$\log\begin{pmatrix} rs & 0 \\ 0 & r/s \end{pmatrix} = (\log r)\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + (\log s)\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Again $\mathcal{X}$ is a homogeneous space of $G$, and we choose a single origin $o = [1, 1]$. With a little algebra, it is clear that $M(r_i, s_i)o = p_i$, where $r = \sqrt{xy}$ and $s = \sqrt{x/y}$ are the hyperbolic coordinates of $p_i$. Expressed in the basis $\mathcal{B} = [I, A]$ for the Lie algebra above, we see that

$$\log(v_j^{-1} u_i) = \log(r_i/r_j)\, I + \log(s_i/s_j)\, A,$$

yielding the expression for convolution

$$h(p_i) = \frac{1}{n}\sum_j \tilde{k}_\theta\big(\log(r_i/r_j), \log(s_i/s_j)\big) f(p_j),$$

which is equivariant to squeezes and scalings.

As demonstrated, equivariance to groups that contain the input space in a single orbit and are Abelian can be achieved with a simple coordinate transform; however, our approach generalizes to groups that are both 'larger' and 'smaller' than the input space, including coordinate-transform equivariance as a special case.
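As a sanity check on the log-polar case, the following NumPy sketch (illustrative only; function names are ours) lifts two points to $G = \mathbb{R}^*\times SO(2)$ and verifies that $\log(v^{-1}u)$ is unchanged when both points are transformed by the same rotation and scaling:

```python
import numpy as np

def lift(p):
    """Lift a point in R^2 \\ {0} to M(r, theta) in R* x SO(2), with origin o = [1, 0]."""
    x, y = p
    r, th = np.hypot(x, y), np.arctan2(y, x)
    return np.array([[r * np.cos(th), -r * np.sin(th)],
                     [r * np.sin(th),  r * np.cos(th)]])

def log_rel(u, v):
    """[log(r_u/r_v), (theta_u - theta_v) mod 2pi]: Lie-algebra coordinates of v^{-1} u."""
    w = np.linalg.solve(v, u)                       # v^{-1} u, itself of the form M(r, theta)
    r = np.hypot(w[0, 0], w[1, 0])
    th = np.arctan2(w[1, 0], w[0, 0])               # already wrapped to (-pi, pi]
    return np.array([np.log(r), th])

p, q = np.array([1.0, 2.0]), np.array([-0.5, 0.3])
g = lift(np.array([0.7, 1.1]))                      # an arbitrary rotation + scaling
a = log_rel(lift(p), lift(q))
b = log_rel(lift(g @ p), lift(g @ q))               # transform both points by g
assert np.allclose(a, b)                            # the relative logarithm is invariant
```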

A.3. Sufficient Conditions for Geodesic Distance

In general, the function $d(u, v) = \|\log(v^{-1}u)\|_F$, defined on the domain of $GL(d)$ covered by the exponential map, satisfies the first three conditions of a distance metric but not the triangle inequality, making it a semi-metric:

1. $d(u, v) \ge 0$
2. $d(u, v) = 0 \Leftrightarrow \log(u^{-1}v) = 0 \Leftrightarrow u = v$
3. $d(u, v) = \|\log(v^{-1}u)\|_F = \|-\log(u^{-1}v)\|_F = d(v, u)$.

However, for certain subgroups of $GL(d)$ with additional structure, the triangle inequality holds and the function is the distance along geodesics connecting group elements $u$ and $v$ according to the metric tensor

$$\langle A, B\rangle_u := \mathrm{Tr}(A^T u^{-T} u^{-1} B), \qquad (16)$$

where $-T$ denotes inverse and transpose. Specifically, if the subgroup $G$ is in the image of the map $\exp : \mathfrak{g} \to G$ and each infinitesimal generator commutes with its transpose, $[A, A^T] = 0$ for all $A \in \mathfrak{g}$, then $d(u, v) = \|\log(v^{-1}u)\|_F$ is the geodesic distance between $u$ and $v$.

Geodesic Equation: Geodesics of (16), satisfying $\nabla_{\dot\gamma}\dot\gamma = 0$, can equivalently be derived by minimizing the energy functional

$$E[\gamma] = \int_\gamma \langle \dot\gamma, \dot\gamma \rangle_\gamma \, dt = \int_0^1 \mathrm{Tr}(\dot\gamma^T \gamma^{-T}\gamma^{-1}\dot\gamma)\, dt$$

using the calculus of variations. Minimizing curves $\gamma(t)$ connecting elements $u$ and $v$ in $G$ ($\gamma(0) = v$, $\gamma(1) = u$) satisfy

$$0 = \delta E = \delta \int_0^1 \mathrm{Tr}(\dot\gamma^T \gamma^{-T}\gamma^{-1}\dot\gamma)\, dt.$$

Noting that $\delta(\gamma^{-1}) = -\gamma^{-1}\delta\gamma\,\gamma^{-1}$ and the linearity of the trace,

$$\int_0^1 2\,\mathrm{Tr}(\dot\gamma^T \gamma^{-T}\gamma^{-1}\delta\dot\gamma) - 2\,\mathrm{Tr}(\dot\gamma^T \gamma^{-T}\gamma^{-1}\delta\gamma\,\gamma^{-1}\dot\gamma)\, dt = 0.$$

Using the cyclic property of the trace and integrating by parts, we have that

$$-\int_0^1 2\,\mathrm{Tr}\Big[\Big(\frac{d}{dt}\big(\dot\gamma^T\gamma^{-T}\gamma^{-1}\big) + \gamma^{-1}\dot\gamma\,\dot\gamma^T\gamma^{-T}\gamma^{-1}\Big)\delta\gamma\Big]\, dt = 0,$$

where the boundary term $\mathrm{Tr}(\dot\gamma^T\gamma^{-T}\gamma^{-1}\delta\gamma)\big|_0^1$ vanishes since $(\delta\gamma)(0) = (\delta\gamma)(1) = 0$.

As $\delta\gamma$ may be chosen to vary arbitrarily along the path, $\gamma$ must satisfy the geodesic equation:

$$\frac{d}{dt}\big(\dot\gamma^T\gamma^{-T}\gamma^{-1}\big) + \gamma^{-1}\dot\gamma\,\dot\gamma^T\gamma^{-T}\gamma^{-1} = 0. \qquad (17)$$
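For matrix groups, the semi-metric $d(u, v)$ above can be computed directly from the matrix logarithm. A small sketch (our illustration, not library code from the paper) using SciPy, checked on random SO(3) elements:

```python
import numpy as np
from scipy.linalg import expm, logm

def lie_distance(u, v):
    """d(u, v) = ||log(v^{-1} u)||_F for matrix group elements u, v."""
    rel = np.linalg.solve(v, u)              # v^{-1} u
    return np.linalg.norm(logm(rel), 'fro')

def random_so3():
    """A random rotation via the exponential of a random skew-symmetric matrix."""
    w = np.random.randn(3)
    W = np.array([[0, -w[2], w[1]],
                  [w[2], 0, -w[0]],
                  [-w[1], w[0], 0]])
    return expm(W)

u, v = random_so3(), random_so3()
assert abs(lie_distance(u, v) - lie_distance(v, u)) < 1e-6   # symmetry (condition 3)
assert lie_distance(u, u) < 1e-6                              # d(u, u) = 0 (condition 2)
```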

Solutions: When $A = \log(v^{-1}u)$ satisfies $[A, A^T] = 0$, the curve $\gamma(t) = v\exp(t\log(v^{-1}u))$ is a solution to the geodesic equation (17). Clearly $\gamma$ connects $u$ and $v$: $\gamma(0) = v$ and $\gamma(1) = u$. Plugging $\dot\gamma = \gamma A$ into the left hand side of equation (17), we have

$$\frac{d}{dt}\big(A^T\gamma^{-1}\big) + AA^T\gamma^{-1} = -A^T\gamma^{-1}\dot\gamma\,\gamma^{-1} + AA^T\gamma^{-1} = [A, A^T]\gamma^{-1} = 0.$$

Length of $\gamma$: The length of the curve $\gamma$ connecting $u$ and $v$ is $\|\log(v^{-1}u)\|_F$:

$$L[\gamma] = \int_\gamma \sqrt{\langle\dot\gamma,\dot\gamma\rangle_\gamma}\, dt = \int_0^1 \sqrt{\mathrm{Tr}(\dot\gamma^T\gamma^{-T}\gamma^{-1}\dot\gamma)}\, dt = \int_0^1 \sqrt{\mathrm{Tr}(A^TA)}\, dt = \|A\|_F = \|\log(v^{-1}u)\|_F.$$

Of the Lie groups that we consider in this paper, all of which have a single connected component, the groups $G = T(d)$, $SO(d)$, $\mathbb{R}^*\times SO(d)$, $\mathbb{R}^*\times SQ$ satisfy the property that $[\mathfrak{g}, \mathfrak{g}^T] = 0$; however, the $SE(d)$ groups do not.
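A quick numerical check of this statement (an illustrative sketch under the stated assumption of commuting generators, using $\mathbb{R}^*\times SO(2)$): discretize $\gamma(t) = v\exp(t\log(v^{-1}u))$ and confirm that the accumulated length matches $\|\log(v^{-1}u)\|_F$.

```python
import numpy as np
from scipy.linalg import expm, logm

def d(u, v):
    return np.linalg.norm(logm(np.linalg.solve(v, u)), 'fro')

def M(r, th):
    """Element of R* x SO(2); its generators I and J commute with their transposes."""
    return r * np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])

u, v = M(2.0, 0.8), M(0.5, -1.1)
A = logm(np.linalg.solve(v, u))
gamma = lambda t: v @ expm(t * A)            # gamma(t) = v exp(t log(v^{-1} u))

ts = np.linspace(0, 1, 101)
piecewise_length = sum(d(gamma(t2), gamma(t1)) for t1, t2 in zip(ts[:-1], ts[1:]))
assert np.isclose(piecewise_length, d(u, v), atol=1e-6)
```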

A.4. Equivariant Subsampling

Even if all distances and neighborhoods are precomputed, the cost of computing equation (6) for $i = 1, \dots, N$ is still quadratic, $O(nN) = O(N^2)$, because the number of points $n$ in each neighborhood grows linearly with $N$ as $f$ is more densely evaluated. So that our method can scale to handle a large number of points, we show two ways to equivariantly subsample the group elements, which we can use both for the locations at which we evaluate the convolution and for the locations that we use in the Monte Carlo estimator. Since the elements are spaced irregularly, we cannot readily use the coset pooling method described in Cohen and Welling (2016a); instead we can perform:

Random Selection: Randomly selecting a subset of $p$ points from the original $n$ preserves the original sampling distribution, so it can be used.

Farthest Point Sampling: Given a set of group elements $S = \{u_i\}_{i=1}^k \subset G$, we can select a subset $S_p^*$ of size $p$ that maximizes the minimum distance between any two elements in that subset,

$$\mathrm{Sub}_p(S) := S_p^* = \underset{S_p\subset S}{\arg\max}\ \min_{u,v\in S_p : u\neq v} d(u, v), \qquad (18)$$

farthest point sampling on the group. Acting on a set of elements, $\mathrm{Sub}_p : S \mapsto S_p^*$, the farthest point subsampling is equivariant, $\mathrm{Sub}_p(wS) = w\,\mathrm{Sub}_p(S)$ for any $w \in G$, meaning that applying a group element to each of the elements does not change the chosen indices in the subsampled set, because the distances are left invariant: $d(u_i, u_j) = d(wu_i, wu_j)$.

Now we can use either of these methods for $\mathrm{Sub}_p(\cdot)$ to equivariantly subsample the points in each neighborhood used to estimate the integral to a fixed number $p$:

$$h_i = \frac{1}{p}\sum_{j\in \mathrm{Sub}_p(\mathrm{nbhd}(u_i))} k_\theta(v_j^{-1}u_i)\, f_j. \qquad (19)$$

Doing so reduces the cost of estimating the convolution from $O(N^2)$ to $O(pN)$, ignoring the cost of computing $\mathrm{Sub}_p$ and $\{\mathrm{nbhd}(u_i)\}_{i=1}^N$.
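The exact argmax in (18) is combinatorial, so implementations typically use a greedy approximation. A minimal sketch on a precomputed pairwise distance matrix (function name is ours):

```python
import numpy as np

def greedy_farthest_point_subsample(dists, p, start=0):
    """Greedily pick p indices from an (n, n) pairwise distance matrix so that each
    new index is as far as possible from the set already chosen."""
    chosen = [start]
    min_dist = dists[start].copy()          # distance of every element to the chosen set
    for _ in range(p - 1):
        nxt = int(np.argmax(min_dist))      # farthest element from the current subset
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, dists[nxt])
    return chosen

# Because d(u_i, u_j) = d(w u_i, w u_j), running this on distances computed from
# {w u_i} returns the same indices as on {u_i}: the equivariance property above.
```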
A.5. Review and Implications of Noether's Theorem

In the Hamiltonian setting, Noether's theorem relates the continuous symmetries of the Hamiltonian of a system to conserved quantities, and it has been deeply impactful in the understanding of classical physics. We give a review of Noether's theorem, loosely following Butterfield (2006).

More on Hamiltonian Dynamics

As introduced earlier, the Hamiltonian is a function acting on the state, $H(z) = H(q, p)$ (we ignore time dependence for now). It can be viewed more formally as a function on the cotangent bundle $(q, p) = z \in M = T^*C$, where $C$ is the coordinate configuration space, and this is the setting for Hamiltonian dynamics.

In general, on a manifold $\mathcal{M}$, a vector field $X$ can be viewed as an assignment of a directional derivative along $\mathcal{M}$ for each point $z \in \mathcal{M}$. It can be expanded in a basis using coordinate charts, $X = \sum_\alpha X^\alpha \partial_\alpha$, where $\partial_\alpha = \partial/\partial z^\alpha$, and it acts on functions $f$ by $X(f) = \sum_\alpha X^\alpha \partial_\alpha f$. In the chart, each of the components $X^\alpha$ is a function of $z$.

In Hamiltonian mechanics, for two functions $f, g$ on $M$ there is the Poisson bracket, which can be written in terms of the canonical coordinates $q_i, p_i$:

$$\{f, g\} = \sum_i \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i} - \frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i}.$$

(Here we take the definition of the Poisson bracket to be the negative of the usual definition in order to streamline notation.)

The Poisson bracket can be used to associate to each function $f$ a vector field

$$X_f = \{f, \cdot\} = \sum_i \frac{\partial f}{\partial p_i}\frac{\partial}{\partial q_i} - \frac{\partial f}{\partial q_i}\frac{\partial}{\partial p_i},$$

which specifies, by its action on another function $g$, the directional derivative of $g$ along $X_f$: $X_f(g) = \{f, g\}$. Vector fields that can be written in this way are known as Hamiltonian vector fields, and the Hamiltonian dynamics of the system is a special example, $X_H = \{H, \cdot\}$. In canonical coordinates $z = (p, q)$ this vector field is $X_H = F(z) = J\nabla_z H$ (i.e. the symplectic gradient, as discussed in Section 6.1). Making this connection clear, a given scalar quantity evolves through time as $\dot f = \{H, f\}$. But this bracket can also be used to evaluate the rate of change of a scalar quantity along the flows of vector fields other than the dynamics, such as the flows of continuous symmetries.

Noether's Theorem

The flow $\phi^X_\lambda$ by $\lambda \in \mathbb{R}$ of a vector field $X$ is the set of integral curves, the unique solution of the system of ODEs $\dot z^\alpha = X^\alpha$ with initial condition $z$ evaluated at parameter value $\lambda$, or more abstractly the iterated application of $X$: $\phi^X_\lambda = \exp(\lambda X)$. Continuous symmetry transformations are the transformations that can be written as the flow $\phi^X_\lambda$ of a vector field. The directional derivative characterizes how a function such as the Hamiltonian changes along the flow of $X$ and is a special case of the Lie derivative $\mathcal{L}$:

$$\mathcal{L}_X H = \frac{d}{d\lambda}\big(H\circ\phi^X_\lambda\big)\Big|_{\lambda=0} = X(H).$$

A scalar function is invariant to the flow of a vector field if and only if the Lie derivative is zero:

$$H(\phi^X_\lambda(z)) = H(z) \ \Leftrightarrow\ \mathcal{L}_X H = 0.$$

For all transformations that respect the Poisson bracket, which we add as a requirement for a symmetry, the vector field $X$ is (locally) Hamiltonian and there exists a function $f$ such that $X = X_f = \{f, \cdot\}$. (More precisely, the Poisson bracket can be formulated in a coordinate-free manner in terms of a symplectic two-form $\omega$, $\{f, g\} = \omega(X_f, X_g)$. In the original coordinates $\omega = \sum_i dp_i \wedge dq_i$, and in this coordinate basis $\omega$ is represented by the matrix $J$ from earlier. The dynamics $X_H$ are determined by $dH = \omega(X_H, \cdot) = \iota_{X_H}\omega$. Transformations which respect the Poisson bracket are symplectic, $\mathcal{L}_X\omega = 0$. With Cartan's magic formula, this implies that $d(\iota_X\omega) = 0$. Because the form $\iota_X\omega$ is closed, Poincaré's lemma implies that locally $\iota_X\omega = df$ for some function $f$, and hence $X = X_f$ is (locally) a Hamiltonian vector field. For more details see Butterfield (2006).) If $M$ is a contractible domain such as $\mathbb{R}^{2n}$, then $f$ is globally defined. For every continuous symmetry $\phi^{X_f}_\lambda$,

$$\mathcal{L}_{X_f} H = X_f(H) = \{f, H\} = -\{H, f\} = -X_H(f),$$

by the antisymmetry of the Poisson bracket. So if $\phi^X_\lambda$ is a symmetry of $H$, then $X = X_f$ for some function $f$, and $H(\phi^{X_f}_\lambda(z)) = H(z)$ implies

$$\mathcal{L}_{X_f} H = 0 \ \Leftrightarrow\ \mathcal{L}_{X_H} f = 0 \ \Leftrightarrow\ f(\phi^{X_H}_\tau(z)) = f(z),$$

or in other words $f(z(t+\tau)) = f(z(t))$, and $f$ is a conserved quantity of the dynamics. This implication goes both ways: if $f$ is conserved then $\phi^{X_f}_\lambda$ is necessarily a symmetry of the Hamiltonian, and if $\phi^{X_f}_\lambda$ is a symmetry of the Hamiltonian then $f$ is conserved.

Hamiltonian vs. Dynamical Symmetries

So far we have been discussing Hamiltonian symmetries, invariances of the Hamiltonian. But in the study of dynamical systems there is a related concept of dynamical symmetries, symmetries of the equations of motion. This notion is also captured by the Lie derivative, but between vector fields. A dynamical system $\dot z = F(z)$ has a continuous dynamical symmetry $\phi^X_\lambda$ if the flow along the dynamical system commutes with the symmetry:

$$\phi^X_\lambda(\phi^F_t(z)) = \phi^F_t(\phi^X_\lambda(z)), \qquad (20)$$

meaning that applying the symmetry transformation to the state and then flowing along the dynamical system is equivalent to flowing first and then applying the symmetry transformation. Equation (20) is satisfied if and only if the Lie derivative is zero:

$$\mathcal{L}_X F = [X, F] = 0,$$

where $[\cdot,\cdot]$ is the Lie bracket on vector fields. (The Lie bracket on vector fields produces another vector field and is defined by how it acts on functions: for any smooth function $g$, $[X, F](g) = X(F(g)) - F(X(g))$.)

For Hamiltonian systems, every Hamiltonian symmetry is also a dynamical symmetry. In fact, it is not hard to show that the Lie and Poisson brackets are related,

$$[X_f, X_g] = X_{\{f, g\}},$$

and this directly shows the implication. If $X_f$ is a Hamiltonian symmetry, $\{f, H\} = 0$, then

$$[X_f, F] = [X_f, X_H] = X_{\{f, H\}} = 0.$$

However, the converse is not true: dynamical symmetries of a Hamiltonian system are not necessarily Hamiltonian symmetries and thus might not correspond to conserved quantities. Furthermore, even if the system has a dynamical symmetry $\phi^X_\lambda$ which is the flow along a Hamiltonian vector field, $X = X_f = \{f, \cdot\}$, but the dynamics $F$ are not Hamiltonian, then the dynamics will not conserve $f$ in general. Both the symmetry and the dynamics must be Hamiltonian for the conservation laws to hold.

This fact is demonstrated by Figure 9, where the dynamics of the (non-Hamiltonian) equivariant LieConv-T(2) model have a T(2) dynamical symmetry with the generators $\partial_x, \partial_y$, which are Hamiltonian vector fields for $f = p_x$ and $f = p_y$, and yet linear momentum is not conserved by the model.
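As a concrete illustration of the conservation statement (a toy example of ours, not code from the paper): for a Hamiltonian that depends only on relative positions, the total momentum is translation-invariant, so its rate of change along the symplectic dynamics $\dot z = J\nabla H$ vanishes. A short autograd sketch:

```python
import torch

def hamiltonian(z):
    # Two unit-mass particles on a line with a spring between them; z = (q1, q2, p1, p2).
    q, p = z[:2], z[2:]
    return 0.5 * (p ** 2).sum() + 0.5 * (q[0] - q[1]) ** 2   # invariant to q -> q + c

def symplectic_grad(z):
    z = z.clone().requires_grad_(True)
    gq, gp = torch.autograd.grad(hamiltonian(z), z)[0].split(2)
    return torch.cat([gp, -gq])           # J grad H: dq/dt = dH/dp, dp/dt = -dH/dq

z = torch.tensor([0.3, -1.0, 0.5, 0.2])
dz = symplectic_grad(z)
total_momentum_rate = dz[2] + dz[3]       # d/dt (p1 + p2) = {H, p1 + p2}
print(float(total_momentum_rate))         # ~0: translation invariance => momentum conserved
```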

Conserving Linear and Angular Momentum

Consider a system of interacting particles described in Euclidean coordinates with positions and momenta $q_{im}, p_{im}$, such as the multi-body spring problem. Here the first index $i = 1, 2, 3$ indexes the spatial coordinates and the second index $m = 1, 2, \dots, N$ indexes the particles. We use the bolded notation $\mathbf{q}_m, \mathbf{p}_m$ to suppress the spatial indices while still indexing the particles $m$, as in Section 6.1.

The total linear momentum along a given direction $n$ is $n\cdot P = \sum_{i,m} n_i p_{im} = n\cdot\big(\sum_m \mathbf{p}_m\big)$. Expanding the Poisson bracket, the Hamiltonian vector field is

$$X_{n\cdot P} = \{n\cdot P, \cdot\} = \sum_{i,m} n_i\frac{\partial}{\partial q_{im}} = \sum_m n\cdot\frac{\partial}{\partial \mathbf{q}_m},$$

which has the flow $\phi^{X_{n\cdot P}}_\lambda(\mathbf{q}_m, \mathbf{p}_m) = (\mathbf{q}_m + \lambda n,\ \mathbf{p}_m)$, a translation of all particles by $\lambda n$. So our model of the Hamiltonian conserves linear momentum if and only if it is invariant to a global translation of all particles (e.g. T(2) invariance for a 2D spring system).

The total angular momentum along a given axis $n$ is

$$n\cdot L = n\cdot\sum_m \mathbf{q}_m\times\mathbf{p}_m = \sum_{i,j,k,m}\epsilon_{ijk}\, n_i q_{jm} p_{km} = \sum_m \mathbf{p}_m^T A\,\mathbf{q}_m,$$

where $\epsilon_{ijk}$ is the Levi-Civita symbol and we have defined the antisymmetric matrix $A$ by $A_{kj} = \sum_i \epsilon_{ijk} n_i$. The corresponding Hamiltonian vector field is

$$X_{n\cdot L} = \{n\cdot L, \cdot\} = \sum_{j,k,m}\Big(A_{kj} q_{jm}\frac{\partial}{\partial q_{km}} - A_{jk} p_{jm}\frac{\partial}{\partial p_{km}}\Big) = \sum_m\Big(\mathbf{q}_m^T A^T\frac{\partial}{\partial \mathbf{q}_m} + \mathbf{p}_m^T A^T\frac{\partial}{\partial \mathbf{p}_m}\Big),$$

where the second equality follows from the antisymmetry of $A$. We can find the flow of $X_{n\cdot L}$ from the differential equations $\dot{\mathbf{q}}_m = A\mathbf{q}_m$, $\dot{\mathbf{p}}_m = A\mathbf{p}_m$, which have the solution

$$\phi^{X_{n\cdot L}}_\theta(\mathbf{q}_m, \mathbf{p}_m) = (e^{\theta A}\mathbf{q}_m,\ e^{\theta A}\mathbf{p}_m) = (R_\theta\mathbf{q}_m,\ R_\theta\mathbf{p}_m),$$

where $R_\theta$ is a rotation about the axis $n$ by the angle $\theta$, which follows from the Rodrigues rotation formula. Therefore, the flow of the Hamiltonian vector field of angular momentum along a given axis is a global rotation of the positions and momenta of all particles about that axis. Again, the dynamics of a neural network modeling a Hamiltonian conserve total angular momentum if and only if the network is invariant to simultaneous rotation of all particle positions and momenta.

[Figure 9: plots of energy and total linear momentum over time t for LieConv-T2, HLieConv-Trivial, HLieConv-T2, and the ground truth.]

Figure 9. Equivariance alone is not sufficient; for conservation we need both to model H and to incorporate the given symmetry. For comparison, LieConv-T(2) is T(2)-equivariant but models F, and HLieConv-Trivial models H but is not T(2)-equivariant. Only HLieConv-T(2) conserves linear momentum.

B. Additional Experiments

B.1. Equivariance Demo

While (7) shows that the convolution estimator is equivariant, we have conducted the ablation study below examining the equivariance of the network empirically. We trained LieConv (Trivial, T(3), SO(3), SE(3)) models on a limited subset of 20k training examples (out of 100k) of the HOMO task on QM9 without any data augmentation. We then evaluate these models on a series of modified test sets where each example has been randomly transformed by an element of the given group. In Table 4 the rows are the models configured with a given group equivariance, and the columns N/G denote no augmentation at training time and transformations from G applied to the test set (the test translations in T(3) and SE(3) are sampled from a normal with standard deviation 0.5).

Model    N/N   N/T(3)   N/SO(3)   N/SE(3)
Trivial  173   183      239       243
T(3)     113   113      133       133
SO(3)    159   238      160       240
SE(3)     62    62       63        62

Table 4. Test MAE (in meV) on the HOMO test set randomly transformed by elements of G. Despite no data augmentation (N), G-equivariant models perform as well on G-transformed test data.

Notably, the performance of the LieConv-G models does not degrade when random G transformations are applied to the test set. Also, in this low-data regime, the added equivariances are especially important.
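The test-set transformations above amount to applying a random group element to each molecule's coordinates while leaving the atom features untouched. A hedged sketch of what such an evaluation transform might look like (function names are ours; the rotation here comes from a Gaussian axis-angle rather than the exact Haar procedure of Appendix C.2):

```python
import numpy as np
from scipy.linalg import expm

def random_se3(translation_std=0.5):
    """A random SE(3) element: a random rotation plus a Gaussian translation
    with the stated standard deviation."""
    w = np.random.randn(3)
    W = np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])
    return expm(W), translation_std * np.random.randn(3)

def transform_coords(coords, R, t):
    """Apply x -> R x + t to an (n_atoms, 3) coordinate array."""
    return coords @ R.T + t
```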
B.2. RotMNIST Comparison

While the RotMNIST dataset consists of 12k rotated MNIST digits, it is standard to separate out 10k to be used for training and 2k for validation. However, in TI-Pooling and E(2)-Steerable CNNs, it appears that after hyperparameters were tuned, the validation set is folded back into the training set to be used as additional training data, a common approach used on other datasets. Although in Table 1 we only use 10k training points, in the table below we report the performance with and without augmentation when trained on the full 12k examples.

Aug    Trivial   T_y    T(2)   SO(2)   SO(2)xR*   SE(2)
SO(2)  1.44      1.35   1.32   1.27    1.13       1.13
None   1.60      2.64   2.34   1.26    1.25       1.15

Table 5. Classification error (%) on the RotMNIST dataset for LieConv with different group equivariances and baselines, with and without SO(2) data augmentation.

C. Implementation Details

C.1. Practical Considerations

While the high-level summary of the lifting procedure (Algorithm 1) and the LieConv layer (Algorithm 2) provides a useful conceptual understanding of our method, there are some additional details that are important for a practical implementation.

1. According to Algorithm 2, $a_{ij}$ is computed in every LieConv layer, which is both highly redundant and costly. In practice, we precompute $a_{ij}$ once after lifting and feed it through the network, with layers operating on the state $\big(\{a_{ij}\}_{i,j}^{N,N}, \{f_i\}_{i=1}^N\big)$ instead of $\{(u_i, q_i, f_i)\}_{i=1}^N$. Doing so requires fixing the group elements that will be used at each layer for a given forwards pass.

2. In practice only $p$ elements of $\mathrm{nbhd}_i$ are sampled (randomly) for computing the Monte Carlo estimator, in order to limit the computational burden (see Appendix A.4).

3. We use the analytic forms for the exponential and logarithm maps of the various groups as described in Eade (2014).

C.2. Sampling from the Haar Measure for Various Groups

When the lifting map from $\mathcal{X} \to G \times \mathcal{X}/G$ is multi-valued, we need to sample elements $u \in G$ that project down to $x$, $uo = x$, in a way consistent with the Haar measure $\mu(\cdot)$. In other words, since the restriction $\mu(\cdot)|_{\mathrm{nbhd}}$ is a distribution, we must sample from the conditional distribution $u \sim \mu(u \mid uo = x)|_{\mathrm{nbhd}}$. In general this can be done by parametrizing the distribution of $\mu$ as a collection of random variables that includes $x$, and then sampling the remaining variables.

In this paper, the groups we use in which the lifting map is multi-valued are SE(2), SO(3), and SE(3). The process is especially straightforward for SE(2) and SE(3), as these groups can be expressed as a semi-direct product of two groups, $G = H \ltimes N$, for which the Haar measure factorizes as

$$d\mu_G(h, n) = \delta(h)\, d\mu_H(h)\, d\mu_N(n), \qquad (21)$$

where $\delta(h) = \frac{d\mu_N(n)}{d\mu_N(hnh^{-1})}$ (Willson, 2009). For $G = SE(d) = SO(d)\ltimes T(d)$, $\delta(h) = 1$ since the Lebesgue measure $d\mu_{T(d)}(x) = d\lambda(x) = dx$ is invariant to rotations. So simply $d\mu_{SE(d)}(R, x) = d\mu_{SO(d)}(R)\, dx$.

So lifts of a point $x \in \mathcal{X}$ to SE(d) consistent with $\mu$ are just $T_x R$, the multiplication of a translation by $x$ and randomly sampled rotations $R \sim \mu_{SO(d)}(\cdot)$. There are multiple easy methods to sample uniformly from SO(d), given in Kuffner (2004); for example, sampling uniformly from SO(3) can be done by sampling a unit quaternion from the 3-sphere and identifying it with the corresponding rotation matrix.
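A short sketch of the quaternion method mentioned above (our illustration): sample a point uniformly on the 3-sphere and convert it to a rotation matrix.

```python
import numpy as np

def sample_so3_haar():
    """Uniform (Haar) sample from SO(3): draw a unit quaternion uniformly on S^3
    and map it to the corresponding rotation matrix."""
    q = np.random.randn(4)
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])

# A lift of a point x to SE(3) consistent with the Haar measure is then the pair
# (R, x) with R = sample_so3_haar().
```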
C.3. Model Architecture

We employ a ResNet-style architecture (He et al., 2016), using bottleneck blocks (Zagoruyko and Komodakis, 2016) and replacing ReLUs with Swish activations (Ramachandran et al., 2017). The convolutional kernel $g_\theta$ internal to each LieConv layer is parametrized by a 3-layer MLP with 32 hidden units, batch norm, and Swish nonlinearities. Not only do the Swish activations improve performance slightly, but unlike ReLUs they are twice differentiable, which is a requirement for backpropagating through the Hamiltonian dynamics. The stack of elementwise linear and bottleneck blocks is followed by a global pooling layer that computes the average over all elements, but not over channels. As in regular image bottleneck blocks, the channels for the convolutional layer in the middle are smaller by a factor of 4 for increased parameter and computational efficiency.

Downsampling: As is traditional for image data, we increase the number of channels and the receptive field at every downsampling step. The downsampling is performed with the farthest point downsampling method described in Appendix A.4. For downsampling by a factor of $s < 1$, the radius of the neighborhood is scaled up by $s^{-1/2}$ and the channels are scaled up by $s^{-1/2}$. When an image is downsampled with $s = (1/2)^2$, as is typical in a CNN, this results in 2x more channels and a radius or dilation of 2x. In the bottleneck block, the downsampling operation is fused with the LieConv layer, so that the convolution is only evaluated at the downsampled query locations. We perform downsampling only on the image datasets, which have more points.

BatchNorm: In order to handle the varied number of group elements per example and within each neighborhood, we use a modified batchnorm that computes statistics only over elements from a given mask. The batch norm is computed per channel, with statistics averaged over the batch size and each of the valid locations.
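A hedged sketch of such a masked batch norm (ours, not the paper's exact implementation; the learnable affine parameters are omitted for brevity): statistics are computed per channel over only the valid, unmasked locations across the batch.

```python
import torch

def masked_batchnorm(x, mask, eps=1e-5):
    """x: (batch, n, channels) features; mask: (batch, n) boolean validity mask.
    Normalizes each channel using the mean/variance over valid locations only."""
    m = mask.unsqueeze(-1).to(x.dtype)                     # (batch, n, 1)
    count = m.sum()
    mean = (x * m).sum(dim=(0, 1)) / count                 # per-channel mean over valid points
    var = (((x - mean) * m) ** 2).sum(dim=(0, 1)) / count  # per-channel variance
    return (x - mean) / torch.sqrt(var + eps) * m          # keep padded entries at zero
```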
C.4. Details for Hamiltonian Models

Model Symmetries: As the position vectors are mean-centered in the model forward pass, $q_i' = q_i - \bar q$, HOGN and HLieConv-SO2* have an additional T(2) invariance, yielding SE(2) invariance for HLieConv-SO2*. We also experimented with an HLieConv-SE2 equivariant model, but found that the exponential map for SE2 (involving Taylor expansions and masking) was not numerically stable enough for second derivatives, which are required for optimizing through the Hamiltonian dynamics. So instead we benchmark the HLieConv-SO2 (without centering) and the HLieConv-SO2* (with centering) models separately. Layer equivariance is preferable for not prematurely discarding useful information and for better modeling performance, but invariance alone is sufficient for the conservation laws. Additionally, since we know a priori that the spring problem has Euclidean coordinates, we need not model the kinetic energy $K(p, m) = \sum_{j=1}^n \|p_j\|^2/m_j$ and can instead focus on modeling the potential $V(q, k)$. We observe that this additional inductive bias of Euclidean coordinates improves model performance. Table 6 shows the invariance and equivariance properties of the relevant models and baselines. For Noether conservation, we need both to model the Hamiltonian and to have the symmetry property.

Model             Models F(z,t)   Models H(z,t)   T(2)   SO(2)
FC                •
OGN               •
HOGN                              •               #
LieConv-T(2)      •                               %
HLieConv-Trivial                  •
HLieConv-T(2)                     •               %
HLieConv-SO(2)                    •                      %
HLieConv-SO(2)*                   •               #      %

Table 6. Model characteristics. Models with layers invariant to G are denoted with #, and those with equivariant layers with %.

Dataset Generation: To generate the spring dynamics datasets, we generated D systems, each with N = 6 particles connected by springs. The system parameters, mass and spring constant, are set by sampling $\{m_1^{(i)}, \dots, m_6^{(i)}, k_1^{(i)}, \dots, k_6^{(i)}\}_{i=1}^{D}$ with $m_j^{(i)}\sim\mathcal{U}(0.1, 3.1)$ and $k_j^{(i)}\sim\mathcal{U}(0, 5)$. Following Sanchez-Gonzalez et al. (2019), we set the spring constants as $k_{ij} = k_i k_j$. For each system $i$, the position and momentum of body $j$ were distributed as $q_j^{(i)}\sim\mathcal{N}(0, 0.16\,I)$, $p_j^{(i)}\sim\mathcal{N}(0, 0.36\,I)$. Using the analytic form of the Hamiltonian for the spring problem, $\mathcal{H}(q, p) = K(p, m) + V(q, k)$, we use the RK4 numerical integration scheme to generate 5-second ground truth trajectories broken up into 500 evaluation timesteps. We use a fixed step-size scheme for RK4 chosen automatically (as implemented in Chen et al. (2018)) with a relative tolerance of 1e-8 in double precision arithmetic. We then randomly selected a single segment from each trajectory, consisting of an initial state $z_t$ and $\tau = 4$ transition states $(z_{t+1}, \dots, z_{t+\tau})$.
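A minimal sketch of the spring-system Hamiltonian described above (our paraphrase of the setup; the exact normalization constants of the kinetic and potential terms are our assumptions, and the paper's actual data generation integrates this with the RK4 scheme of Chen et al. (2018)):

```python
import numpy as np

def spring_hamiltonian(q, p, m, k):
    """q, p: (N, 2) positions/momenta; m, k: (N,) masses and spring parameters.
    H = sum_j |p_j|^2 / (2 m_j) + 0.5 * sum_{i<j} k_i k_j |q_i - q_j|^2."""
    kinetic = 0.5 * (np.sum(p ** 2, axis=1) / m).sum()
    diffs = q[:, None, :] - q[None, :, :]                 # pairwise displacement vectors
    kij = k[:, None] * k[None, :]                          # k_ij = k_i k_j
    potential = 0.25 * (kij * np.sum(diffs ** 2, axis=-1)).sum()  # 0.25: each pair counted twice
    return kinetic + potential

N = 6
m = np.random.uniform(0.1, 3.1, size=N)
k = np.random.uniform(0.0, 5.0, size=N)
q = 0.4 * np.random.randn(N, 2)                            # std 0.4 -> variance 0.16
p = 0.6 * np.random.randn(N, 2)                            # std 0.6 -> variance 0.36
print(spring_hamiltonian(q, p, m, k))
```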

Training: All models were trained in single precision arithmetic (double precision did not make any appreciable difference) with an integrator tolerance of 1e-4. We use a cosine decay for the learning rate schedule and perform early stopping on the validation MSE. We trained with a minibatch size of 200 and for 100 epochs each, using the Adam optimizer (Kingma and Ba, 2014) without batch normalization. With 3k training examples, the HLieConv model takes about 20 minutes to train on one 1080Ti.

For the examination of performance over the range of dataset sizes in Figure 8, we cap the validation set to the size of the training set to make the setting more realistic, and we also scale the number of training epochs up as the size of the dataset shrinks ($\mathrm{epochs} = 100\sqrt{10^3/D}$), which we found to be sufficient to fit the training set. For $D \le 200$ we use the full dataset in each minibatch.

Hyperparameters:

Model        channels   layers   lr
(H)FC        256        4        1e-2
(H)OGN       256        1        1e-2
(H)LieConv   384        4        1e-3

Hyperparameter tuning: Model hyperparameters were tuned by grid search over channel width, number of layers, and learning rate. The models were tuned with training, validation, and test datasets consisting of 3000, 2000, and 2000 trajectory segments respectively.

C.5. Details for Image and Molecular Experiments

RotMNIST Hyperparameters: For RotMNIST we train each model for 500 epochs using the Adam optimizer with learning rate 3e-3 and batch size 25. The first linear layer maps the 1-channel grayscale input to k = 128 channels, and the number of channels in the bottleneck blocks follows the scaling law from Appendix C.3 as the group elements are downsampled. We use 6 bottleneck blocks, and the total downsampling factor S = 1/10 is split geometrically between the blocks as s = (1/10)^{1/6} per block. The initial radius r of the local neighborhoods in the first layer is set so as to include 1/15 of the total number of elements in each neighborhood, and it is scaled accordingly thereafter. The subsampled neighborhood used to compute the Monte Carlo convolution estimator uses p = 25 elements. The models take less than 12 hours to train on a 1080Ti.

QM9 Hyperparameters: For the QM9 molecular data, we use the featurization from Anderson et al. (2019), where the input features f_i are determined by the atom type (C, H, N, O, F) and the atomic charge. The coordinates x_i are simply the raw atomic coordinates measured in angstroms. A separate model is trained for each prediction task, all using the same hyperparameters and early stopping on the validation MAE. We use the same train, validation, test split as Anderson et al. (2019), with 100k molecules for train, 10% for test, and the remainder for validation. As with the other experiments, we use a cosine learning rate decay schedule. Each model is trained using the Adam optimizer for 1000 epochs with a learning rate of 3e-3 and batch size of 100. We use SO(3) data augmentation and 6 bottleneck blocks, each with k = 1536 channels. The radius of the local neighborhood is set to r = ∞ to include all elements. The model takes about 48 hours to train on a single 1080Ti.
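Putting the downsampling scaling law from Appendix C.3 together with the RotMNIST schedule above, a tiny helper (ours, for illustration) computes the per-block subsample fraction and the resulting channel and radius growth:

```python
def downsample_schedule(total_factor=0.1, num_blocks=6, channels0=128, radius0=1.0):
    """Per-block subsample fraction s, with channels and neighborhood radius each
    scaled by s**-0.5 at every block (the scaling law from Appendix C.3)."""
    s = total_factor ** (1.0 / num_blocks)
    schedule, c, r = [], channels0, radius0
    for _ in range(num_blocks):
        c, r = c * s ** -0.5, r * s ** -0.5
        schedule.append((round(c), r))
    return s, schedule
```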

C.6. Local Neighborhood Visualizations

In Figure 10 we visualize the local neighborhoods used with different groups under three different types of transformations: translations, rotations, and scalings. The distance and neighborhood are defined on the tuples of group elements and orbits. For Trivial, T(2), SO(2), and R*×SO(2), the correspondence between points and these tuples is one-to-one, and we can identify the neighborhood in terms of the input points. For SE(2), each point is mapped to multiple tuples, each of which defines its own neighborhood in terms of other tuples. In the figure, for SE(2) we therefore visualize, for a given point, the distribution of points that enter the computation of the convolution at a specific tuple.

(a) Trivial  (b) T(2)  (c) SO(2)

(d) R*×SO(2)  (e) SE(2)

Figure 10. A visualization of the local neighborhood for different groups, in terms of the points in the input space. For the computation of the convolution at the point in red, elements are sampled from the colored region. In each panel, the top row shows translations, the middle row shows rotations, and the bottom row shows scalings of the same image. For SE(2) we visualize the distribution of points entering the computation of the convolution over multiple lift samples. For each of the equivariant models that respects a given symmetry, the points that enter into the computation are not affected by the transformation.