
Probabilistic Permutation Synchronization using the Riemannian Structure of the Birkhoff Polytope

Tolga Birdal¹   Umut Şimşekli²
¹Fakultät für Informatik, Technische Universität München, 85748 München, Germany
²LTCI, Télécom ParisTech, Université Paris-Saclay, 75013 Paris, France

[Figure 1: (a) Initialization; (b) Solution; (c) Certainty/Confidence; (d) Multiple Hypotheses Solution]

Abstract

We present an entirely new geometric and probabilistic approach to synchronization of correspondences across multiple sets of objects or images. In particular, we present two algorithms: (1) Birkhoff-Riemannian L-BFGS for optimizing the relaxed version of the combinatorially intractable cycle consistency loss in a principled manner, (2) Birkhoff-Riemannian Langevin Monte Carlo for generating samples on the Birkhoff Polytope and estimating the confidence of the found solutions. To this end, we first introduce the very recently developed Riemannian geometry of the Birkhoff Polytope. Next, we introduce a new probabilistic synchronization model in the form of a Markov Random Field (MRF). Finally, based on the first order retraction operators, we formulate our problem as simulating a stochastic differential equation and devise new integrators. We show on both synthetic and real datasets that we achieve high quality multi-graph matching results with faster convergence and reliable confidence/uncertainty estimates.

Figure 1. Our algorithm robustly solves the multiway image matching problem (a, b) and provides confidence maps (c) that can be of great help in further improving the estimates (d). The bar on the right is used to assign colors to confidences. For the rest, incorrect matches are marked in red and correct ones in blue.

1. Introduction

Correspondences fuel a large variety of computer vision applications such as structure-from-motion (SfM) [73], SLAM [61], 3D reconstruction [24, 10, 8], camera relocalization [71], image retrieval [52] and 3D scan stitching [45, 26]. In a typical scenario, given two scenes, an initial set of 2D/3D keypoints is first identified. Then the neighborhood of each keypoint is summarized with a local descriptor [55, 27] and keypoints in the given scenes are matched by associating the mutually closest descriptors. In a majority of practical applications, multiple images or 3D shapes are under consideration and ascertaining such two-view or pairwise correspondences is simply not sufficient. This necessitates a further refinement ensuring global consistency. Unfortunately, at this stage even well developed pipelines either acquiesce to heuristic/greedy refinement [25] or incorporate costly geometric cues related to the linking of individual correspondence estimates into a globally coherent whole [35, 73, 84].

In this paper, by using the fact that correspondences are cycle consistent¹, we propose two novel algorithms for refining the assignments across multiple images/scans (nodes) in a multi-way graph and for estimating assignment confidences, respectively. We model the correspondences between image pairs as relative, total permutation matrices and seek to find absolute permutations that re-arrange the detected keypoints to a single canonical, global order. This problem is known as map or permutation synchronization [64, 81]. Even though in many practical scenarios matches are only partially available, when shapes are complete and the density of matches increases, total permutations can suffice [42].

¹Composition of correspondences for any circular path arrives back at the start node.

Similar to many well received works [97, 72], we relax the sought permutations to the set of doubly-stochastic (DS) matrices. We then consider the geometric structure of DS matrices, the Birkhoff Polytope [11]. We are, to the best of our knowledge, for the first time introducing and applying the recently developed Riemannian geometry of the Birkhoff Polytope [30] to tackle challenging problems of computer vision. Note that the lack of this geometric understanding caused plenty of obstacles for scholars dealing with our problem [72, 88]. By virtue of a first order retraction, we can use the recent Riemannian limited-memory BFGS (LR-BFGS) algorithm [95] to perform a maximum-a-posteriori (MAP) estimation of the parameters of the consistency loss. We coin our variation Birkhoff-LRBFGS.
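The cycle consistency of footnote 1 is easy to verify numerically for total permutations: relative maps Pij = Xi Xj^T composed along any closed path return the identity. A minimal NumPy sketch (the helper name and the example permutations are ours, purely illustrative):

```python
import numpy as np

def perm_matrix(sigma):
    """n x n permutation matrix P with P[i, j] = 1 iff i maps to j = sigma[i]."""
    n = len(sigma)
    P = np.zeros((n, n))
    P[np.arange(n), sigma] = 1.0
    return P

# Hypothetical absolute permutations of three nodes (images).
X1 = perm_matrix([0, 1, 2, 3])
X2 = perm_matrix([2, 0, 3, 1])
X3 = perm_matrix([1, 3, 0, 2])

# Relative permutations P_ij = X_i X_j^T ...
P12, P23, P31 = X1 @ X2.T, X2 @ X3.T, X3 @ X1.T

# ... compose to the identity along the closed path 1 -> 2 -> 3 -> 1.
assert np.allclose(P12 @ P23 @ P31, np.eye(4))
```

Noisy pairwise matches break this identity, and that violation is exactly what a cycle consistency loss penalizes.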

At the next stage, we take on the challenge of confidence/uncertainty estimation for the problem at hand by drawing samples on the Birkhoff Polytope and estimating the empirical posterior distribution. To achieve this, we first formulate a new geodesic stochastic differential equation (SDE). Our SDE is based upon the Riemannian Langevin Monte Carlo (RLMC) [36, 91, 66] that is efficient and effective in sampling from Riemannian manifolds with true exponential maps. Note that similar stochastic geodesic MCMC (SG-MCMC) [54, 15] tools have already been used in the context of synchronization of spatial rigid transformations whose parameters admit an analytically defined geodesic flow [9]. Unfortunately, for our manifold the retraction map is only up to first order and hence we cannot use off-the-shelf schemes. Alleviating this nuisance, we further contribute a novel numerical integrator to solve our SDE by replacing the intractable exponential map of DS matrices by the approximate retraction map. This leads to another new algorithm: Birkhoff-RLMC.

In a nutshell, our contributions are:
1. We act as an ambassador and introduce the Riemannian geometry of the Birkhoff Polytope [30] to solve problems in computer vision.
2. We propose a new probabilistic model for the permutation synchronization problem.
3. We minimize the cycle consistency loss via a Riemannian-LBFGS algorithm and outperform the state-of-the-art both in recall and in runtime.
4. Based upon the Langevin mechanics, we introduce a new SDE and a numerical integrator to draw samples on high dimensional and complex manifolds with approximate retractions, such as the Birkhoff Polytope. This lets us estimate the confidence maps, which can aid in improving the solutions and in spotting consistency violations or outliers.

Note that the tools developed herewith can easily extend beyond our application and would hopefully facilitate promising research directions regarding combinatorial optimization problems in computer vision.

2. Related Work

Permutation synchronization is an emerging domain of study due to its wide applicability, especially for the problems in computer vision. We now review the developments in this field, as chronologically as possible. Note that multi-way graph matching problem formulations involving spatial geometry are well studied [22, 58, 51, 33, 92, 23], as is transformation synchronization [87, 17, 83, 86, 5, 6, 38]. For brevity, we omit this literature and focus on works that explicitly operate on correspondence matrices.

The first applications of synchronization, a term coined by Singer [78, 77], to correspondences only date back to the early 2010s [62]. Pachauri et al. [64] gave a formal definition and devised a spectral technique. The same authors quickly extended their work to Permutation Diffusion Maps [63], finding correspondence between images. Unfortunately, this first method was quadratic in the number of images and hence was not computationally friendly. In a sequel of works called MatchLift, Huang, Chen and Guibas [42, 19] were the first to cast the problem of estimating cycle-consistent maps as finding the closest positive semidefinite matrix to an input matrix. They also addressed the case of partial permutations. Due to the semidefinite programming (SDP) involved, this perspective suffered from high computational cost in real applications. Similar to Pachauri [64], for N images and M edges, this method required computing an eigendecomposition of an NM × NM matrix. Zhou et al. [98] then introduced MatchALS, a new low-rank formulation with nuclear-norm relaxation, globally solving the joint matching of a set of images without the need of SDP. Yu et al. [94] formulated a synchronization energy similar to our method and proposed proximal Gauss-Seidel methods for solving a relaxed problem. However, unlike us, this paper did not use the geometry of the constraints or variables and thereby resorted to complicated optimization procedures involving Frank-Wolfe subproblems for global constraint satisfaction. Arrigoni et al. [4] and Maset et al. [4] extended Pachauri [64] to operate on partial permutations using spectral decomposition. To do so, they considered the symmetric inverse semigroup of the partial matches, which is typically hard to handle. Their closed form methods did not need initialization steps to synchronize, but also did not establish an explicit cycle consistency. Tang et al. [82] opted to use ordering heuristics, improving upon Pachauri [64]. Cosmo et al. [23] brought an interesting solution to the problem of estimating consistent correspondences between multiple 3D shapes, without requiring initial pairwise solutions as input. Schiavinato and Torsello [72] tried to overcome the lack of group structure of the Birkhoff polytope by transforming any graph-matching problem into a multi-graph matching one. Bernard et al. [7] used an NMF-based approach to generate a cycle-consistent synchronization. Park and Yoon [65] used a multi-layer random walks framework to address the global correspondence search problem of multi-attributed graphs. Starting from a multi-layer random-walks initialization, the authors proposed a robust solver by iterative reweighting. Hu et al. [41] revisited MatchLift and developed a scalable, distributed solution with the help of ADMMs, called DMatch. Their idea of splitting the input into sub-collections can still lead to global consistency under mild conditions while improving the efficiency. Finally, Wang et al. [88] made use of the domain knowledge and added the geometric consistency of image coordinates as a low-rank term to increase the recall.
The aforementioned works have neither considered the Riemannian structure of the common Birkhoff convex relaxation nor have they provided a probabilistic framework, which can pave the way to uncertainty estimation while simultaneously solving the optimization problem. This is what we propose in this work.

[Figure 2: Probability Simplex ∆n; Orthogonal Hypersphere O: S^(n−1); Birkhoff Polytope DPn; Permutation Matrices P = DPn ∩ O; Tangent Space T_X DPn; Common center of mass]
Figure 2. Simplified (matrices are vectorized) illustration of the geometries we consider: (i) ∆n is convex, (ii) DPn is strictly contained in ∆n. In low dimensions, such a configuration cannot exist as there is no convex shape that touches ∆ only on the corners.

3. Preliminaries and Technical Background

Definition 1 (Permutation Matrix). A permutation matrix is defined as a sparse, square binary matrix, where each column and each row contains only a single true (1) value:

Pn := {P ∈ {0,1}^(n×n) : P 1n = 1n, 1n^T P = 1n^T},   (1)

where 1n denotes an n-dimensional ones vector. Every P ∈ Pn is a total permutation matrix and Pij = 1 implies that element i is mapped to element j. Permutation matrices are the only strictly non-negative elements of the orthogonal group: Pn ⊂ On = {O : O^T O = I}, a special case of the Stiefel manifold of m-frames in R^n when m = n.

Definition 2 (Center of Mass). The center of mass of all the permutations on n objects is defined in R^(n×n) as [67]:

Cn = (1/n!) Σ_{Pi ∈ Pn} Pi = (1/n!) (n−1)! 1n 1n^T = (1/n) 1n 1n^T.   (2)

Notice that Cn ∉ Pn, as shown in Fig. 2.

Definition 3 (Relative Permutation). We define a permutation matrix to be relative if it is the ratio (or difference) of two group elements (i → j): Pij = Pi Pj^T.

Definition 4 (Permutation Synchronization Problem). Given a redundant set of measures of ratios {Pij} : (i, j) ∈ E ⊂ {1, ..., N} × {1, ..., N}, where E denotes the set of the edges of a directed graph of N nodes, permutation synchronization [64] can be formulated as the problem of recovering {Pi} for i = 1, ..., N such that the consistency constraint is satisfied: Pij = Pi Pj^(−1).

If the input data is noise-corrupted, this consistency will not hold and, to recover the absolute permutations {Pi}, some form of a consistency error is minimized. Typically, any form of minimization on the discrete space of permutations is intractable and these matrices are relaxed by their doubly-stochastic counterparts [14, 97, 72] (see Fig. 2).

Definition 5 (Doubly Stochastic (DS) Matrix). A DS matrix is a non-negative, square matrix whose rows and columns sum to 1. The set of DS matrices is defined as:

DPn = {X ∈ R+^(n×n) : Σ_{i=1..n} xij = 1 ∧ Σ_{j=1..n} xij = 1}.   (3)

Theorem 1 (Birkhoff-von Neumann Theorem). The convex hull of the set of all permutation matrices is the set of doubly-stochastic matrices, and there exists a potentially non-unique θ such that any DS matrix can be expressed as a linear combination of k permutation matrices [11, 46]:

X = θ1 P1 + ··· + θk Pk,   θi > 0 ∧ θ^T 1k = 1.   (4)

While finding the minimum k is shown to be NP-hard [31], by the Marcus-Ree theorem we know that there exists one constructible decomposition where k < (n − 1)² + 1.

Definition 6 (Birkhoff Polytope). The multinomial manifold of DS matrices is incident to the convex object called the Birkhoff Polytope [11], an (n − 1)²-dimensional convex submanifold of the ambient R^(n×n) with n! vertices: Bn ≡ DPn. We use DPn to refer to the Birkhoff Polytope.

It is interesting to see that this convex polytope is co-centered with Pn at Cn, Cn ∈ DPn, and over-parameterizes the convex hull of the permutation vectors, the permutahedron [37]. Pn can now be considered as an orthogonal subset of DPn: Pn = {X ∈ DPn : X X^T = I}, i.e. the discrete set of permutation matrices is the intersection of the convex set of DS matrices and the orthogonal group On.

3.1. Riemannian Geometry of the Birkhoff Polytope

Recently, Douik et al. [30] endowed DPn with the Fisher information metric, resulting in the Riemannian manifold of DPn. To the best of our knowledge, we are the first to exploit this manifold in the domain of computer vision, and hence will now recall the main results of Douik et al. [30] and summarize the main constructs of Riemannian optimization on DPn. The proofs can be found in [30].

Definition 7 (Tangent Space and Bundle). The tangent bundle is referred to as the union of all tangent spaces T DPn = ∪_{X ∈ DPn} T_X DPn, one of which is defined as:

T_X DPn := {Z ∈ R^(n×n) : Z 1n = 0n, Z^T 1n = 0n}.   (5)

Theorem 2. The projection operator Π_X(Y), Y ∈ DPn, onto the tangent space T_X DPn of X ∈ DPn is written as:

Π_X(Y) = Y − (α 1^T + 1 β^T) ⊙ X,  with
α = (I − X X^T)^+ (Y − X Y^T) 1,  β = Y^T 1 − X^T α,   (6)

where (·)^+ depicts the left pseudo-inverse and ⊙ the Hadamard product. Note that there exists a numerically more stable way to compute the same concise formulation of Π_X(Y) [30].

Theorem 3. For a vector ξ_X ∈ T_X DPn lying on the tangent space of X ∈ DPn, the first order retraction map R_X is given as follows:

R_X(ξ_X) = Π(X ⊙ exp(ξ_X ⊘ X)),   (7)

where the operator Π denotes the projection onto DPn, efficiently computed using the Sinkhorn algorithm [79], and ⊘ is the Hadamard division.

Plis et al. [67] showed that on the n-dimensional Birkhoff Polytope all permutations are equidistant from the center of mass Cn, and thus the extreme points of DPn, that are the permutation matrices, are located on an (n − 1)²-dimensional hypersphere S^((n−1)²) of radius √(n − 1), centered at Cn. This hypersphere is incident to the Birkhoff Polytope on the vertices.
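To make the geometry concrete, here is a small sketch of the first-order retraction of Thm. 3, with the projection Π implemented by plain Sinkhorn balancing; the tangent vector is fabricated by simple double centering, and all helper names are our own illustrative choices, not from [30]:

```python
import numpy as np

def sinkhorn(A, n_iter=200):
    """Project a positive matrix onto the Birkhoff polytope by alternating
    row and column normalization (Sinkhorn-Knopp)."""
    X = A.copy()
    for _ in range(n_iter):
        X = X / X.sum(axis=1, keepdims=True)   # rows sum to 1
        X = X / X.sum(axis=0, keepdims=True)   # columns sum to 1
    return X

def retract(X, xi):
    """First-order retraction of Thm. 3: R_X(xi) = Pi(X * exp(xi / X)),
    with Pi the Sinkhorn projection and *, / taken elementwise."""
    return sinkhorn(X * np.exp(xi / X))

np.random.seed(0)
n = 4
X = np.full((n, n), 1.0 / n)              # start at the center of mass C_n
xi = 0.1 * np.random.randn(n, n)
xi -= xi.mean(axis=0, keepdims=True)      # zero column sums ...
xi -= xi.mean(axis=1, keepdims=True)      # ... and zero row sums (tangent)
Y = retract(X, xi)
assert np.allclose(Y.sum(axis=0), 1.0, atol=1e-6)
assert np.allclose(Y.sum(axis=1), 1.0, atol=1e-6)
assert (Y > 0).all()
```

A production implementation would instead use the numerically stabilized operators referenced in [30].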

Proposition 1. The gap, as a ratio, between DPn and both S^((n−1)²) and On grows to infinity as n grows.

The proof is given in the supplementary document. While there exist polynomial time projections of the n!-element permutation space onto the continuous hypersphere representation and back [67], Prop. 1 prevents us from using hypersphere relaxations, as done in preceding works [67, 96].

4. Proposed Probabilistic Model

We assume that we are provided a set of pairwise, total permutations Pij ∈ Pn for (i, j) ∈ E and we are interested in finding the underlying absolute permutations Xi for i ∈ {1, ..., N} with respect to a common origin (e.g. X1 = I, the identity matrix). We seek absolute permutations that would respect the consistency of the underlying graph structure. For conciseness, we also restrict our setting to total permutations, and leave the extension to partial permutations, which live on the monoid, as a future study. Because operating directly on Pn would require us to solve a combinatorial optimization problem, and because of the lack of a manifold structure for Pn, we follow the popular approach [53, 93, 56] and relax the domain of the absolute permutations by assuming that each Xi ∈ DPn.

We formulate the permutation synchronization problem in a probabilistic context where we treat the pairwise relative permutations as observed random variables and the absolute ones as latent random variables. In particular, our probabilistic construction enables us to cast the synchronization problem as inference in the model. With a slight abuse of notation, in the rest of the paper we will denote P ≡ {Pij}_{(i,j)∈E} and X ≡ {Xi}_{i=1..N}, all the observations and all the latent variables, respectively.

A typical way to build a probabilistic model is to first choose the prior distributions on DPn for each Xi and then choose a conditional distribution on Pn for each Pij given the latent variables. Unfortunately, standard parametric distributions exist neither on DPn nor on Pn. The variational stick breaking [53] yields an implicitly defined PDF on DPn and is not able to provide direct control on the resulting distribution. Defining Kantorovich distance-based distributions over the permutation matrices is possible [21], yet these models incur high computational costs since they would require solving optimal transport problems during inference. For these reasons, instead of constructing a hierarchical probabilistic model, we will directly model the full joint distribution of P and X.

We propose a probabilistic model where we assume the full joint distribution admits the following factorized form:

p(P, X) = (1/Z) Π_{(i,j)∈E} ψ(Pij, Xi, Xj),   (8)

where Z denotes the normalization constant with

Z := Σ_{P ∈ Pn^|E|} ∫_{DPn^N} Π_{(i,j)∈E} ψ(Pij, Xi, Xj) dX,   (9)

and ψ is called the 'clique potential' that is defined as:

ψ(Pij, Xi, Xj) := exp(−β ‖Pij − Xi Xj^T‖F²).   (10)

Here ‖·‖F denotes the Frobenius norm and β ∈ R+ is the dispersion parameter that controls the spread of the distribution. Note that the model is a Markov random field [48].
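The model is easy to evaluate numerically; a sketch of the per-edge energy that the clique potential of Eq. 10 exponentiates (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def potential(Ps, Xs, edges):
    """U(X) = sum over (i,j) in E of ||P_ij - X_i X_j^T||_F^2; the clique
    potential of Eq. 10 is then psi = exp(-beta * per-edge term)."""
    return sum(np.linalg.norm(Ps[(i, j)] - Xs[i] @ Xs[j].T, 'fro') ** 2
               for (i, j) in edges)

I = np.eye(3)
S = I[[1, 0, 2]]                       # a swap permutation
Xs = {0: I, 1: I}
assert np.isclose(potential({(0, 1): I}, Xs, [(0, 1)]), 0.0)  # consistent
assert np.isclose(potential({(0, 1): S}, Xs, [(0, 1)]), 4.0)  # 2 rows differ
```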

Let us take a closer look at the proposed model. If we define Xij := Xi Xj^T ∈ DPn, then by Thm. 1 we have the following decomposition for each Xij:

Xij = Σ_{b=1..Bij} θij,b Mij,b,   Σ_{b=1..Bij} θij,b = 1,   (11)

where Bij is a positive integer, each θij,b ≥ 0, and Mij,b ∈ Pn. The next result states that we have an equivalent hierarchical interpretation for the proposed model:

Proposition 2. The probabilistic model defined in Eq. 8 implies the following hierarchical decomposition:

p(X) = (1/C) exp(−β Σ_{(i,j)∈E} ‖Xij‖F²) Π_{(i,j)∈E} Zij,   (12)

p(Pij | Xi, Xj) = (1/Zij) exp(2β tr(Pij^T Xij)),   (13)

where C and Zij are normalization constants. Besides, for all i, j, Zij ≥ Π_{b=1..Bij} f(β, θij,b), where f is a positive function that is increasing in both β and θij,b.

The proof is given in the supplementary and is based on the simple decomposition p(P, X) = p(X) p(P|X). This hierarchical point of view lets us observe some interesting properties: (1) the likelihood p(Pij | Xi, Xj) mainly depends on the term tr(Pij^T Xij) that measures the data fitness. We aptly call this term the 'soft Hamming distance' between Pij and Xij since it would correspond to the actual Hamming distance between two permutations if Xi, Xj were permutation matrices [49]. (2) On the other hand, the prior distribution contains two competing terms: (i) the term Zij favors large θij,b, which would push Xij towards the corners of the Birkhoff polytope, (ii) the term ‖Xij‖F² acts as a regularizer on the latent variables and attracts them towards the center of the Birkhoff polytope Cn (cf. Dfn. 2), which will be numerically beneficial for the inference algorithms that will be developed in the following section.

5. Inference Algorithms

We can now formulate the permutation synchronization problem as a probabilistic inference problem, where we will be interested in the following quantities:

1. Maximum a-posteriori (MAP):

X* = argmax_{X ∈ DPn^N} log p(X|P),   (14)

where log p(X|P) =⁺ −β Σ_{(i,j)∈E} ‖Pij − Xi Xj^T‖F², and =⁺ denotes equality up to an additive constant.

2. The full posterior distribution: p(X|P) ∝ p(P, X).

The MAP estimate is often easier to obtain and useful in practice. On the other hand, characterizing the full posterior can provide important additional information, such as uncertainty; however, not surprisingly, it is a much harder task. In addition to the usual difficulties associated with these tasks, in our context we are facing extra challenges due to the non-standard manifold of our latent variables.

5.1. Maximum A-Posteriori Estimation

The MAP estimation problem can be cast as a minimization problem on DPn, given as follows:

X* = argmin_{X ∈ DPn^N} { U(X) := Σ_{(i,j)∈E} ‖Pij − Xi Xj^T‖F² },

where U is called the potential energy function. We observe that the choice of the dispersion parameter has no effect on the MAP estimate. Although this optimization problem resembles conventional norm minimization, the fact that X lives in the cartesian product of Birkhoff polytopes renders the problem very complicated.

Thanks to the retraction operator over the Birkhoff polytope (cf. Thm. 3), we are able to use several Riemannian optimization algorithms [80], without resorting to projection-based updates. In this study, we use the recently proposed Riemannian limited-memory BFGS (LR-BFGS) [44], a powerful optimization technique that attains faster convergence rates by incorporating local geometric information in an efficient manner. This additional piece of information is obtained through an approximation of the inverse Hessian, which is computed on the most recent values of the past iterates with linear time- and space-complexity in the dimension of the problem. We give more detail on LR-BFGS in our supp. material. The detailed description of the algorithm can be found in [44, 95].

Finally, we round the resulting approximate solutions into feasible ones via the Hungarian algorithm [60], obtaining binary permutation matrices.

5.2. Posterior Sampling via Riemannian Langevin Monte Carlo with Retractions

In this section we will develop a Markov Chain Monte Carlo (MCMC) algorithm for generating samples from the posterior distribution p(X|P), by borrowing ideas from [75, 54, 9]. Once such samples are generated, we will be able to quantify the uncertainty in our estimation by using the generated samples. The dimension and complexity of the Birkhoff manifold make it very challenging to generate samples on DPn or its product spaces, and to the best of our knowledge there is no Riemannian MCMC algorithm that is capable of achieving this. There are existing Riemannian MCMC algorithms [15, 54], which are able to draw samples on embedded manifolds; however, they require the exact exponential map to be analytically available, which in our case can only be approximated by the retraction map at best.

To this end, we develop an algorithmically simpler yet effective algorithm. Let the posterior density of interest be π_H(X) := p(X|P) ∝ exp(−βU(X)) with respect to the Hausdorff measure. We then define an embedding ξ : R^(N(n−1)²) ↦ DPn^N such that ξ(X̃) = X for X̃ ∈ R^(N(n−1)²). By the area formula (cf. Thm. 1 in [29]), we have the following expression for the embedded posterior density π_λ (with respect to the Lebesgue measure):

π_H(X̃) = π_λ(X̃) / √|G(X̃)|,   (15)

where G denotes the Riemannian metric tensor. We then consider the following stochastic differential equation (SDE), which is a slight modification of the SDE that is used to develop the Riemannian Langevin Monte Carlo algorithm [36, 91, 66]:

dX̃t = (−G^(−1) ∇_X̃ U_λ(X̃t) + Γt) dt + √(2/β G^(−1)) dBt,

where Bt denotes the standard Brownian motion and Γt is called the correction term that is defined as follows: [Γt(X̃)]_i = Σ_{j=1..N(n−1)²} ∂[G^(−1)(X̃)]_ij / ∂X̃_j.
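For intuition, a single Euler-type step of such a simulation can be sketched as follows, with the metric and correction term dropped (G ≈ I, Γt ≈ 0), a Euclidean double centering standing in for the tangent-space projection, and Sinkhorn standing in for Π; this is a toy surrogate, not the paper's exact integrator:

```python
import numpy as np

def sinkhorn(A, n_iter=100):
    for _ in range(n_iter):
        A = A / A.sum(1, keepdims=True)
        A = A / A.sum(0, keepdims=True)
    return A

def tangent(Z):
    # Euclidean double centering: rows and columns of the result sum to 0.
    Z = Z - Z.mean(0, keepdims=True)
    return Z - Z.mean(1, keepdims=True)

def langevin_step(P, Xi, Xj, h=1e-3, beta=1.0, rng=np.random):
    """One noisy descent step on ||P - Xi Xj^T||_F^2 for node i, followed
    by the first-order retraction back onto the Birkhoff polytope."""
    grad = -2.0 * (P - Xi @ Xj.T) @ Xj                     # dU/dXi
    V = tangent(-h * grad + np.sqrt(2 * h / beta) * rng.randn(*Xi.shape))
    return sinkhorn(Xi * np.exp(V / Xi))                   # retraction

rng = np.random.RandomState(0)
n = 4
P = np.eye(n)                                  # hypothetical observed P_ij
Xi = Xj = np.full((n, n), 1.0 / n)
X_new = langevin_step(P, Xi, Xj, rng=rng)
assert np.allclose(X_new.sum(0), 1.0, atol=1e-6)
assert np.allclose(X_new.sum(1), 1.0, atol=1e-6)
```

Iterating such steps over all nodes yields correlated samples whose spread shrinks as beta grows, mirroring the role of the dispersion parameter in the model.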
By Thm. 1 of [57], it is easy to show that the solution process (X̃t)_{t≥0} leaves the embedded posterior distribution π_λ invariant. Informally, this result means that if we could exactly simulate the continuous-time process (X̃t)_{t≥0}, the distribution of the sample paths would converge to the embedded posterior distribution π_λ, and therefore the distribution of ξ(X̃t) would converge to π_H(X). Unfortunately, however, it is not possible to exactly simulate these paths and therefore we need to consult approximate algorithms.

A possible way to numerically simulate the SDE would be to use standard discretization tools, such as the Euler-Maruyama integrator [18]. However, this would require knowing the analytical expression of ξ and constructing Gt and Γt at each iteration. On the other hand, recent results have shown that we can simulate SDEs directly on their original manifolds by using geodesic integrators [15, 54, 39], which bypasses these issues altogether. Yet, these approaches require the exact exponential map of the manifold to be analytically available, restricting their applicability in our context.

Inspired by the recent manifold optimization algorithms [85], we propose to replace the exact, intractable exponential map arising in the geodesic integrator with the tractable retraction operator given in Thm. 3. We develop our recursive scheme, which we coin the retraction Euler integrator:

Vi^(k+1) = Π_{Xi^(k)}(−h ∇_{Xi} U(Xi^(k)) + √(2h/β) Zi^(k+1)),   (16)
Xi^(k+1) = R_{Xi^(k)}(Vi^(k+1)),   ∀i ∈ {1, ..., N},   (17)

where h > 0 denotes the step-size, k denotes the iterations, Zi^(k) denotes standard Gaussian random variables in R^(n×n), and Xi^(0) denotes the initial absolute permutations. The derivation of this scheme is similar to [54] and we provide more detailed information in the supplementary material. To the best of our knowledge, the convergence properties of the geodesic integrator that is approximated with a retraction operator have not yet been analyzed. We leave this analysis as a future work, which is beyond the scope of this study.

We note that the term ‖Xij‖F² plays an important role in the overall algorithm since it prevents the latent variables Xi from going to the extreme points of the Birkhoff polytope, where the retraction operator becomes inaccurate. We also note that, when β → ∞, the distribution π_H concentrates on the global optimum X* and the proposed retraction Euler integrator becomes Riemannian gradient descent with a retraction operator.

6. Experiments and Evaluations

6.1. Real Data

Figure 3. Sample images and manually annotated correspondences from the challenging Willow dataset [20]. Images are plotted in pairs (there are multiple) and in gray for better viewing.

2D Multi-image Matching. We run our method to perform multiway graph matching on two datasets, CMU [16] and Willow Object Class [20]. CMU is composed of House and Hotel objects viewed under constant illumination and smooth motion. Initial pairwise correspondences as well as ground truth (GT) absolute mappings are provided within the dataset. Object images in the Willow dataset include pose, lighting, instance and environment variation, as shown in Fig. 3, rendering naive template matching infeasible. For our evaluations, we follow the same design as Wang et al. [88]. We first extract local features from a set of 227 × 227 patches centered around the annotated landmarks, using the popular AlexNet [50] pretrained on ImageNet [28]. Our descriptors correspond to the feature map responses of the Conv4 and Conv5 layers anchored on the hand annotated keypoints. These features are then matched by the Hungarian algorithm [60] to obtain initial pairwise permutation matrices P0.

We initialize our algorithm by the closed form MatchEIG [59] and evaluate it against the state of the art methods of Spectral [64], MatchALS [98], MatchLift [42], MatchEIG [59], and Wang et al. [88]. The size of the universe is set to the number of features per image. We assume that this number is fixed and partial matches are not present. Handling partialities while using the Birkhoff structure is left as future work. Note that [88] uses a similar cost function to ours in order to initialize an alternating procedure that in addition exploits the geometry of image coordinates. The authors also use this term as an extra bit of information during their initialization. The standard evaluation metric, recall, is defined over the pairwise permutations as:

R({P̂i} | P^gnd) = (1/(n|E|)) Σ_{(i,j)∈E} ⟨Pij^gnd, P̂i P̂j^T⟩,   (18)

where Pij^gnd are the GT relative transformations and P̂i is an estimated permutation. R = 0 in the case of no correctly found correspondences and R = 1 for a perfect solution.

Tab. 1 shows the results of the different algorithms as well as ours. Note that our Birkhoff-LRBFGS method, which operates solely on pairwise permutations, outperforms all methods, even the ones which make use of geometry during initialization. Moreover, when our method is used to initialize Wang et al. [88] and perform geometric optimization, we attain the top results. These findings validate that walking on the Birkhoff Polytope, even approximately, and using Riemannian line-search algorithms constitute a promising direction for optimizing the problem at hand.

Table 1. Our results on the WILLOW Object Class graph matching dataset. Wang− refers to running Wang [88] without the geometric consistency term. The vanilla version of our method, Ours, already lacks this term. Ours-Geom then refers to initializing Wang's verification method with our algorithm. For all the methods, we use the original implementation of the authors.

Dataset     Initial  Spectral [64]  MatchLift [42]  MatchALS [98]  MatchEIG [59]  Wang− [88]  Ours   Wang [88]  Ours-Geom
Car         0.48     0.55           0.65            0.69           0.66           0.72        0.71   1.00       1.00
Duck        0.43     0.59           0.56            0.59           0.56           0.63        0.67   0.93²      0.96
Face        0.86     0.92           0.94            0.93           0.93           0.95        0.95   1.00       1.00
Motorbike   0.30     0.25           0.27            0.34           0.28           0.40        0.37   1.00       1.00
Winebottle  0.52     0.64           0.72            0.70           0.71           0.73        0.73   1.00       1.00
CMU-House   0.68     0.90           0.94            0.92           0.94           0.98        0.98   1.00       1.00
CMU-Hotel   0.64     0.81           0.87            0.86           0.92           0.94        0.96   1.00       1.00
Average     0.52     0.59           0.65            0.66           0.71           0.76        0.77   0.99       0.99
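The recall of Eq. 18 reduces to counting agreeing assignments; a minimal sketch (helper names are ours):

```python
import numpy as np

def recall(P_gnd, X_hat, edges):
    """Eq. 18: fraction of ground-truth correspondences recovered, averaged
    over all edges; 0 means no correct matches, 1 a perfect solution."""
    n = next(iter(X_hat.values())).shape[0]
    hits = sum(np.sum(P_gnd[(i, j)] * (X_hat[i] @ X_hat[j].T))
               for (i, j) in edges)
    return hits / (n * len(edges))

I = np.eye(4)
S = I[[1, 0, 2, 3]]                         # swap the first two keypoints
edges = [(0, 1)]
assert np.isclose(recall({(0, 1): I}, {0: I, 1: I}, edges), 1.0)
assert np.isclose(recall({(0, 1): S}, {0: I, 1: I}, edges), 0.5)
```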

To do that, we first find the optimal point where synchronization is at its best. Then, we set h ← 0.0001, β ∈ [0.075, 0.1] and automatically start sampling the posterior around this mode for 1000 iterations. Note that β is a critical parameter which can also be dynamically controlled [9]. Larger values of β cannot provide enough variation for a good diversity of solutions. Smaller values cause greater random perturbations, leading to samples far from the optimum, or to samples not explaining the local mode. Nevertheless, all our tests worked well with values in the given range.

The generated samples are useful in many applications, e.g. fitting distributions or providing additional solution hints. We address the case of multiple-hypothesis generation for the permutation synchronization problem and show that generating an additional per-edge candidate with high certainty helps to improve the recall. Tab. 2 shows the top-K scores we achieve by simply incorporating K likely samples. Note that when 2 matches are drawn at random and contribute as potentially correct matches, the recall is increased only by 2%, whereas including our samples instead boosts the multi-way matching by 6%.

Table 2. Using top-K errors to rank by uncertainty. Based on the confidence information we could retain multiple hypotheses. This is not possible with other approaches such as Wang et al. [59, 88]. Rand-K refers to using K − 1 additional random hypotheses to complement the found solution. Ours-K ranks assignments by our probabilistic certainty and retains the top-K candidates per point.

Dataset    | Wang | Ours | Rand-2 | Ours-2 | Rand-3 | Ours-3
Car        | 0.72 | 0.71 | 0.73 | 0.76 | 0.76 | 0.81
Duck       | 0.63 | 0.67 | 0.67 | 0.69 | 0.67 | 0.72
Face       | 0.95 | 0.95 | 0.96 | 0.97 | 0.96 | 0.98
Motorbike  | 0.40 | 0.37 | 0.45 | 0.49 | 0.52 | 0.60
Winebottle | 0.73 | 0.74 | 0.77 | 0.82 | 0.79 | 0.85
Avg.       | 0.69 | 0.69 | 0.71 | 0.75 | 0.74 | 0.79

We further present illustrative results for our confidence prediction in Fig. 4. There, unsatisfactory solutions arising in certain cases are improved by analyzing the uncertainty map. Column (e) of the figure depicts the top-2 assignments retained in the confidence map and (f) plots the assignments that overlap with the true solution. Note that we might not have access to such an oracle in real applications and only show this to illustrate potential use cases of the estimated confidence map.

6.2. Evaluations on Synthetic Data

We synthetically generate 28 different problems with varying sizes: M ∈ [10, 100] nodes and n ∈ [16, 100] points in each node. For the scenario of image matching, this would correspond to M cameras and n features in each image. We then introduce 15%−35% random swaps to the ground-truth absolute permutations and compute the observed relative ones. Details of this dataset are given in the supplementary material. Among all 28 sets of synthetic data, we attain an overall recall of 91% whereas MatchEIG [59] remains at about 83%.

Runtime Analysis. Next, we assess the computational cost of our algorithm against the state-of-the-art methods on the dataset explained above. All of our experiments are run on a MacBook computer with an Intel i7 2.8 GHz CPU. Our implementation uses a modified Ceres solver [2]. All the other algorithms use highly vectorized MATLAB 2017b code, making our comparisons reasonably fair. Fig. 5 tabulates runtimes for different methods excluding initialization. MatchLift easily took more than 20 min. for moderate problems and hence we choose to exclude it from this evaluation. It is noticeable that, thanks to the ability of using more advanced solvers such as L-BFGS, our method converges much faster than Wang et al. and runs on par with the fastest yet least accurate spectral synchronization [64]. The worst-case theoretical computational complexity of our algorithm is O_{B-LRBFGS} := O(K|E|K_S(n² + (2n)³)), where K is the number of L-BFGS iterations and K_S the number of Sinkhorn iterations. While K_S can be a bottleneck, in practice our matrices are already restricted to the Birkhoff manifold and Sinkhorn early-terminates, letting K_S remain small.

²[88] reports a value of 0.88, but for their method we attained 0.93 and therefore report this value.

(a) Initialization (b) Solution (c) Confusion (d) Certainty Map (e) Top-2 Confusion (f) Top-2 Solution with Uncertainty

Figure 4. Results from our confidence estimation. Given potentially erroneous solutions (b) to the problems initialized as in (a), our latent samples discover the uncertain assignments as shown in the middle three columns (c-e). When multiple top-2 solutions are accepted as potential positives, our method can suggest high-quality hypotheses (f). The edges in the last column (f) are colored by their confidence value. Note that even though, for the sake of space, we show pairs of images, the datasets contain multiple sets of images.
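The confidence maps and top-2 hypotheses above are derived from posterior samples. As a toy illustration of that idea only (the random row-swap noise model and all names below are hypothetical, not the paper's Langevin Monte Carlo pipeline), the following sketch turns a stack of sampled permutation matrices into per-assignment confidences and top-2 candidate lists:

```python
import numpy as np

# Toy stand-in for posterior samples of a single 4x4 assignment; in the
# paper these come from Birkhoff-Riemannian Langevin Monte Carlo. The
# noise model below (random row swaps) is purely hypothetical.
rng = np.random.default_rng(0)
true_perm = np.eye(4)[[1, 0, 3, 2]]
samples = []
for _ in range(200):
    P = true_perm.copy()
    if rng.random() < 0.3:                     # corrupt ~30% of samples
        i, j = rng.choice(4, size=2, replace=False)
        P[[i, j]] = P[[j, i]]                  # swap two assignments
    samples.append(P)

confidence = np.mean(samples, axis=0)          # per-entry marginal confidence
top1 = confidence.argmax(axis=1)               # single best match per keypoint
top2 = np.argsort(-confidence, axis=1)[:, :2]  # retain two hypotheses each
```

Because each sample is a permutation matrix, every row of the confidence map sums to one, and the second-ranked column per row supplies the extra Ours-2 candidate.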

The complexity is (1) linearly dependent upon the number of edges, which in the worst case relates quadratically to the number of images, |E| = N(N − 1), and (2) cubically dependent on n. This is due to the fact that the projection onto the tangent space solves a system of 2n × 2n equations.
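The Sinkhorn early termination mentioned above can be sketched in a few lines; a minimal NumPy illustration of Sinkhorn-Knopp [79] (not the paper's Ceres-based implementation) showing why K_S stays small when the input is already close to the Birkhoff manifold:

```python
import numpy as np

def sinkhorn(A, max_iter=500, tol=1e-9):
    """Sinkhorn-Knopp [79]: alternately normalize rows and columns of a
    positive matrix until it is (numerically) doubly stochastic.
    Early-terminates as soon as the row sums are within tol of 1,
    which is what keeps the iteration count K_S small in practice."""
    X = A.astype(float).copy()
    for it in range(max_iter):
        X /= X.sum(axis=1, keepdims=True)     # row normalization
        X /= X.sum(axis=0, keepdims=True)     # column normalization
        if np.abs(X.sum(axis=1) - 1.0).max() < tol:
            return X, it + 1                  # early termination
    return X, max_iter

rng = np.random.default_rng(0)
# A matrix already near the Birkhoff polytope terminates almost instantly:
X_near, k_near = sinkhorn(np.eye(4) * 0.9 + 0.025)
# A generic positive matrix needs more, but still few, iterations:
X_rand, k_rand = sinkhorn(rng.random((4, 4)) + 0.1)
```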

7. Conclusion

Figure 5. Running times of different methods (Spectral, MatchEIG, MatchALS, Wang et al., Ours) with increasing problem size: N ∈ [10, 100] and n ∈ [16, 100].

In this work we have proposed two new frameworks for relaxed permutation synchronization on the manifold of doubly stochastic matrices. Our novel model and formulation paved the way to using sophisticated optimizers such as Riemannian limited-memory BFGS. We further integrated a manifold-MCMC scheme enabling posterior sampling and thereby confidence estimation. We have shown that our confidence maps are informative about cycle inconsistencies and can lead to new solution hypotheses. We used these hypotheses in a top-K evaluation and illustrated its benefits.

In the future, we plan to (i) address partial permutations, the inner region of the Birkhoff Polytope, (ii) investigate more sophisticated MCMC schemes such as [32, 39, 74, 54, 76] and (iii) seek better use cases for our confidence estimates such as outlier removal.

Acknowledgements. Supported by the grant ANR-16-CE23-0014 (FBIMATRIX). Authors thank Haowen Deng for the initial 3D correspondences and Benjamin Busam and Jesus Briales for fruitful discussions.

References

[1] P.-A. Absil, Robert Mahony, and Rodolphe Sepulchre. Optimization algorithms on matrix manifolds. Princeton University Press, 2009.
[2] Sameer Agarwal, Keir Mierle, and others. Ceres solver. http://ceres-solver.org.
[3] Larry Armijo. Minimization of functions having Lipschitz continuous first partial derivatives. Pacific Journal of Mathematics, 16(1):1–3, 1966.
[4] Federica Arrigoni, Eleonora Maset, and Andrea Fusiello. Synchronization in the symmetric inverse semigroup. In International Conference on Image Analysis and Processing, pages 70–81. Springer, 2017.
[5] Federica Arrigoni, Beatrice Rossi, and Andrea Fusiello. Spectral synchronization of multiple views in SE(3). SIAM Journal on Imaging Sciences, 9(4):1963–1990, 2016.
[6] Florian Bernard, Johan Thunberg, Peter Gemmar, Frank Hertel, Andreas Husch, and Jorge Goncalves. A solution for multi-alignment by transformation synchronisation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2161–2169, 2015.
[7] Florian Bernard, Johan Thunberg, Jorge Goncalves, and Christian Theobalt. Synchronisation of partial multi-matchings via non-negative factorisations. Pattern Recognition, 92:146–155, 2019.
[8] Tolga Birdal, Emrah Bala, Tolga Eren, and Slobodan Ilic. Online inspection of 3D parts via a locally overlapping camera network. In 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1–10. IEEE, 2016.
[9] Tolga Birdal, Umut Şimşekli, M. Onur Eken, and Slobodan Ilic. Bayesian pose graph optimization via Bingham distributions and tempered geodesic MCMC. In Advances in Neural Information Processing Systems (NeurIPS), 2018.
[10] Tolga Birdal and Slobodan Ilic. CAD priors for accurate and flexible instance reconstruction. In Proceedings of the IEEE International Conference on Computer Vision, 2017.
[11] Garrett Birkhoff. Tres observaciones sobre el algebra lineal. Univ. Nac. Tucumán Rev. Ser. A, 5:147–151, 1946.
[12] N. Boumal, B. Mishra, P.-A. Absil, and R. Sepulchre. Manopt, a Matlab toolbox for optimization on manifolds. Journal of Machine Learning Research, 15, 2014.
[13] Alexander M. Bronstein, Michael M. Bronstein, and Ron Kimmel. Numerical geometry of non-rigid shapes. Springer Science & Business Media, 2008.
[14] Benjamin Busam, Marco Esposito, Simon Che'Rose, Nassir Navab, and Benjamin Frisch. A stereo vision approach for cooperative robotic movement therapy. In IEEE International Conference on Computer Vision Workshop (ICCVW), December 2015.
[15] Simon Byrne and Mark Girolami. Geodesic Monte Carlo on embedded manifolds. Scandinavian Journal of Statistics, 40(4):825–845, 2013.
[16] Tibério S. Caetano, Julian J. McAuley, Li Cheng, Quoc V. Le, and Alex J. Smola. Learning graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(6):1048–1058, 2009.
[17] Kunal N. Chaudhury, Yuehaw Khoo, and Amit Singer. Global registration of multiple point clouds using semidefinite programming. SIAM Journal on Optimization, 25(1):468–501, 2015.
[18] Changyou Chen, Nan Ding, and Lawrence Carin. On the convergence of stochastic gradient MCMC algorithms with high-order integrators. In Advances in Neural Information Processing Systems, pages 2278–2286, 2015.
[19] Yuxin Chen, Leonidas Guibas, and Qixing Huang. Near-optimal joint object matching via convex relaxation. In Proceedings of the 31st International Conference on Machine Learning, ICML'14, pages II-100–II-108. JMLR.org, 2014.
[20] Minsu Cho, Karteek Alahari, and Jean Ponce. Learning graphs to match. In Proceedings of the IEEE International Conference on Computer Vision, pages 25–32, 2013.
[21] Stéphan Clémençon and Jérémie Jakubowicz. Kantorovich distances between rankings with applications to rank aggregation. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2010.
[22] Luca Cosmo, Andrea Albarelli, Filippo Bergamasco, Andrea Torsello, Emanuele Rodolà, and Daniel Cremers. A game-theoretical approach for joint matching of multiple feature throughout unordered images. In Pattern Recognition (ICPR), 2016 23rd International Conference on, pages 3715–3720. IEEE, 2016.
[23] Luca Cosmo, Emanuele Rodolà, Andrea Albarelli, Facundo Mémoli, and Daniel Cremers. Consistent partial matching of shape collections via sparse modeling. In Computer Graphics Forum, volume 36, pages 209–221. Wiley Online Library, 2017.
[24] Angela Dai, Matthias Nießner, Michael Zollhöfer, Shahram Izadi, and Christian Theobalt. BundleFusion: Real-time globally consistent 3D reconstruction using on-the-fly surface reintegration. ACM Transactions on Graphics (TOG), 36(4):76a, 2017.
[25] Joseph DeGol, Timothy Bretl, and Derek Hoiem. Improved structure from motion using fiducial marker matching. In Proceedings of the European Conference on Computer Vision (ECCV), pages 273–288, 2018.
[26] Haowen Deng, Tolga Birdal, and Slobodan Ilic. PPF-FoldNet: Unsupervised learning of rotation invariant 3D local descriptors. In Proceedings of the European Conference on Computer Vision (ECCV), pages 602–618, 2018.
[27] Haowen Deng, Tolga Birdal, and Slobodan Ilic. PPFNet: Global context aware local features for robust 3D point matching. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[28] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009.
[29] Persi Diaconis, Susan Holmes, Mehrdad Shahshahani, et al. Sampling from a manifold. In Advances in Modern Statistical Theory and Applications: A Festschrift in honor of Morris L. Eaton. Institute of Mathematical Statistics, 2013.
[30] A. Douik and B. Hassibi. Manifold optimization over the set of doubly stochastic matrices: A second-order geometry. arXiv e-prints, Feb. 2018.
[31] Fanny Dufossé, Kamer Kaya, Ioannis Panagiotas, and Bora Uçar. Further notes on Birkhoff–von Neumann decomposition of doubly stochastic matrices. PhD thesis, Inria Research Centre Grenoble-Rhône-Alpes, 2017.
[32] Alain Durmus, Umut Simsekli, Eric Moulines, Roland Badeau, and Gaël Richard. Stochastic gradient Richardson–Romberg Markov chain Monte Carlo. In Advances in Neural Information Processing Systems, pages 2047–2055, 2016.
[33] Rizal Fathony, Sima Behpour, Xinhua Zhang, and Brian Ziebart. Efficient and consistent adversarial bipartite matching. In International Conference on Machine Learning, pages 1456–1465, 2018.
[34] Roger Fletcher. Practical methods of optimization. John Wiley & Sons, 2013.
[35] P. Gargallo. Using OpenSfM. Online. Accessed May 2018.
[36] Mark Girolami and Ben Calderhead. Riemann manifold Langevin and Hamiltonian Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(2):123–214, 2011.
[37] Michel X. Goemans. Smallest compact formulation for the permutahedron. Mathematical Programming, 2015.
[38] Venu Madhav Govindu. Lie-algebraic averaging for globally consistent motion estimation. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2004, volume 1, pages I–I. IEEE, 2004.
[39] Andrew Holbrook. Note on the geodesic Monte Carlo. arXiv preprint arXiv:1805.05289, 2018.
[40] Seyedehsomayeh Hosseini, Wen Huang, and Rohollah Yousefpour. Line search algorithms for locally Lipschitz functions on Riemannian manifolds. SIAM Journal on Optimization, 28(1):596–619, 2018.
[41] Nan Hu, Qixing Huang, Boris Thibert, UG Alpes, and Leonidas Guibas. Distributable consistent multi-object matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[42] Qi-Xing Huang and Leonidas Guibas. Consistent shape maps via semidefinite programming. In Proceedings of the Eleventh Eurographics/ACM SIGGRAPH Symposium on Geometry Processing. Eurographics Association, 2013.
[43] Wen Huang, P.-A. Absil, K. A. Gallivan, and Paul Hand. ROPTLIB: an object-oriented C++ library for optimization on Riemannian manifolds. Technical Report FSU16-14.v2, Florida State University, 2016.
[44] Wen Huang, Kyle A. Gallivan, and P.-A. Absil. A Broyden class of quasi-Newton methods for Riemannian optimization. SIAM Journal on Optimization, 25(3):1660–1685, 2015.
[45] Daniel F. Huber and Martial Hebert. Fully automatic registration of multiple 3D data sets. Image and Vision Computing, 21(7):637–650, 2003.
[46] Glenn Hurlbert. A short proof of the Birkhoff–von Neumann theorem. Preprint (unpublished), 2008.
[47] Bruno Iannazzo and Margherita Porcelli. The Riemannian Barzilai–Borwein method with nonmonotone line search and the matrix geometric mean computation. IMA Journal of Numerical Analysis, 38(1):495–517, 2017.
[48] Ross Kindermann. Markov random fields and their applications. American Mathematical Society, 1980.
[49] Anna Korba, Alexandre Garcia, and Florence d'Alché-Buc. A structured prediction approach for label ranking. In Advances in Neural Information Processing Systems, pages 8994–9004, 2018.
[50] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
[51] Hongdong Li and Richard Hartley. The 3D-3D registration problem revisited. In Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on, pages 1–8. IEEE, 2007.
[52] Xinchao Li, Martha Larson, and Alan Hanjalic. Pairwise geometric matching for large-scale object retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5153–5161, 2015.
[53] Scott Linderman, Gonzalo Mena, Hal Cooper, Liam Paninski, and John Cunningham. Reparameterizing the Birkhoff polytope for variational permutation inference. In International Conference on Artificial Intelligence and Statistics, pages 1618–1627, 2018.
[54] Chang Liu, Jun Zhu, and Yang Song. Stochastic gradient geodesic MCMC methods. In Advances in Neural Information Processing Systems, pages 3009–3017, 2016.
[55] David G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004.
[56] Vince Lyzinski, Donniell E. Fishkind, Marcelo Fiori, Joshua T. Vogelstein, Carey E. Priebe, and Guillermo Sapiro. Graph matching: Relax at your own risk. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(1):60–73, 2016.
[57] Yi-An Ma, Tianqi Chen, and Emily Fox. A complete recipe for stochastic gradient MCMC. In Advances in Neural Information Processing Systems, pages 2917–2925, 2015.
[58] João Maciel and João Paulo Costeira. A global solution to sparse correspondence problems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(2), 2003.
[59] E. Maset, F. Arrigoni, and A. Fusiello. Practical and efficient multi-view matching. In 2017 IEEE International Conference on Computer Vision (ICCV), Oct 2017.
[60] James Munkres. Algorithms for the assignment and transportation problems. Journal of the Society for Industrial and Applied Mathematics, 5(1):32–38, 1957.
[61] Raúl Mur-Artal and Juan D. Tardós. ORB-SLAM2: an open-source SLAM system for monocular, stereo and RGB-D cameras. IEEE Transactions on Robotics, 33(5):1255–1262, 2017.
[62] Andy Nguyen, Mirela Ben-Chen, Katarzyna Welnicka, Yinyu Ye, and Leonidas Guibas. An optimization approach to improving collections of shape maps. In Computer Graphics Forum, volume 30, pages 1481–1491. Wiley Online Library, 2011.
[63] Deepti Pachauri, Risi Kondor, Gautam Sargur, and Vikas Singh. Permutation diffusion maps (PDM) with application to the image association problem in computer vision. In Advances in Neural Information Processing Systems, pages 541–549, 2014.
[64] Deepti Pachauri, Risi Kondor, and Vikas Singh. Solving the multi-way matching problem by permutation synchronization. In Advances in Neural Information Processing Systems, pages 1860–1868, 2013.
[65] Han-Mu Park and Kuk-Jin Yoon. Consistent multiple graph matching with multi-layer random walks synchronization. Pattern Recognition Letters, 2018.
[66] Sam Patterson and Yee Whye Teh. Stochastic gradient Riemannian Langevin dynamics on the probability simplex. In Advances in Neural Information Processing Systems, pages 3102–3110, 2013.
[67] Sergey Plis, Stephen McCracken, Terran Lane, and Vince Calhoun. Directional statistics on permutations. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pages 600–608, 2011.
[68] Chunhong Qi, Kyle A. Gallivan, and P.-A. Absil. Riemannian BFGS algorithm with applications. In Recent Advances in Optimization and its Applications in Engineering, pages 183–192. Springer, 2010.
[69] Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 652–660, 2017.
[70] Wolfgang Ring and Benedikt Wirth. Optimization methods on Riemannian manifolds and their application to shape space. SIAM Journal on Optimization, 22(2):596–627, 2012.
[71] Torsten Sattler, Tobias Weyand, Bastian Leibe, and Leif Kobbelt. Image retrieval for image-based localization revisited, 2012.
[72] Michele Schiavinato and Andrea Torsello. Synchronization over the Birkhoff polytope for multi-graph matching. In International Workshop on Graph-Based Representations in Pattern Recognition, pages 266–275. Springer, 2017.
[73] Johannes L. Schonberger and Jan-Michael Frahm. Structure-from-motion revisited. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4104–4113, 2016.
[74] Umut Şimşekli. Fractional Langevin Monte Carlo: Exploring Lévy driven stochastic differential equations for Markov chain Monte Carlo. In International Conference on Machine Learning, pages 3200–3209. JMLR.org, 2017.
[75] Umut Simsekli, Roland Badeau, Taylan Cemgil, and Gaël Richard. Stochastic quasi-Newton Langevin Monte Carlo. In International Conference on Machine Learning (ICML), 2016.
[76] Umut Şimşekli, Çağatay Yıldız, Thanh Huy Nguyen, Gaël Richard, and A. Taylan Cemgil. Asynchronous stochastic quasi-Newton MCMC for non-convex optimization. In International Conference on Machine Learning, 2018.
[77] Amit Singer. Angular synchronization by eigenvectors and semidefinite programming. Applied and Computational Harmonic Analysis, 30(1):20, 2011.
[78] Amit Singer and Yoel Shkolnisky. Three-dimensional structure determination from common lines in cryo-EM by eigenvectors and semidefinite programming. SIAM Journal on Imaging Sciences, 4(2):543–572, 2011.
[79] Richard Sinkhorn and Paul Knopp. Concerning nonnegative matrices and doubly stochastic matrices. Pacific Journal of Mathematics, 21(2):343–348, 1967.
[80] Steven T. Smith. Optimization techniques on Riemannian manifolds. Fields Institute Communications, 3(3):113–135, 1994.
[81] Yifan Sun, Zhenxiao Liang, Xiangru Huang, and Qixing Huang. Joint map and symmetry synchronization. In Proceedings of the European Conference on Computer Vision (ECCV), pages 251–264, 2018.
[82] Da Tang and Tony Jebara. Initialization and coordinate optimization for multi-way matching. In Artificial Intelligence and Statistics, pages 1385–1393, 2017.
[83] Johan Thunberg, Florian Bernard, and Jorge Goncalves. Distributed methods for synchronization of orthogonal matrices over graphs. Automatica, 80:243–252, 2017.
[84] Bill Triggs, Philip F. McLauchlan, Richard I. Hartley, and Andrew W. Fitzgibbon. Bundle adjustment: a modern synthesis. In International Workshop on Vision Algorithms, pages 298–372. Springer, 1999.
[85] Nilesh Tripuraneni, Nicolas Flammarion, Francis Bach, and Michael I. Jordan. Averaging stochastic gradient descent on Riemannian manifolds. In Conference on Learning Theory (COLT), 2018.
[86] Roberto Tron and Rene Vidal. Distributed 3-D localization of camera sensor networks from 2-D image measurements. IEEE Transactions on Automatic Control, 59(12):3325–3340, 2014.
[87] Lanhui Wang and Amit Singer. Exact and stable recovery of rotations for robust synchronization. Information and Inference: A Journal of the IMA, 2(2):145–193, 2013.
[88] Qianqian Wang, Xiaowei Zhou, and Kostas Daniilidis. Multi-image semantic matching by mining consistent features. In CVPR, 2018.
[89] Philip Wolfe. Convergence conditions for ascent methods. SIAM Review, 11(2):226–235, 1969.
[90] Philip Wolfe. Convergence conditions for ascent methods. II: Some corrections. SIAM Review, 13(2):185–188, 1971.
[91] Tatiana Xifara, Chris Sherlock, Samuel Livingstone, Simon Byrne, and Mark Girolami. Langevin diffusions and the Metropolis-adjusted Langevin algorithm. Statistics & Probability Letters, 91:14–19, 2014.
[92] Junchi Yan, Minsu Cho, Hongyuan Zha, Xiaokang Yang, and Stephen M. Chu. Multi-graph matching via affinity optimization with graduated consistency regularization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(6):1228–1242, 2016.
[93] Junchi Yan, Xu-Cheng Yin, Weiyao Lin, Cheng Deng, Hongyuan Zha, and Xiaokang Yang. A short survey of recent advances in graph matching. In Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, pages 167–174. ACM, 2016.
[94] Jin-Gang Yu, Gui-Song Xia, Ashok Samal, and Jinwen Tian. Globally consistent correspondence of multiple feature sets using proximal Gauss–Seidel relaxation. Pattern Recognition, 51:255–267, 2016.
[95] Xinru Yuan, Wen Huang, P.-A. Absil, and Kyle A. Gallivan. A Riemannian limited-memory BFGS algorithm for computing the matrix geometric mean. Procedia Computer Science, 80:2147–2157, 2016.
[96] Andrei Zanfir and Cristian Sminchisescu. Deep learning of graph matching. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[97] Mikhail Zaslavskiy, Francis Bach, and Jean-Philippe Vert. A path following algorithm for the graph matching problem. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(12):2227–2242, 2009.
[98] Xiaowei Zhou, Menglong Zhu, and Kostas Daniilidis. Multi-image matching via fast alternating minimization. In Proceedings of the IEEE International Conference on Computer Vision, pages 4032–4040, 2015.

Probabilistic Permutation Synchronization using the Riemannian Structure of the Birkhoff Polytope - Supplementary Material

This part supplements our main paper by providing further algorithmic details on LR-BFGS, proofs of the propositions presented in the paper, derivations of the retraction-based Euler integrator and additional experiments.

A. Riemannian Limited Memory BFGS

We now explain the LR-BFGS optimizer used in our work. To begin with, we recall the foundational BFGS [34], which is a quasi-Newton method operating in the Euclidean space. We then review the simple Riemannian descent algorithms that employ line search. Finally, we explain the R-BFGS used to solve our synchronization problem. R-BFGS can be modified slightly to arrive at the desired limited-memory Riemannian BFGS solver.

A.1. Euclidean BFGS

The idea is to approximate the true Hessian H with B, using updates specified by the gradient evaluations³. We will then transition from this Euclidean-space line search method to Riemannian-space optimizers. For clarity, in Alg. 1 we summarize the Euclidean BFGS algorithm, which computes B by using the most recent values of the past iterates. Note that many strategies exist to initialize B0, while a common choice is the scaled identity matrix B0 = γI. Eq. 19 corresponds to the particular BFGS update rule. In the limited-memory successor of BFGS, L-BFGS, the Hessian matrix H is instead approximated up to a pre-specified rank in order to achieve linear time- and space-complexity in the dimension of the problem.

Algorithm 1: Euclidean BFGS

1 input: A real-valued, differentiable potential energy U, initial iterate x0 and initial Hessian approximation B0.
2 k ← 0
3 while xk does not sufficiently minimize U do
4   Compute the direction ηk by solving Bkηk = −∇U(xk) for ηk.
5   Define the new iterate xk+1 ← xk + ηk.
6   Set sk ← xk+1 − xk and yk ← ∇U(xk+1) − ∇U(xk).
7   Compute the new Hessian approximation:

\[ B_{k+1} = B_k + \frac{y_k y_k^\top}{y_k^\top s_k} - \frac{B_k s_k s_k^\top B_k}{s_k^\top B_k s_k}. \quad (19) \]
    k ← k + 1.
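Eq. 19 can be exercised directly; a minimal NumPy sketch (illustrative only, not part of the paper's solver) that verifies the defining secant property of the update on a quadratic energy, where the gradient difference along a step s is exactly y = As:

```python
import numpy as np

def bfgs_update(B, s, y):
    """The BFGS Hessian-approximation update of Eq. 19."""
    Bs = B @ s
    return B + np.outer(y, y) / (y @ s) - np.outer(Bs, Bs) / (s @ Bs)

# On a quadratic U(x) = 0.5 x^T A x - b^T x the gradient difference is
# y = A s, so a single update already reproduces the curvature along s.
A = np.array([[3.0, 0.5], [0.5, 2.0]])
s = np.array([1.0, -1.0])
y = A @ s                                  # = grad U(x + s) - grad U(x)
B1 = bfgs_update(np.eye(2), s, y)          # B0 = I (scaled identity, gamma = 1)
# The update enforces the secant equation B1 s = y and keeps B1 symmetric
# positive definite whenever the curvature condition y^T s > 0 holds.
```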

A.2. Riemannian Descent and Line Search The standard descent minimizers can be extended to operate on Riemannian manifolds M using the geometry of the parameters. A typical Riemannian update can be characterized as:

xk+1 = Rxk (τkηk) (20)

where R is the retraction operator, i.e. the smooth map from the tangent bundle TM to M, and Rxk is the restriction of R to Txk M, the tangent space of the current iterate xk: Rxk : Txk M → M. The descent direction is defined to be on the tangent space, ηk ∈ Txk M. When manifolds are rather simple shapes, the step size τk can be a fixed value. However, for most matrix manifolds, some form of line search is preferred to compute τk. The retraction operator is used to take steps on the manifold and is usually derived analytically. When this analytic map is length-preserving, it is called the true exponential map. However, such exactness is not a requirement for the optimizer and, as it happens in the case of doubly stochastic matrices,

Rxk only needs to be an approximate retraction, e.g. first or second order. In fact, for a map to be a valid retraction, it is sufficient to satisfy the following conditions:

1. R is continuously differentiable.
2. Rx(0x) = x, where 0x denotes the zero element of TxM. This is called the centering property.
3. The curve γηx(τ) = Rx(τηx) satisfies
   \[ \left.\frac{\mathrm{d}\gamma_{\eta_x}(\tau)}{\mathrm{d}\tau}\right|_{\tau=0} = \eta_x, \quad \forall\, \eta_x \in T_xM. \quad (21) \]
   This is called the local rigidity property.

Figure 6. Visualization of the entities used on a sample toroidal manifold.

³Note that certain implementations can instead opt to approximate the inverse Hessian for computational reasons.
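Both defining properties can be checked numerically for a concrete retraction. The sketch below uses the projective retraction on the unit sphere, chosen purely for illustration (it is an assumption here; the paper's manifold is the Birkhoff polytope), and verifies centering exactly and local rigidity by a finite difference:

```python
import numpy as np

def retract_sphere(x, eta):
    """Projective retraction on the unit sphere: R_x(eta) = (x+eta)/||x+eta||."""
    v = x + eta
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
x = rng.normal(size=5); x /= np.linalg.norm(x)
eta = rng.normal(size=5); eta -= (eta @ x) * x    # project eta into T_x S^{n-1}

# Centering (condition 2): R_x(0_x) = x.
centering_err = np.linalg.norm(retract_sphere(x, np.zeros(5)) - x)

# Local rigidity (condition 3): d/dtau R_x(tau * eta)|_{tau=0} = eta,
# checked with a central finite difference.
t = 1e-5
deriv = (retract_sphere(x, t * eta) - retract_sphere(x, -t * eta)) / (2 * t)
rigidity_err = np.linalg.norm(deriv - eta)
```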

Considering all these, we provide, in Alg. 2, a general Riemannian descent algorithm, which can be customized by the choice of the direction, retraction and the means to compute the step size. Such a concept of minimizing by walking on the manifold is visualized in Fig. 6.

Algorithm 2: General Riemannian Line Search Minimizer

1 input: A Riemannian manifold M, a retraction operator R and initial iterate xk ∈ M where k = 0. 2 while xk does not sufficiently minimize f do

3 Pick a gradient related descent direction ηk ∈ Txk M.

4 Choose a retraction Rxk : Txk M → M. 5 Choose a step length τk ∈ R.

6 Set xk+1 ← Rxk (τkηk). 7 k ← k + 1.

It is possible to develop new minimizers by making particular choices for the Riemannian operators in Alg. 2. We now review the Armijo variant of Riemannian gradient descent [40], which is a common and probably the simplest choice for an accelerated optimizer on the manifold. Many other line-search conditions, such as Barzilai-Borwein [47] or strong Wolfe [70], can be used as well. The pseudocode for this backtracking version is given in Alg. 3. Note that the Riemannian gradient grad f(x) is simply obtained by projecting the Euclidean gradient ∇f(x) onto the manifold M, and the next iterate is obtained through a line search so as to satisfy the Armijo condition [3], tweaked to use the retraction R for taking steps. For some predefined Armijo step size, this algorithm is guaranteed to converge for all retractions [1].

Algorithm 3: General Riemannian Steepest Descent with Armijo Line Search
1 input: A Riemannian manifold M, a retraction operator R, the projection operator onto the tangent space Πxk : R^n → Txk M, a real-valued, differentiable potential energy f, initial iterate x0 ∈ M and the Armijo line-search scalars including c.
2 while xk does not sufficiently minimize f do
    // Euclidean gradient to Riemannian direction
3   ηk ← −grad f(xk) ≜ Πxk(−∇f(xk)).
4   Select xk+1 such that:
    \[ f(x_k) - f(x_{k+1}) \geq c \left( f(x_k) - f(R_{x_k}(\tau_k \eta_k)) \right), \quad (22) \]

5 where τk is the Armijo step size. 6 k ← k + 1.
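Alg. 3 can be instantiated in a few lines. The sketch below is illustrative only: the sphere and the Rayleigh-quotient cost are assumptions chosen so the result is easy to verify (the minimizer is the eigenvector of the smallest eigenvalue), not the paper's Birkhoff-polytope setting:

```python
import numpy as np

# Riemannian steepest descent with Armijo backtracking (Alg. 3),
# minimizing f(x) = x^T A x on the unit sphere.
A = np.diag([4.0, 2.0, 1.0])
f = lambda x: x @ A @ x
egrad = lambda x: 2.0 * A @ x                 # Euclidean gradient

def proj(x, v):
    """Pi_x: project v onto the tangent space T_x S^{n-1}."""
    return v - (v @ x) * x

def retract(x, eta):
    """First-order retraction: step, then renormalize onto the sphere."""
    v = x + eta
    return v / np.linalg.norm(v)

x = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
c, shrink = 1e-4, 0.5                         # Armijo constants
for _ in range(500):
    eta = -proj(x, egrad(x))                  # Riemannian descent direction
    if eta @ eta < 1e-20:                     # converged: gradient vanished
        break
    tau = 1.0
    # Backtrack until the Armijo sufficient-decrease condition holds.
    while f(retract(x, tau * eta)) > f(x) - c * tau * (eta @ eta):
        tau *= shrink
    x = retract(x, tau * eta)
# x approaches +/- e_3, the eigenvector of the smallest eigenvalue of A.
```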

A.3. LR-BFGS for Minimization on DPn

Based upon the ideas developed up to this point, we present the Riemannian variant of L-BFGS. The algorithm is mainly based on Huang et al. and we refer the reader to their seminal work for more details [44]. It is also worth noting [68]. Similar to the Euclidean case, we begin by summarizing a BFGS algorithm, with the difference that it is suited to solving the synchronization task. This time, B will approximate the action of the Hessian on the tangent space Txk M. Generalizing any (quasi-)Newton method to the Riemannian setting thus requires computing the Riemannian Hessian operator, or its approximation. This necessitates taking some sort of a directional derivative of a vector field. As the vector fields belonging to different tangent spaces cannot be compared immediately, one needs the notion of a connection Γ, generalizing the directional derivative of a vector field. This connection is closely related to the concept of vector transport T : TM ⊗ TM → TM, which allows moving a tangent vector from one tangent space to another: Tη(ζ) maps (η, ζ) ∈ TxM × TxM to a vector in TRx(η)M. For the first-order treatment of the Birkhoff Polytope, this vector transport takes a simple form:

\[ T_\eta(\zeta) \triangleq \Pi_{R_x(\eta)}(\zeta) \in T_{R_x(\eta)}\mathcal{DP}_n, \quad (23) \]

where ΠX(·) is the projection onto the tangent space of X as defined in the main paper. We give the pseudocode of the R-BFGS algorithm in Alg. 4 below. To ensure Riemannian L-BFGS always produces a descent direction, it is necessary to adapt a line-search algorithm which satisfies the strong Wolfe (SW) conditions [89, 90]. Roptlib [43] does implement SW, while ManOpt [12] uses the simpler Armijo conditions, even for LR-BFGS. Another simple possibility is to take the steps on the manifold using the retraction, while performing the line search in the Euclidean space. This results in a projected, approximate line search, but can terminate quite quickly, making it possible to exploit existing Euclidean-space SW L-BFGS solvers such as Ceres [2]. Without delving into rigorous proofs, we provide one such implementation at github.com/tolgabirdal/MatrixManifoldsInCeres where several matrix manifolds are considered. We leave the analysis of such approximations for a future study and note that the results in the paper are generated by the limited-memory form of the R-BFGS procedure summarized in Alg. 4. While many modifications do exist, LR-BFGS, in essence, is an approximation to R-BFGS, where the O(MN²) storage of the full Hessian is avoided by unrolling the R-BFGS descent computation and using only the last m input and gradient differences, where m ≪ MN². This allows us to handle large data regimes.
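The tangent-space projection underlying the transport of Eq. 23 can be illustrated concretely. As a simplification (labeled as such: the paper's Π_X uses the Fisher information metric and solves a 2n × 2n linear system), the sketch below implements the Euclidean-metric projection onto the set of matrices with zero row and column sums, obtained by double centering:

```python
import numpy as np

def proj_tangent(Z):
    """Orthogonal (Euclidean-metric) projection onto
    {Z : Z 1 = 0, Z^T 1 = 0}, the tangent space of the doubly stochastic
    matrices viewed as an affine set (double centering).
    NOTE: this is a simplification for illustration; the paper's
    projection Pi_X uses the Fisher information metric and solves a
    2n x 2n linear system instead."""
    r = Z.mean(axis=1, keepdims=True)   # row means
    c = Z.mean(axis=0, keepdims=True)   # column means
    return Z - r - c + Z.mean()

rng = np.random.default_rng(1)
zeta = rng.normal(size=(4, 4))
T = proj_tangent(zeta)
# Under this simplified metric the projection is the same at every point,
# so the vector transport T_eta(zeta) = Pi_{R_x(eta)}(zeta) of Eq. 23
# reduces to a single application of proj_tangent.
```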

Algorithm 4: Riemannian BFGS for Synchronization on DPn
1 input: Birkhoff Polytope DPn with Riemannian (Fisher information) metric g, first-order retraction R, the vector transport T, a real-valued, differentiable potential energy function of synchronization U, initial iterate X0 and initial Hessian approximation B0.
2 while Xk does not sufficiently minimize U do
3   Compute the Euclidean gradients using:

\[ \nabla_{X_i} U(\mathbf{X}) = -2 \sum_{(i,j)\in E} (\mathbf{P}_{ij} - \mathbf{X}_i \mathbf{X}_j^\top)\, \mathbf{X}_j, \qquad \nabla_{X_j} U(\mathbf{X}) = -2 \sum_{(i,j)\in E} (\mathbf{P}_{ij} - \mathbf{X}_i \mathbf{X}_j^\top)^\top \mathbf{X}_i \quad (24) \]

4   Compute the Riemannian gradient grad U(Xk) ≜ ΠXk(∇U(Xk)).

5   Compute the direction ηk ∈ TXk DPn by solving Bkηk = −grad U(Xk).
6   Given ηk, execute a line search satisfying the strong Wolfe conditions [89, 90, 44]. Set step size τk.

7   Set Xk+1 ← RXk(τkηk).
8   Use the vector transport to define:

Sk = Tτkηk (τkηk) , Yk = grad U(Xk+1) − Tτkηk (grad U(xk)). (25)

+ + 9 B˜ = T ◦ B ◦ T T Compute k τkηk k τkηk where is denotes (pseudo-)inverse of the transport.

10 Compute the linear operator Bk+1 : Txk+1 M → Txk+1 M: ˜ ˜ ˜ g(Yk, Z) g(Sk, BZ) Bk+1Z = BkZ + Yk − ∀Z ∈ Txk+1 DPn. (26) g(yk, Sk) g(Sk, B˜ Sk)

11 k ← k + 1.
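The Euclidean gradient of step 3 follows from U(X) = Σ_{(i,j)∈E} ||P_ij − X_i X_j^⊤||_F². A direct NumPy sketch of Eq. (24) (the container layout — a list of blocks X and a dict of relative permutations P — is our choice for illustration):

```python
import numpy as np

def euclidean_grad_U(X, P, edges):
    """Gradient of U(X) = sum_{(i,j) in E} ||P_ij - X_i X_j^T||_F^2
    with respect to each absolute assignment block X_i (Eq. 24)."""
    grads = [np.zeros_like(Xi) for Xi in X]
    for (i, j) in edges:
        R = P[(i, j)] - X[i] @ X[j].T    # residual on edge (i, j)
        grads[i] += -2.0 * R @ X[j]      # d/dX_i ||R||_F^2
        grads[j] += -2.0 * R.T @ X[i]    # d/dX_j ||R||_F^2
    return grads
```

A finite-difference check against U confirms the two matrix derivatives above.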

B. Proof of Proposition 1 and Further Analysis

Proof. Consider the ray cast outwards from the origin along the all-ones matrix: r = ℝ_{≥0} 11^⊤. r exits DP_n at (1/n)11^⊤ but exits the sphere of radius √n (on which the permutation matrices lie) at (1/√n)11^⊤. This creates a gap between the two manifolds whose ratio grows to ∞ as n grows. From the perspective of optimization, consider the functional λ(P) = Σ_{i,j} P_ij. λ(P) is minimized on DP_n at (1/n)11^⊤ and on the sphere at (1/√n)11^⊤. We now look at the restricted orthogonal matrices and seek to find the difference between optimizing a linear functional on them and on DP_n. Note that minimizing functionals is ultimately what we are interested in, as the problems we consider here are formulated as optimization. Let A be the affine linear space of matrices whose rows and columns sum to 1 (a space of dimension (n−1)×(n−1)), i.e. doubly stochastic but with no condition on the signs, or generalized doubly stochastic matrices. The Birkhoff Polytope DP_n is contained in A, whereas the orthogonal group O_n is not. So, there exists an affine functional λ that is minimized at a point on DP_n with λ = 0, but not on O_n. Let AO_n = A ∩ O_n, a further restricted manifold. This time, unlike the case of DP_n, AO_n does not coincide with the permutations P_n due to the negative elements; in fact P_n ⊂ AO_n.
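The two exit points above admit a quick numeric sanity check. The sketch below assumes, as in the proof, that the relevant sphere is the one of Frobenius radius √n containing the permutation matrices; the helper name is ours:

```python
import numpy as np

def exit_times(n):
    """For the ray t * (1 1^T): the largest t keeping t*1 1^T inside DP_n
    (row/column sums reach 1) versus the t reaching the sphere ||.||_F = sqrt(n)."""
    D = np.ones((n, n))
    t_birkhoff = 1.0 / n                        # row sums of t*D equal t*n = 1
    t_sphere = np.sqrt(n) / np.linalg.norm(D)   # ||t*D||_F = t*n = sqrt(n)
    return t_birkhoff, t_sphere

for n in (2, 5, 50):
    tb, ts = exit_times(n)
    assert np.isclose(ts / tb, np.sqrt(n))  # gap ratio grows like sqrt(n)
```

The ratio t_sphere / t_birkhoff = √n is unbounded in n, which is the gap the proof refers to.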

Proposition 3. The ratio between the time a ray leaves DP_n and the time the same ray leaves AO_n can be as large as n − 1.

Proof. Consider the line through (1/n)11^⊤ and I: l(x) = ((1+x)/n)11^⊤ − xI. Such a ray leaves DP_n at x = 1/(n−1), while for all x ∈ [−1, 1] it stays in the convex hull of AO_n. When x = 1, it is an orthogonal matrix, i.e. on O_n. Hence, the ray can stay on AO_n for n − 1 times as long as in DP_n. This quantity also grows to infinity as n → ∞. If the same analysis is done for the case of the sphere, the ratio is found to be (n − 1)^{3/2}, which grows to infinity more quickly and is a larger quantity.

C. Proof of Proposition 2

Proof. The conclusion of the proposition states that p(X) and p(P|X) have the following form:

p(X) = (1/C) exp(−β Σ_{(i,j)∈E} ||X_ij||_F²) Π_{(i,j)∈E} Z_ij  (27)

p(P|X) = Π_{(i,j)∈E} p(P_ij | X_i, X_j)  (28)

= Π_{(i,j)∈E} (1/Z_ij) exp(2β tr(P_ij^⊤ X_ij))  (29)

= [Π_{(i,j)∈E} (1/Z_ij)] exp(β Σ_{(i,j)∈E} 2 tr(P_ij^⊤ X_ij)),  (30)

where

Z_ij := Π_{b=1}^{B_ij} Z_X(β, θ_{ij,b}).  (31)

Our goal is to verify that the following holds with the above definitions:

p(P, X) = (1/Z) exp(−β Σ_{(i,j)∈E} ||P_ij − X_i X_j^⊤||_F²) = (1/Z) Π_{(i,j)∈E} ψ(P_ij, X_i, X_j),  (32)

where Z is a positive constant (cf. Section 4 in the main paper). We can verify this equality as follows:

p(P, X) = p(P|X) p(X)  (33)
= (1/C) exp(−β Σ_{(i,j)∈E} ||X_ij||_F²) exp(β Σ_{(i,j)∈E} 2 tr(P_ij^⊤ X_ij))  (34)
= (1/C) exp(−β Σ_{(i,j)∈E} (||X_ij||_F² − 2 tr(P_ij^⊤ X_ij)))  (35)
= (1/C) exp(−β Σ_{(i,j)∈E} (||X_ij||_F² − 2 tr(P_ij^⊤ X_ij) − n + n))  (36)
= (1/C) exp(βn|E|) exp(−β Σ_{(i,j)∈E} (||X_ij||_F² − 2 tr(P_ij^⊤ X_ij) + ||P_ij||_F²))  (37)
= (1/Z) exp(−β Σ_{(i,j)∈E} ||P_ij − X_i X_j^⊤||_F²),  (38)

where we used the fact that ||P_ij||_F² = n in Equation 37 since P_ij ∈ P_n, and Z = C exp(−βn|E|).

In the rest of the proof, we will characterize the normalizing constant Z_ij = ∫_{P_n} exp(2β tr(P_ij^⊤ X_ij)) dP_ij. Here, we denote the counting measure on P_n by dP_ij.
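The step from (36) to (38) rests only on the identity ||P − X_iX_j^⊤||_F² = ||X_ij||_F² − 2 tr(P^⊤X_ij) + ||P||_F² together with ||P||_F² = n for any permutation matrix P. A small numeric check (the crude Sinkhorn-style helper for generating a doubly stochastic matrix is ours, for illustration only):

```python
import numpy as np

def random_doubly_stochastic(n, iters=200, seed=0):
    # crude Sinkhorn normalization of a random positive matrix
    rng = np.random.default_rng(seed)
    A = rng.random((n, n)) + 0.1
    for _ in range(iters):
        A /= A.sum(axis=1, keepdims=True)  # normalize rows
        A /= A.sum(axis=0, keepdims=True)  # normalize columns
    return A

n = 5
Xij = random_doubly_stochastic(n)
P = np.eye(n)[np.random.default_rng(1).permutation(n)]  # random permutation matrix
assert np.isclose(np.linalg.norm(P) ** 2, n)            # ||P||_F^2 = n
lhs = np.linalg.norm(P - Xij) ** 2
rhs = np.linalg.norm(Xij) ** 2 - 2 * np.trace(P.T @ Xij) + n
assert np.isclose(lhs, rhs)
```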

We start by decomposing X_ij via the Birkhoff–von Neumann theorem:

X_ij = Σ_{b=1}^{B_ij} θ_{ij,b} M_{ij,b},   Σ_{b=1}^{B_ij} θ_{ij,b} = 1,   B_ij ∈ ℤ_+,   θ_{ij,b} ≥ 0,   M_{ij,b} ∈ P_n,   ∀b = 1, ..., B_ij.  (39)

Then, we have:

Z_ij = ∫_{P_n} exp(2β tr(P_ij^⊤ X_ij)) dP_ij  (40)
= ∫_{P_n} exp(2β tr(P_ij^⊤ Σ_{b=1}^{B_ij} θ_{ij,b} M_{ij,b})) dP_ij  (41)
= ∫_{P_n} exp(2β Σ_{b=1}^{B_ij} θ_{ij,b} tr(P_ij^⊤ M_{ij,b})) dP_ij  (42)
≥ exp(∫_{P_n} 2β Σ_{b=1}^{B_ij} θ_{ij,b} tr(P_ij^⊤ M_{ij,b}) dP_ij)  (43)
= exp(Σ_{b=1}^{B_ij} ∫_{P_n} 2β θ_{ij,b} tr(P_ij^⊤ M_{ij,b}) dP_ij)  (44)

where we used Jensen's inequality in (43). Here, we observe that ∫_{P_n} 2β θ_{ij,b} tr(P_ij^⊤ M_{ij,b}) dP_ij is similar to the normalization constant of a Mallows model [21], and it only depends on β and θ_{ij,b}. Hence, we conclude that

Z_ij ≥ Π_{b=1}^{B_ij} exp(∫_{P_n} 2β θ_{ij,b} tr(P_ij^⊤ M_{ij,b}) dP_ij)  (45)

=: Π_{b=1}^{B_ij} f(β, θ_{ij,b}),  (46)
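Treating dP_ij as the normalized counting measure over P_n (an assumption we make for this check), the Jensen step E[exp(h)] ≥ exp(E[h]) used in (43) can be verified numerically for small n by enumerating all n! permutations:

```python
import itertools
import numpy as np

n = 3
perms = [np.eye(n)[list(p)] for p in itertools.permutations(range(n))]
rng = np.random.default_rng(0)

# a convex combination of permutation matrices (Birkhoff-von Neumann form)
theta = rng.dirichlet(np.ones(len(perms)))
Xij = sum(t * M for t, M in zip(theta, perms))
beta = 0.5

h = [2 * beta * np.trace(P.T @ Xij) for P in perms]
# Jensen: E[exp(h)] >= exp(E[h]) under the normalized counting measure
Zij = np.mean(np.exp(h))
bound = np.exp(np.mean(h))
assert Zij >= bound - 1e-12
```

The bound depends on X_ij only through β and the weights θ_{ij,b}, since ∫ tr(P^⊤M) dP is the same for every permutation matrix M by symmetry.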

where f is an increasing function of β and θ_{ij,b} since tr(P_ij^⊤ M_{ij,b}) ≥ 0. This completes the proof.

D. Derivation of the Retraction Euler Integrator

We start by recalling the SDE:

dX̃_t = (−G^{-1} ∇_X̃ U_λ(X̃_t) + Γ_t) dt + √(2/β) G^{-1} dB_t.  (47)

By [91], we know that

Γ_t = (1/2) G^{-1} ∇_X̃ log |G|.  (48)

By using this identity in Equation 47, we obtain:

dX̃_t = −G^{-1} (∇_X̃ U_λ(X̃_t) + (1/2) ∇_X̃ log |G|) dt + √(2/β) G^{-1} dB_t.  (49)

By using a similar notation to [54], we rewrite the above equation as follows:

dX̃_t = −G^{-1} (∇_X̃ U_λ(X̃_t) + (1/2) ∇_X̃ log |G|) dt + G^{-1} M^⊤ N(0, (2/β) I),  (50)

where N denotes the Gaussian distribution and [M]_ij = ∂[X]_ij / ∂[X̃]_ij. We multiply each side of the above equation by M and use the property ∇_X̃ = M^⊤ ∇_X [54], which yields:

dX_t = −M G^{-1} M^⊤ ∇_X U(X_t) dt + M G^{-1} M^⊤ N(0, (2/β) I)  (51)
= M (M^⊤ M)^{-1} M^⊤ (−∇_X U(X_t) dt + N(0, (2/β) I)).  (52)

Here we used the area formula (Equation 11 in the main paper) and the fact that G = M^⊤M. The term M(M^⊤M)^{-1}M^⊤ turns out to be the projection operator onto the tangent space of X [15]. Therefore, the usual geodesic integrator would consist of the following steps at iteration k:

• Set a small step size h.
• Compute the term −h∇_X U(X_t) + √(2h/β) Z, with Z ∼ N(0, I).
• Obtain the 'direction' by projecting the result of the previous step onto the tangent space.
• Move the current iterate along the geodesic determined by the direction obtained in the previous step.

Unfortunately, the last step of this integrator cannot be computed in the case of DP_n. Therefore, we replace it with moving the current iterate by using the retraction operator, which yields the update equations given in the main paper.

E. Details on the Synthetic Dataset

We now give further details on the synthetic data used in the paper. We synthesize our data by displacing a 4 × 4 2D grid to simulate 5 different images and scrambling the order of correspondences. As shown in Fig. 7a, the ground truth corresponds to an identity ordering where the ith element of each grid is mapped to the ith node of the other. Note that the graph is fully connected, but we show only consecutive connections. To simulate noisy data, we introduce 16 random swaps into the ground truth, as shown in Fig. 7b. We also randomize the initialization in a similar way. On this data we first run a baseline method where, instead of restricting ourselves to the Birkhoff Polytope, we use the paths of the orthogonal group O(N). Our second baseline is the MatchEIG method of Maset et al. [59]. Note that, regardless of the initialization (Fig. 7e,f), our method is capable of arriving at a visually more satisfactory local minimum. This validates that for complex problems such as the one at hand, respecting the geometry of the constraints is crucial. Our algorithm is successful at that and hence is able to find high-quality solutions.
In the figure, the more parallel the lines are, the better the result, i.e. the closer it is to the ground truth.
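The corruption procedure described above (identity ground-truth orderings plus random swaps on the pairwise permutations) can be sketched as follows; the grid size, image count, and swap count follow the text, while the helper name and data layout are ours:

```python
import numpy as np

def make_synthetic(n_points=16, n_images=5, n_swaps=16, seed=0):
    """Ground truth: identity permutations between all image pairs.
    Noise: random transpositions applied to randomly chosen pairwise
    permutations, so each observation remains a valid permutation."""
    rng = np.random.default_rng(seed)
    # fully connected graph over the images
    edges = [(i, j) for i in range(n_images) for j in range(n_images) if i < j]
    P = {e: np.eye(n_points) for e in edges}  # identity = ground truth
    for _ in range(n_swaps):
        e = edges[rng.integers(len(edges))]
        a, b = rng.choice(n_points, size=2, replace=False)
        P[e][[a, b]] = P[e][[b, a]]           # swap two rows (a transposition)
    return P, edges

P, edges = make_synthetic()
# every corrupted observation is still a valid permutation matrix
for M in P.values():
    assert (M.sum(axis=0) == 1).all() and (M.sum(axis=1) == 1).all()
```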

(Figure 7 panels: (a) Ground Truth, (b) Random Initialization, (c) Gradient Descent on Orthogonal Group, (d) MatchEIG, (e) Result of Our Method Initialized Randomly, (f) Result of Our Method Initialized from MatchEIG (d).)

Figure 7. Results from running our method on the synthetic dataset as well as some baseline methods such as Maset et al. [59] and minimizing our energy on the orthogonal group. Note that while using the same energy function, considering the Birkhoff Polytope (ours) rather than the orthogonal group leads to much more visually appealing results with higher recall.

F.
Examples on the Willow Dataset

Finding the Optimum Solution. We now show matches visualized after running our synchronization on the Willow dataset [20]. For the sake of space, we had omitted some of these results from the paper. Fig. 8 plots our findings, where our algorithm is able to converge from challenging initial matches. Note that these results only show the MAP estimates explained in the main paper. It is possible to extend our approach with some form of geometric constraints, as in Wang et al. [88]. We leave this as future work.

Uncertainty Estimation. There are not many well-accepted standards for evaluating uncertainty. Hence, in the main paper, we resorted to what we refer to as the top-K error. We now briefly supply more details. The purpose of the experiment is to measure the quality of the sample proposals generated by our Birkhoff-RLMC. To this end, we introduce a random sampler that generates K arbitrary solutions that are highly likely to be irrelevant (bad proposals). These solutions alone do not solve the problem. However, if we were to consider the result correct whenever either the found optimum or a random proposal contains the correct match, i.e. append the additional K − 1 samples to the solution set, then even a random sampler is guaranteed to increase the recall. Similarly, samples from Birkhoff-RLMC will also yield higher recall. The question then is: can Birkhoff-RLMC do better than a random sampler? To give an answer, we record the relative improvement in recall both for the random sampler and for Birkhoff-RLMC. The top-K error for different choices of K is what we presented in Table 2 of the main paper. We further visualize these solutions in columns (c) to (f) of the same table. To do so, we simply retain the assignments with the K highest scores in the solution X. Note that the entire X acts as a confidence map itself (column d). We then use the MATLAB command imagesc on X with the jet colormap. Next, we provide insights into how the sampler works in practice. Fig. 9 plots the objective value attained as the iterations of the Birkhoff-RLMC sampler proceed. These plots look different for different values of β. Here we use β = 0.08 and show, on both the Motorbike and Winebottle samples (used above) and for 1000 iterations, the behaviour of sample selection. In the paper, we accumulated these samples and estimated the confidence maps. Note that, occasionally, the sampler can discover better solutions than the one provided.
This is due to two reasons: 1) rarely, we can jump over local minima; 2) the initial solution is a discrete one, and it is often plausible for a doubly stochastic (relaxed) solution matrix to have lower cost.
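The top-K evaluation described above amounts to counting a match as correct whenever the ground-truth assignment lies among the K highest-scoring entries of the corresponding row of X. A minimal sketch (function name and data layout are ours):

```python
import numpy as np

def topk_recall(X, gt, K=2):
    """X: (n, n) soft assignment/confidence matrix; gt[i] is the
    ground-truth column for row i. A row counts as a hit if its
    ground-truth entry is among the K largest scores in that row."""
    topk = np.argsort(-X, axis=1)[:, :K]
    hits = sum(gt[i] in topk[i] for i in range(X.shape[0]))
    return hits / X.shape[0]

X = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6],
              [0.4, 0.5, 0.1]])
gt = np.array([0, 1, 1])
assert topk_recall(X, gt, K=1) == 2 / 3  # rows 0 and 2 are correct
assert topk_recall(X, gt, K=2) == 1.0    # row 1's ground truth is 2nd best
```

Increasing K can only increase the recall, which is why the comparison against a random sampler at the same K is the meaningful quantity.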

Figure 8. Matching results on sample images from the Winebottle (top two rows) and Motorbike (bottom two rows) datasets in the form of correspondences. The first sub-row of each row shows the initial matches computed by running a pairwise Hungarian algorithm, whereas the second sub-row (tagged Optimized) shows our solution. Colored dots are the corresponding points. A line segment is shown in red if it is found to be wrong when checked against the ground truth. Likewise, blue indicates a correct match. For both of the datasets we use the joint information by optimizing for the cycle consistency, whereas the pairwise solutions make no use of such multiview correspondences. Note that, thanks to the cycle consistency, once a match is correctly identified, its associated point has the tendency to be correctly matched across the entire dataset.

G. Application to Establishing 3D Correspondences

We now apply our algorithm to solve correspondence problems on isolated 3D shapes provided by the synthetic Tosca dataset [13], which is composed of non-rigidly deforming 3D meshes of various objects such as cat, human, Centaur, dog, wolf

(Figure 9 plots: cost (objective value) vs. number of iterations. (a) Sampling on the Winebottle; (b) Sampling on the Motorbike.)

Figure 9. Iterates of our sampling algorithm. With β = 0.085, our Birkhoff-RLMC sampler wanders around the local mode and draws samples on the Birkhoff Polytope. It can also happen that better solutions are found and returned. The purple line indicates the starting point, which is only a slight perturbation of the found solution.

and horse. The cat object from this dataset, along with ten of its selected ground truth correspondences, is shown in Fig. 10. This problem is of great concern among the 3D vision community, and the availability of the complete shapes plays well with the assumptions of our algorithm, i.e. the permutations are total. It is important to mention that when the initial correspondences are good (∼ 80%), all methods, including ours, can obtain 100% accuracy [42]. Therefore, to be able to put our method to test, we intentionally degrade the initial correspondence estimation algorithm we use.

Initialization. Analogous to the 2D scenario, we obtain the initial correspondences by running a siamese PointNet-like [69] network regressing the assignments between two shapes. Unlike 2D, we do not need to take care of occlusions, visibility or missing keypoint locations, as the 3D shapes in this dataset are full and clean. On average, we split half of the dataset for training and use the other half for testing. Note that such an amount of data is really insufficient to train this network, resulting in suboptimal predictions. We gradually downsample the point sets uniformly to sizes of Ns = {200, 50, 20} to get the keypoints. At this point, it would be possible to use sophisticated keypoint prediction strategies to establish the optimum cardinality and the set of points under correspondence. Note that it is sufficient to compute the keypoints per shape, as we do not assume the presence of a particular order. Each keypoint is matched to another by the network prediction. The final stage of the prediction results in a soft assignment matrix, on which we apply the Hungarian algorithm to obtain the initial matches.
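The last step, projecting the network's soft assignment onto a hard permutation, is a linear assignment problem; with SciPy's Hungarian solver it is a short sketch (the wrapper name is ours):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def soft_to_permutation(S):
    """Project a soft assignment matrix S onto a permutation matrix by
    maximizing the total assignment score (Hungarian algorithm)."""
    rows, cols = linear_sum_assignment(-S)  # maximize => negate the scores
    P = np.zeros_like(S)
    P[rows, cols] = 1.0
    return P

S = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.2, 0.6],
              [0.3, 0.6, 0.1]])
P = soft_to_permutation(S)
assert np.array_equal(P, np.eye(3)[[0, 2, 1]])  # 0->0, 1->2, 2->1
```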

Figure 10. Visualizations of the cat object from the Tosca dataset with the ground truth correspondences depicted.

Preliminary Results. On these initial sets of correspondences, we run our algorithm and the state-of-the-art methods as described in the main paper. We count and report the percentage of correctly matched correspondences (recall) in Tab. 3 for only two objects, Cat and Michael; running on the full dataset is left as future work. It is seen that our algorithm consistently outperforms the competing approaches. The difference is less visible when the number of samples starts to dominate the number of edges. This is because none of the algorithms is able to find enough consistency votes to correct for the errors.

Table 3. Correspondence refinement on 3D inputs. We show our results on the meshes of Tosca [13] dataset. The cells show the percentage of correctly detected matches (recall).

                       N = 200                                N = 50                                 N = 20
         Initial  MatchEIG  Wang et al.  Ours    Initial  MatchEIG  Wang et al.  Ours    Initial  MatchEIG  Wang et al.  Ours
Cat      38.42%   32.92%    37.83%       39.08%  53.38%   55.42%    56.13%       56.93%  35.56%   37.33%    38.33%       40.33%
Michael  30.64%   34.39%    35.92%       36.12%  46.40%   52.23%    54.93%       54.62%  45.76%   51.05%    51.63%       55.95%