Article

The International Journal of Robotics Research 1–43
© The Author(s) 2017
Reprints and permissions: sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/0278364917721629
journals.sagepub.com/home/ijr

No belief propagation required: Belief space planning in high-dimensional state spaces via factor graphs, the matrix determinant lemma, and re-use of calculation

Dmitry Kopitkov1 and Vadim Indelman2

Abstract
We develop a computationally efficient approach for evaluating the information-theoretic term within belief space planning (BSP), where during belief propagation the state vector can be constant or augmented. We consider both unfocused and focused problem settings, where uncertainty reduction of the entire system or only of chosen variables is of interest, respectively. State-of-the-art approaches typically propagate the belief state, for each candidate action, through calculation of the posterior information (or covariance) matrix and subsequently compute its determinant (required for entropy). In contrast, our approach reduces runtime complexity by avoiding these calculations. We formulate the problem in terms of factor graphs and show that belief propagation is not needed, requiring instead a one-time calculation that depends on (the increasing with time) state dimensionality, and per-candidate calculations that are independent of the latter. To that end, we develop an augmented version of the matrix determinant lemma, and show that computations can be re-used when evaluating the impact of different candidate actions. These two key ingredients and the factor graph representation of the problem result in a computationally efficient (augmented) BSP approach that accounts for different sources of uncertainty and can be used with various sensing modalities. We examine the unfocused and focused instances of our approach, and compare it with the state of the art, in simulation and using real-world data, considering problems such as autonomous navigation in unknown environments, measurement selection and sensor deployment. We show that our approach significantly reduces running time without any compromise in performance.

Keywords Belief space planning, active SLAM, informative planning, active inference, autonomous navigation

1. Introduction

Decision making under uncertainty and belief space planning (BSP) are fundamental problems in robotics and artificial intelligence, with applications including autonomous driving, surveillance, sensor deployment, object manipulation, and active simultaneous localization and mapping (SLAM). The goal is to autonomously determine the best actions according to a specified objective function, given the current belief about random variables of interest that could represent, for example, robot poses, a tracked target, or the mapped environment, while accounting for different sources of uncertainty.

Since the true state of interest is typically unknown and only partially observable through acquired measurements, it can only be represented through a probability distribution conditioned on available data. BSP and decision-making approaches reason about how this distribution (the belief) evolves as a result of candidate actions and future expected observations. Such a problem is an instantiation of a partially observable Markov decision process (POMDP), while calculating an optimal solution of a POMDP was proven to be computationally intractable (Kaelbling et al., 1998) for all but the smallest problems due to the curse of history and the curse of dimensionality. Recent research has therefore focused on the development of sub-optimal approaches that trade off optimality and runtime complexity. These approaches can be classified into those that discretize the action, state, and measurement spaces, and those that operate over continuous spaces.

1 Technion Autonomous Systems Program (TASP), Technion - Israel Institute of Technology, Haifa, Israel
2 Department of Aerospace Engineering, Technion - Israel Institute of Technology, Haifa, Israel

Corresponding author:
Dmitry Kopitkov, Technion Autonomous Systems Program (TASP), Technion - Israel Institute of Technology, Haifa 32000, Israel.
Email: [email protected]

Approaches from the former class include point-based value iteration methods (Pineau et al., 2006), simulation-based approaches (Stachniss et al., 2005), and sampling-based approaches (Agha-Mohammadi et al., 2014; Prentice and Roy, 2009). On the other hand, approaches that avoid discretization are often termed direct trajectory optimization methods (e.g. Indelman et al., 2015; Patil et al., 2014; Platt et al., 2010; Van Den Berg et al., 2012; Walls et al., 2015); these approaches typically calculate a locally optimal solution from a given nominal solution.

Decision making under uncertainty, also sometimes referred to as active inference, and BSP can be formulated as selecting an optimal action from a set of candidates, based on some cost function. In information-based decision making, the cost function typically contains terms that evaluate the expected posterior uncertainty upon action execution, with commonly used costs including (conditional) entropy and mutual information (MI). Thus, for Gaussian distributions the corresponding calculations typically involve calculating a determinant of a posteriori covariance (information) matrices and, moreover, these calculations are to be performed for each candidate action.

Decision making and BSP become even more challenging problems when considering high-dimensional state spaces. Such a setup is common in robotics, for example in the context of BSP in uncertain environments, active SLAM, sensor deployment, graph reduction, and graph sparsification. In particular, calculating a determinant of an information (covariance) matrix for an n-dimensional state is in general O(n^3), and is smaller for sparse matrices as in SLAM problems (Bai et al., 1996).

Moreover, state-of-the-art approaches typically perform these calculations from scratch for each candidate action. For example, in the context of active SLAM, state-of-the-art BSP approaches first calculate the posterior belief within the planning horizon, and then use that belief to evaluate the objective function, which typically includes an information-theoretic term (Huang et al., 2005; Indelman et al., 2015; Kim and Eustice, 2014; Valencia et al., 2013). These approaches then determine the best action by performing the mentioned calculations for each action from a given set of candidate actions, or by local search using dynamic programming or gradient descent (for a continuous setting).

Sensor deployment is another example of decision making in high-dimensional state spaces. The basic formulation of the problem is to determine locations to deploy the sensors such that some metric can be measured most accurately through the entire area (e.g. temperature). The problem can also be viewed as selecting the optimal action from the set of candidate actions (available locations), and the objective function usually contains a term of uncertainty, such as the entropy of the posterior system (Krause et al., 2008). Also here, state-of-the-art approaches evaluate a determinant over large posterior covariance (information) matrices for each candidate action, and do so from scratch (Zimmerman, 2006; Zhu and Stein, 2006).

A similar situation also arises in measurement selection (Carlone et al., 2014; Davison, 2005) and graph pruning (Carlevaris-Bianco et al., 2014; Huang et al., 2012; Mazuran et al., 2014; Vial et al., 2011) in the context of long-term autonomy in SLAM. In the former case, the main idea is to determine the most informative measurements (e.g. image features) given measurements provided by robot sensors, thereby discarding uninformative and redundant information. Such a process typically involves reasoning about MI (see e.g. Chli and Davison, 2009; Davison, 2005) for each candidate selection. Similarly, graph pruning and sparsification can be considered as instances of decision making in high-dimensional state spaces (Carlevaris-Bianco et al., 2014; Huang et al., 2012), with the decision corresponding to determining what nodes to marginalize out (Ila et al., 2010; Kretzschmar and Stachniss, 2012), and avoiding the resulting fill-in in an information matrix by resorting to sparse approximations of the latter (Carlevaris-Bianco et al., 2014; Huang et al., 2012; Mazuran et al., 2014; Vial et al., 2011). Also here, existing approaches typically involve calculation of the determinant of large matrices for each candidate action.

In this paper we develop a computationally efficient and exact approach for decision making and BSP in high-dimensional state spaces that addresses the aforementioned challenges. The key idea is to use the (augmented) general matrix determinant lemma to calculate action impact with complexity independent of state dimensionality n, while re-using calculations between evaluating the impact of different candidate actions. Our approach supports general observation and motion models, and non-myopic planning, and is thus applicable to a wide range of applications such as those mentioned above, where fast decision making and BSP in high-dimensional state spaces is required.

Although many particular domains can be specified as decision-making and BSP problems, they all can be classified into two main categories: one where the state vector is fixed during belief propagation and another where the state vector is augmented with new variables. Sensor deployment is an example of the first case, while active SLAM, where future robot poses are introduced into the state, is an example of the second case. Conceptually the first category is a particular case of the second, but as we will see both will require different solutions. Therefore, in order to differentiate between these two categories, in this paper we consider the first category (fixed-state) as a BSP problem, and the second category (augmented-state) as an augmented BSP problem.

Moreover, we show the proposed concept is applicable also to active focused inference. Unlike the unfocused case discussed thus far, active focused inference approaches aim to reduce the uncertainty over only a predefined set of the variables. The two problems can have significantly different optimal actions, with an optimal solution for the unfocused case potentially performing badly for the focused setup, and vice versa (see e.g. Levine and How, 2013). While the set of focused variables can be small, exact state-of-the-art approaches calculate the marginal posterior covariance (information) matrix for each action, which involves a computationally expensive Schur complement operation. For example, Mu et al. (2015) calculate a posterior covariance matrix per each measurement and then use the selection matrix in order to obtain the marginal of the focused set. Levine and How (2013) developed an approach that determines MI between focused and unfocused variables through message-passing algorithms on Gaussian graphs, but their approach is limited to only graphs with unique paths between the relevant variables.

In contrast, we provide a novel way to calculate the posterior entropy of focused variables, which is fast, simple, and general; yet, it does not require calculation of a posterior covariance matrix. In combination with our re-use algorithm, it provides a focused decision-making solver that is significantly faster compared with standard approaches.

Calculating the posterior information matrix in augmented BSP problems involves augmenting an appropriate prior information matrix with zero rows and columns, i.e. zero padding, and then adding new information due to a candidate action (see Figure 1). While the general matrix determinant lemma is an essential part of our approach, unfortunately it is not applicable to the mentioned augmented prior information matrix since the latter is singular (even though the posterior information matrix is full rank). In this paper, we develop a new variant of the matrix determinant lemma, called the augmented matrix determinant lemma (AMDL), that addresses the general augmentation of a future state vector. Based on AMDL, we then develop an augmented BSP approach, considering both unfocused and focused cases.

Finally, there is also a relation to the recently introduced concept of decision making in a conservative sparse information space (Indelman, 2015b, 2016). In particular, considering unary observation models (involving only one variable) and greedy decision making, it was shown that appropriately dropping all correlation terms and remaining only with a diagonal covariance (information) matrix does not sacrifice performance while significantly reducing computational complexity. While the approach presented herein confirms this concept for the case of unary observation models, our approach addresses a general non-myopic decision-making problem, with arbitrary observation and motion models. Moreover, compared with state-of-the-art approaches, our approach significantly reduces runtime while providing identical decisions.

To summarize, our main contributions in this paper are as follows: (a) we formulate (augmented) BSP in terms of factor graphs, which allow us to see the problem in a more intuitive and simple way; (b) we develop an augmented version of the matrix determinant lemma (AMDL), where the subject matrix is first augmented by zero rows/columns and only then new information is introduced; (c) we develop an approach for non-myopic focused and unfocused (augmented) BSP in high-dimensional state spaces that uses the (augmented) matrix determinant lemma to avoid calculating determinants of large matrices, with per-candidate complexity independent of state dimension; (d) we show how calculations can be re-used when evaluating the impacts of different candidate actions; we integrate the calculations re-use concept and AMDL into a general and highly efficient BSP solver that does not involve explicit calculation of posterior belief evolution for different candidate actions, naming this approach rAMDL; (e) we introduce an even more efficient rAMDL variant specifically addressing the sensor deployment problem.

This paper is an extension of the work presented in Kopitkov and Indelman (2016, 2017). As a further contribution, in this manuscript we show the problem of (augmented) BSP can be formulated through factor graphs (Kschischang et al., 2001), thereby providing insights into the sparsity patterns of Jacobian matrices that correspond to future posterior beliefs. We exploit the sparsity of the augmented BSP problem and provide a solution for this case which outperforms our previous approach from Kopitkov and Indelman (2017). In addition, we present an improved approach, compared with Kopitkov and Indelman (2016), to solve the non-augmented BSP problem. Moreover, we develop a more efficient variant of our approach, specifically addressing the sensor deployment problem (Section 4.1), and show in detail how our augmented BSP method can be applied particularly to autonomous navigation in unknown environments and active SLAM (Section 4.2).

This paper also includes an extensive experimental evaluation using synthetic and real-world data (Section 7), with the aim of comparing the time performance of all the approaches proposed herein in a variety of challenging scenarios, and benchmarking against the state of the art. Finally, here we attempt to give the reader some more insights on the problem, discussing a theoretical interpretation of the information gain (IG) metric, its relation to MI (Davison, 2005; Kaess and Dellaert, 2009) (Section 3.5), and practical remarks regarding the proposed approach.

This paper is organized as follows. Section 2 introduces the concepts of BSP, and gives a formal statement of the problem. Section 3 describes our approach rAMDL for the general formulation. Section 4 tailors the approach for specific domains, providing even more efficient solutions to a number of them. In Section 5 standard approaches are discussed as the main state-of-the-art alternatives to rAMDL. Further, in Section 6 we analyze the runtime complexity of the approaches presented herein and their alternatives. Section 7 presents experimental results, evaluating the proposed approach and comparing it against the mentioned state of the art. Conclusions are drawn in Section 8. To improve readability, proofs of several lemmas are given in the Appendix.

Fig. 1. Illustration of the construction of $\Lambda_{k+L}$ for a given candidate action in the augmented BSP case. First, $\Lambda^{Aug}_{k+L}$ is created by adding $n'$ zero rows and columns. Then, the new information of the belief is added through $\Lambda_{k+L} = \Lambda^{Aug}_{k+L} + A^T A$.

2. Notation and problem definition

In this paper, we develop computationally efficient approaches for BSP. As evaluating action impact involves inference over an appropriate posterior, we first formulate the corresponding inference problem.

Consider a high-dimensional problem-specific state vector $X_k \in \mathbb{R}^n$ at time $t_k$. In different applications the state $X_k$ can represent the robot configuration and poses (optionally for the whole history), environment-related variables, or any other variables to be estimated. In addition, consider factors $F_i = \{f_i^1(X_i^1), \ldots, f_i^{n_i}(X_i^{n_i})\}$ that were added at time $0 \le t_i \le t_k$, where each factor $f_i^j(X_i^j)$ represents a specific measurement model, motion model or prior, and as such involves appropriate state variables $X_i^j \subseteq X_i$.

The joint probability distribution function (pdf) can then be written as

$P(X_k \mid H_k) \propto \prod_{i=0}^{k} \prod_{j=1}^{n_i} f_i^j(X_i^j)$,    (1)

where $H_k$ is the history that contains all the information gathered until time $t_k$ (measurements, controls, etc.).

The inference problem can be naturally represented by a factor graph (Kschischang et al., 2001), which is a bipartite graph $G = (F, \Theta, E)$ with two node types: variable nodes $\theta_i \in \Theta$ and factor nodes $f_i \in F$ (see e.g. Figures 2 and 3). Variable nodes represent state variables that need to be estimated, while factor nodes express different constraints between different variables. Each factor node is connected by edges $e_{ij} \in E$ to the variable nodes that are involved in the corresponding constraint. Such a formulation is general and can be used to represent numerous inference problems (e.g. SLAM), while exploiting sparsity. Furthermore, computationally efficient approaches, based on such a formulation and exploiting its natural sparsity, have been developed recently (Kaess et al., 2012, 2008).

As is common in many inference problems, we will assume that all factors have a Gaussian form:

$f_i^j(X_i^j) \propto \exp\left(-\frac{1}{2}\left\| h_i^j(X_i^j) - r_i^j \right\|^2_{\Sigma_i^j}\right)$,    (2)

with the appropriate model

$r_i^j = h_i^j(X_i^j) + \upsilon_i^j, \quad \upsilon_i^j \sim N(0, \Sigma_i^j)$,    (3)

where $h_i^j$ is a known nonlinear function, $\upsilon_i^j$ is zero-mean Gaussian noise, and $r_i^j$ is the expected value of $h_i^j$ ($r_i^j = E[h_i^j(X_i^j)]$). Such a factor representation is a general way to express information about the state. In particular, it can represent a measurement model, in which case $h_i^j$ is the observation model and $r_i^j$ and $\upsilon_i^j$ are the actual measurement $z$ and measurement noise, respectively. Similarly, it can also represent a motion model (see Section 4.2). A maximum a posteriori (MAP) inference can be calculated efficiently (see e.g. Kaess et al., 2012) such that

$P(X_k \mid H_k) = N(X_k^*, \Sigma_k)$,    (4)

where $X_k^*$ and $\Sigma_k$ are the mean vector and covariance matrix, respectively.

We shall refer to the posterior $P(X_k \mid H_k)$ as the belief and write

$b[X_k] \doteq P(X_k \mid H_k)$.    (5)

In the context of BSP, we typically reason about the evolution of future beliefs $b[X_{k+l}]$ at different look-ahead steps $l$ as a result of different candidate actions. A particular candidate action can provide unique information (future observations and controls) in the form of newly introduced factors (see Figures 2 and 3) and can be more or less beneficial for specific tasks such as reducing future uncertainty. For example, in a SLAM application, choosing a trajectory that is close to the mapped landmarks (see Figure 3) will reduce uncertainty because of loop closures. Furthermore, conceptually each candidate action can introduce different additional state variables into the future state vector, as in the case of the smoothing SLAM formulation in Figure 3 where the state is augmented by a (various) number of future robot poses.

Therefore, in order to reason about the belief $b[X_{k+L}]$, first it needs to be carefully modeled. More specifically, let us focus on a non-myopic candidate action $a \doteq \{\bar{a}_1, \ldots, \bar{a}_L\}$ that is a sequence of myopic actions with planning horizon $L$. Each action $\bar{a}_l$ can be represented by new factors $F_{k+l} = \{f_{k+l}^1(X_{k+l}^1), \ldots, f_{k+l}^{n_{k+l}}(X_{k+l}^{n_{k+l}})\}$ and, possibly, new state variables $X_{new}^{k+l}$ ($1 \le l \le L$) that are acquired/added while applying $\bar{a}_l$. Similar to (1), the future belief $b[X_{k+L}]$ can be explicitly written as

$b[X_{k+L}] \propto b[X_k] \prod_{l=k+1}^{k+L} \prod_{j=1}^{n_l} f_l^j(X_l^j)$,    (6)

where $X_{k+L} \doteq \{X_k \cup X_{new}^{k+1} \cup \cdots \cup X_{new}^{k+L}\}$ contains old and new state variables. For example, in Figure 3 at each future time step $k + l$, a new robot pose $x_{k+l}$ is added to the state; new motion model factors $\{f_7, f_8, f_9\}$ and a new observation model factor are also introduced. Similar expressions can also be written for any other look-ahead step $l$. Observe in the above belief (6) that the future factors depend on future observations, whose actual values are unknown.

Fig. 2. Illustration of belief propagation in the non-augmented BSP SLAM scenario. Prior belief $b[X_k]$ and posterior belief $b[X_{k+L}]$ are represented through factor graphs, with the state vector being $X_k = X_{k+L} = \{x_1, x_2, l_1, l_2, l_3\}$, where $x_i$ and $l_i$ denote robot poses and landmarks, respectively. Factor $f_0$ is a prior on the robot's initial pose $x_1$; factors $\{f_2, f_3, f_4\}$ are priors on landmark positions; factors between robot poses represent motion models; factors between a pose and a landmark represent observation models. Candidate action $a$ introduces new measurements that are represented by new factors $\{f_5, f_6, f_7\}$.

Fig. 3. Illustration of belief propagation in the augmented BSP SLAM scenario. Prior belief $b[X_k]$ and posterior belief $b[X_{k+L}]$ are represented through factor graphs, with the appropriate state vectors being $X_k = \{x_1, x_2, x_3, l_1, l_2\}$ and $X_{k+L} = \{x_1, x_2, x_3, l_1, l_2, x_4, x_5, x_6\}$, where $x_i$ and $l_i$ represent robot poses and landmarks, respectively. Factor $f_0$ is a prior on the robot's initial pose $x_1$; factors $\{f_2, f_3, f_4\}$ are priors on landmark positions; factors between robot poses represent motion models; factors between a pose and a landmark represent observation models. Candidate action $a$ makes the robot re-observe landmark $l_1$; in order to do so it introduces new robot poses $\{x_4, x_5, x_6\}$ and new factors $\{f_7, f_8, f_9, f_{10}\}$.
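The augmented construction of Figure 1 can be sketched numerically. The snippet below (our illustration with arbitrary dimensions, not the authors' code) builds the zero-padded augmented prior information matrix and confirms it is singular, which is exactly why the standard matrix determinant lemma cannot be applied directly and an augmented variant (AMDL) is needed, while the posterior after adding new factor rows is full rank.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_new, m = 6, 2, 3              # old state dim, newly added variables, new factor rows

S = rng.standard_normal((n, n))
Lam_k = S @ S.T + n * np.eye(n)    # prior information matrix (full rank)

# Zero-padded augmented prior (Figure 1): singular by construction
N = n + n_new
Lam_aug = np.zeros((N, N))
Lam_aug[:n, :n] = Lam_k
assert np.linalg.matrix_rank(Lam_aug) == n

# New factor rows involve the new variables, so the posterior is full rank
A = rng.standard_normal((m, N))
Lam_post = Lam_aug + A.T @ A
assert np.linalg.matrix_rank(Lam_post) == N
```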

It is important to note that, according to our definition from Section 1, new variables are added only in the augmented setting of the BSP problem, e.g. in the active SLAM context in Figure 3 where new robot poses are introduced by a candidate action. On the other hand, in a non-augmented BSP setting, the states $X_{k+L}$ and $X_k$ are identical, while the beliefs $b[X_{k+L}]$ and $b[X_k]$ are still conditioned on different data. For example, in sensor deployment and measurement selection (see Figure 2) problems, the candidate actions are all possible subsets of sensor locations and of acquired observations, respectively. Here, when applying a candidate action, new information about $X_k$ is brought in, but the state vector itself is unaltered.

In contrast, in the augmented BSP problem new variables are always introduced. In particular, in both smoothing and filtering formulations of SLAM, candidate actions (trajectories) will introduce both new information (future measurements) and also new variables (future robot poses). While in the filtering formulation old pose variables are marginalized out, the smoothing formulation instead keeps past and current robot poses and newly mapped landmarks in the state vector (see Figure 3), which is beneficial for better estimation accuracy and sparsity. As such, the smoothing formulation is an excellent example of an augmented BSP problem, whereas the filtering formulation can be considered as a focused BSP scenario as described below.

As such, the non-augmented BSP setting can be seen as a special case of augmented BSP. In order to use similar notation for both problems, however, in this paper we will consider $X_{new}^{k+l}$ to be an empty set for the former case and non-empty for augmented BSP.

It is not difficult to show (see e.g. Indelman et al., 2015) that in the case of non-augmented BSP the posterior information matrix of the belief $b[X_{k+L}]$, which is the inverse of the belief's covariance matrix, is given by

$\Lambda_{k+L} = \Lambda_k + \sum_{l=k+1}^{k+L} \sum_{j=1}^{n_l} (H_l^j)^T \cdot (\Sigma_l^j)^{-1} \cdot H_l^j$,    (7)

where $\Lambda_k$ is the prior information matrix and $H_l^j \doteq \nabla_x h_l^j$ are the Jacobian matrices of the $h_l^j$ functions (see (2)) for all the new factor terms in (6).

As was already mentioned, in the case of augmented BSP, the joint state $X_{k+L}$ also includes new variables (with respect to the current state $X_k$). Considering $X_k \in \mathbb{R}^n$, first, $n'$ new variables are introduced into the future state vector $X_{k+L} \in \mathbb{R}^N$
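As a concrete illustration of (7), the following sketch (illustrative numpy code with random data, not the paper's implementation) accumulates the information contribution $H^T \Sigma^{-1} H$ of each new factor onto a prior information matrix, and checks that information never decreases: the posterior minus the prior is positive semidefinite.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
S = rng.standard_normal((n, n))
Lam_k = S @ S.T + n * np.eye(n)          # prior information matrix

# Two new factors: each a (Jacobian H, noise covariance Sigma) pair
new_factors = [
    (rng.standard_normal((2, n)), 0.10 * np.eye(2)),
    (rng.standard_normal((1, n)), 0.05 * np.eye(1)),
]

# Posterior information matrix, cf. (7)
Lam_post = Lam_k.copy()
for H, Sig in new_factors:
    Lam_post += H.T @ np.linalg.inv(Sig) @ H

# Adding factors can only add information: Lam_post - Lam_k is PSD
assert np.all(np.linalg.eigvalsh(Lam_post - Lam_k) >= -1e-9)
```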

with $N = n + n'$, and then new factors involving appropriate variables from $X_{k+L}$ are added to form a posterior belief $b[X_{k+L}]$, as shown in (6).

Consequently, in the augmented BSP scenario the posterior information matrix of the belief $b[X_{k+L}]$, i.e. $\Lambda_{k+L}$, can be constructed by first augmenting the current information matrix $\Lambda_k$ with $n'$ zero rows and columns to obtain $\Lambda^{Aug}_{k+L} \in \mathbb{R}^{N \times N}$, and thereafter adding new information to it, as illustrated in Figure 1 (see e.g. Indelman et al., 2015):

$\Lambda_{k+L} = \Lambda^{Aug}_{k+L} + \sum_{l=k+1}^{k+L} \sum_{j=1}^{n_l} (H_l^j)^T \cdot (\Sigma_l^j)^{-1} \cdot H_l^j$,    (8)

where $H_l^j \doteq \nabla_x h_l^j$ are augmented Jacobian matrices of all new factors in (6), linearized about the current estimate of $X_k$ and about the initial values of newly introduced variables.

After stacking all new Jacobians in (7) and (8) into a single matrix $\widetilde{A}$, and combining all noise matrices into a block-diagonal $\Sigma$, we respectively obtain

$\Lambda_{k+L} = \Lambda_k + \widetilde{A}^T \cdot \Sigma^{-1} \cdot \widetilde{A} = \Lambda_k + A^T \cdot A$,    (9)

$\Lambda_{k+L} = \Lambda^{Aug}_{k+L} + \widetilde{A}^T \cdot \Sigma^{-1} \cdot \widetilde{A} = \Lambda^{Aug}_{k+L} + A^T \cdot A$,    (10)

where

$A \doteq \Sigma^{-\frac{1}{2}} \cdot \widetilde{A}$    (11)

is an $m \times N$ matrix that represents both the Jacobians and the noise covariances of all new factor terms in (6). The above equations can be considered as a single iteration of Gauss-Newton optimization and, similar to prior work (Indelman et al., 2015; Kim and Eustice, 2014; Van Den Berg et al., 2012), we take a maximum-likelihood assumption by assuming they sufficiently capture the impact of a candidate action. Under this assumption, the posterior information matrix $\Lambda_{k+L}$ is independent of (unknown) future observations (Indelman et al., 2015). One can further incorporate reasoning about whether a future measurement will indeed be acquired (Chaves et al., 2015; Indelman et al., 2015; Walls et al., 2015); however, this is outside the scope of this paper.

Each block row of matrix $A$ represents a single factor from the new terms in (6) and has a sparse structure. Only a limited number of its sub-blocks is non-zero, i.e. the sub-blocks that correspond to the involved variables $X_l^j$ in the relevant factor $f_l^j(X_l^j)$.

For notational convenience, we define the set of non-myopic candidate actions by $\mathcal{A} = \{a_1, a_2, \ldots\}$ with appropriate Jacobian matrices $\mathbb{A} = \{A_1, A_2, \ldots\}$. Although the planning horizon is not explicitly shown, each $a \in \mathcal{A}$ can represent a future belief $b[X_{k+L}]$ for a different number of look-ahead steps $L$.

A general objective function in decision making/BSP can be written as (Indelman et al., 2015)

$J(a) \doteq \mathbb{E}_{Z_{k+1:k+L}} \left[ \sum_{l=0}^{L-1} c_l(b[X_{k+l}], u_{k+l}) + c_L(b[X_{k+L}]) \right]$,    (12)

with $L$ immediate cost functions $c_l$, one for each look-ahead step, and one cost function for the terminal future belief, $c_L$. Each such cost function can include a number of different terms related to aspects such as an information measure of the future belief, distance to goal, and energy spent on control. Arguably, evaluating the information terms involves the heaviest calculations of $J$.

Thus, in this paper, we focus only on the information-theoretic term of the terminal belief $b[X_{k+L}]$, and consider differential entropy $\mathcal{H}$ (further referred to just as entropy) and IG as the cost functions. Both can measure the amount of information of the future belief $b[X_{k+L}]$, and will lead to the same optimal action. Yet, calculation of one is sometimes more efficient than the other, as will be shown in Section 3. Therefore, we consider two objective functions:

$J_{\mathcal{H}}(a) \doteq \mathcal{H}(b[X_{k+L}])$,    (13)

$J_{IG}(a) \doteq \mathcal{H}(b[X_k]) - \mathcal{H}(b[X_{k+L}])$,    (14)

where the information matrix $\Lambda_{k+L}$, which corresponds to the belief $b[X_{k+L}]$, is a function of candidate $a$'s Jacobian matrix $A$, see (9) and (10). The optimal candidate $a^*$, which produces the most certain future belief, is then given by $a^* = \arg\min_{a \in \mathcal{A}} J_{\mathcal{H}}(a)$, or by $a^* = \arg\max_{a \in \mathcal{A}} J_{IG}(a)$, with both being mathematically identical.

In particular, for Gaussian distributions, entropy is a function of the determinant of the posterior information (covariance) matrix, i.e. $\mathcal{H}(b[X_{k+L}]) \equiv \mathcal{H}(\Lambda_{k+L})$, and the objective functions can be expressed as

$J_{\mathcal{H}}(a) = \frac{n \cdot \gamma}{2} - \frac{1}{2} \ln |\Lambda_{k+L}|, \quad J_{IG}(a) = \frac{1}{2} \ln \frac{|\Lambda_{k+L}|}{|\Lambda_k|}$    (15)

for BSP, and

$J_{\mathcal{H}}(a) = \frac{N \cdot \gamma}{2} - \frac{1}{2} \ln |\Lambda_{k+L}|, \quad J_{IG}(a) = \frac{n' \cdot \gamma}{2} + \frac{1}{2} \ln \frac{|\Lambda_{k+L}|}{|\Lambda_k|}$    (16)

for augmented BSP, where $\gamma \doteq 1 + \ln(2\pi)$, and $\Lambda_{k+L}$ can be calculated according to (9) and (10). Thus, evaluating $J$ requires determinant calculation of an $n \times n$ (or $N \times N$) matrix, which is in general $O(n^3)$, per candidate action $a \in \mathcal{A}$. In many robotics applications, the state dimensionality can be huge and even increasing with time (e.g. SLAM), and straightforward calculation of the above equations makes real-time planning hardly possible.

So far, the exposition has referred to unfocused BSP problems, where the action impact is calculated by considering all the random variables in the system, i.e. the entire state vector. However, as will be shown in the following, our approach is applicable also to focused BSP problems.

Focused BSP, in both augmented and non-augmented cases, is another important problem, where in contrast to the former case, only a subset of variables is of interest (see e.g. Krause et al., 2008; Levine and How, 2013; Mu et al., 2015). For example, one can look for an action that reduces the uncertainty of the robot's final pose. The complexity of such a problem is much higher, and proposed techniques succeeded to solve it in $O(kn^3)$ (Krause et al., 2008; Levine and How, 2013), with $k$ being the size of the candidate actions set, and in $O(\tilde{n}^4)$ (Mu et al., 2015), with $\tilde{n}$ being the size of the involved clique within a Markov random field representing the system.

Considering the posterior entropy over the focused variables $X^F_{k+L} \subseteq X_{k+L}$, we can write

$J^F_{\mathcal{H}}(a) = \mathcal{H}(X^F_{k+L}) = \frac{n_F \cdot \gamma}{2} + \frac{1}{2} \ln |\Sigma^{M,F}_{k+L}|$,    (17)

where $n_F$ is the dimensionality of the state $X^F_{k+L}$, and $\Sigma^{M,F}_{k+L}$ is the posterior marginal covariance of $X^F_{k+L}$ (suffix $M$ for marginal), calculated by simply retrieving the appropriate parts of the posterior covariance matrix $\Sigma_{k+L} = \Lambda^{-1}_{k+L}$.

Solving the above problem in a straightforward manner involves $O(N^3)$ operations for each candidate action, where $N = n + n'$ is the dimension of the posterior system. In the following sections we develop a computationally more efficient approach that addresses both unfocused and focused (augmented) BSP problems. As will be seen, this approach naturally supports non-myopic actions and arbitrary factor models $h_i^j$, and it is, in particular, attractive for BSP in high-dimensional state spaces.

In both the developed approach and the alternative methods described in Section 5, we make two assumptions. First, we assume that factors have a statistical model with Gaussian white noise (see (3)). Second, we take the maximum-likelihood assumption (Platt et al., 2010) and consider a single Gauss-Newton iteration (see (9) and (10)) to be sufficient to capture the information impact of a candidate action.

3. Approach

Our approach, rAMDL, utilizes the well-known matrix determinant lemma (Harville, 1998) and re-use of calculations to significantly reduce the computation of the impact of a candidate action, as defined in Section 2, for both the augmented and non-augmented cases of the BSP problem. In Section 3.1 we reformulate these problems in terms of factor graphs, which will allow us to see another, more simplified, picture of the BSP problem. In Section 3.2.1 we develop a novel way to calculate the information-theoretic term for unfocused non-augmented BSP, and then extend it in Section 3.2.2 to the focused case. In addition, in order to significantly reduce the computational complexity of the augmented BSP problem, as defined in Section 2, in Section 3.3.1 we extend the matrix determinant lemma for the matrix augmentation case. We then discuss in Sections 3.3.2-3.3.3.2 how this extension can be used within unfocused and focused augmented BSP. The conclusion from Sections 3.2-3.3 will be that in all investigated types of BSP problem, the information impact of a candidate action can be calculated efficiently given specific prior covariance entries. Further, in Section 3.4 we discuss another key component of rAMDL: the re-use of covariance calculations, which exploits the fact that many calculations can be shared among different candidate actions. Finally, in Section 3.5 we describe the connection between our technique and the MI approach from Davison (2005) and Kaess and Dellaert (2009), and discuss an interesting conceptual meaning of the IG metric.

3.1. BSP as a factor graph

In the following we show that, similarly to the inference problem, the BSP problem can also be formulated in terms of factor graphs. The belief at time $t_k$, $b[X_k]$, can be represented by a factor graph $G_k = (F_k, X_k, E_k)$, where, with a little abuse of notation, we use $X_k$ to denote the estimated variables, $F_k$ is the set of all factors acquired until time $t_k$, and $E_k$ encodes connectivity according to the variables $X_i^j$ involved in each factor $f_i^j$, as defined in (2). The future belief $b[X_{k+L}]$ is constructed by introducing new variables and by adding new factors to the belief $b[X_k]$, as was shown in Section 2. Therefore, it can be represented by a factor graph which is an augmentation of the factor graph $G_k$, as will be shown below.

More specifically, in the case of non-augmented BSP, let $F(a) = \{f^1, \ldots, f^{n_a}\}$ denote all the new factors from (6) introduced by action $a$, with $n_a$ being the number of such factors. This abstracts away the explicit time notation of the factors inside (6), which in turn can be seen as unimportant for the solution of the BSP problem. Then the factor graph of $b[X_{k+L}]$ is the prior factor graph $G_k$ with the newly introduced factor nodes $F(a)$ connected to the appropriate variable nodes (see Figure 4 for illustration). Thus, it can be denoted by $G_{k+L}(a)$:

$G_{k+L}(a) = (F_{k+L}, X_{k+L}, E_{k+L})$,    (18)

where $F_{k+L} = \{F_k, F(a)\}$, $X_{k+L} \equiv X_k$ are the unaltered state variables, and $E_{k+L}$ represents the connectivity between variables and factors according to the definition of each factor (2).

For instance, consider the running example in Figure 2, where action $a$ introduces new measurements in the context of active SLAM. The state vector $X_{k+L} \equiv X_k$ contains robot poses and landmarks $\{x_1, x_2, l_1, l_2, l_3\}$. The old factors within $F_k$ are the priors $\{f_0, f_2, f_3, f_4\}$ and the motion model $f_1$. The candidate action introduces new observation factors $F(a) = \{f_5, f_6, f_7\}$. Thus, the factors of the posterior factor graph $G_{k+L}(a) = (F_{k+L}, X_{k+L}, E_{k+L})$ are $F_{k+L} = \{f_0, f_1, f_2, f_3, f_4, f_5, f_6, f_7\}$, while $E_{k+L}$ contains edges between nodes in $X_{k+L}$ and nodes in $F_{k+L}$, as depicted in the figure.

For simplicity, we denote the augmentation of a factor graph $G_k$ with a set of new factors through the operator $\oplus$. Thus, for the non-augmented BSP setting, we have $G_{k+L}(a) \doteq G_k \oplus F(a)$. In addition, with a slight abuse of notation we will use the same augmentation operator $\oplus$ to define a combination of two factor graphs into one, which will be required in the context of augmented BSP.

Fig. 4. Illustration of belief propagation in a factor graph representation in the non-augmented case. Two actions a_i and a_j are considered, introducing two new factor sets F(a_i) and F(a_j), respectively, into the graph (colored in green).

Fig. 5. Illustration of belief propagation in a factor graph representation in the augmented case. Two actions a_i and a_j are considered, introducing their own factor graphs G(a_i) and G(a_j) (colored in pink) that are connected to the prior G_k through factor sets F^conn(a_i) and F^conn(a_j) (colored in green), respectively.

In the augmented BSP scenario, we denote all the new state variables introduced by action a as X_new, and also separate all the new factors F(a) from (6) into two groups:

    F(a) = { F^new(a), F^conn(a) }.    (19)

Factors connecting only new variables X_new are denoted by F^new(a),

    F^new(a) = { f^1, ..., f^{n_new} },    (20)

while the rest of the factors, connecting between old and new variables, are denoted by F^conn(a),

    F^conn(a) = { f^1, ..., f^{n_conn} }.    (21)

Next, let us denote an action's factor graph by G(a) = (F^new(a), X_new, E_new), with E_new representing connectivity according to the variables involved in each factor of F^new(a). The factor graph that represents the future belief b[X_{k+L}] is then a combination of two factor graphs, the prior G_k and the action's G(a), connected by the factors from F^conn(a) (see Figure 5 for an illustration). Thus,

    G_{k+L}(a) = G_k ⊕ G(a) ⊕ F^conn(a) = ( F_{k+L}, X_{k+L}, E_{k+L} ),    (22)

where F_{k+L} = {F_k, F^new(a), F^conn(a)}, X_{k+L} ≡ {X_k, X_new} is the augmented state vector, and E_{k+L} represents the connectivity between variables and factors according to the factors' definitions. The separation of the factors into two groups allows us to present the future belief's factor graph as a simple graph augmentation, and will also be useful during the derivation of our approach in Section 3.3. Moreover, the reason F^conn(a) is not defined as part of G(a) is that the factors inside F^conn(a) involve state variables outside of G(a).

For instance, consider the running example in Figure 3, where action a represents a candidate trajectory to perform loop closure in a SLAM application. Here the prior state vector X_k contains the old and current robot poses {x_1, x_2, x_3} and the landmarks mapped until now, {l_1, l_2}. The newly introduced state variables X_new are the future robot poses {x_4, x_5, x_6} that represent the candidate trajectory. The old factors within F_k are the prior f_0 on the initial robot pose, the motion models {f_3, f_6}, and the observation factors {f_1, f_2, f_4, f_5}. Here F^new(a) consists of the factors {f_8, f_9} (colored purple), which are connected only to variables from X_new, while F^conn(a) contains the factors {f_7, f_10} (colored green), which connect between variables in X_new and variables in X_k. Thus, the state variables of the posterior factor graph G_{k+L}(a) are X_{k+L} = {x_1, x_2, x_3, x_4, x_5, x_6, l_1, l_2}, and the factors of G_{k+L}(a) are F_{k+L} = {f_0, f_1, f_2, f_3, f_4, f_5, f_6, f_7, f_8, f_9, f_10}.

Note that the new factors in G_{k+L}(a) are not fully defined, as some of them involve future observations that are unknown at planning time. However, the maximum-likelihood assumption taken above implies that the mean vector of b[X_{k+L}] will coincide with the current estimate of X_k and with the initial values of the new variables X_new (Indelman et al., 2015; Platt et al., 2010; Van Den Berg et al., 2012). Knowing the mean vector, it is possible to calculate the Jacobians of both old and new factors within G_{k+L}(a). Since the information matrix Λ = A^T·A is a product of Jacobian matrices, Λ_{k+L} of the future belief b[X_{k+L}] can also be calculated without knowing the future observations. Thus, we can reason about the information (and covariance) matrix of G_{k+L}(a), as was shown in Section 2.

Now we can reformulate the information-theoretic objective of the BSP problem. In order to evaluate the information impact of action a in the non-augmented BSP setting (15), we need to measure the amount of information added to a factor graph after augmenting it with new factors, G_{k+L}(a) = G_k ⊕ F(a). In the case of augmented BSP (16), in order to evaluate the information impact of action a, we need to measure the amount of information added to a factor graph after connecting it to another factor graph G(a) through the factors in F^conn(a), G_{k+L}(a) = G_k ⊕ G(a) ⊕ F^conn(a).

In (9) and (10) we expressed the posterior information matrix Λ_{k+L} of G_{k+L}(a) through the matrix A, which is the weighted Jacobian of the new terms from (6). In non-augmented BSP, each block-row of A represents a specific factor from F(a), while in augmented BSP the block-rows of A represent the factors from F(a) = {F^new(a), F^conn(a)}. The block-columns of A represent all the estimated variables within X_{k+L}. As was mentioned, each factor's block-row is sparse, with non-zero block entries only under the columns of the variables connected to that factor within the factor graph. For example, the block-row of the Jacobian that corresponds to a motion model factor p(x_{k+l} | x_{k+l-1}, u_{k+l-1}) involves only two non-zero block entries, for the state variables x_{k+l} and x_{k+l-1}. The factors of many measurement models, such as projection and range models, will also have only two non-zero blocks (see Figure 6).

We define two properties for any set of factors F that will be used in the following to analyze the complexity of the proposed approach. Denote by M(F) the sum of the dimensions of all factors in F, where the dimension of each factor is the dimension of its expected value r_j from (3). In addition, let D(F) denote the total dimension of all variables involved in at least one factor from F. It is not difficult to show that the Jacobian matrix A ∈ R^{m×n} of F has height m = M(F), and that the number of its columns that are not entirely equal to zero is D(F). The letter D is used here because the density of the information matrix is affected directly by the value of D(F). It is important to note that, for any candidate action a, the total dimension of the new factors M(F(a)) and the dimension of the involved variables D(F(a)) are independent of n, the dimension of the belief b[X_k] at planning time. Instead, both properties are only functions of the planning horizon L.

In the following sections we describe our BSP approach, using the above notions of factor graphs.

3.2. BSP via the matrix determinant lemma

3.2.1. Unfocused case. Information-theoretic BSP involves evaluating the costs from (15).

Table 1. Different partitions of state variables in BSP

Notation                   Description
X_k = X_{k+L}              state vector at times k and k + L
IX                         subset of X_{k+L} with variables involved in new terms in (6)
¬IX                        subset of X_{k+L} with variables not involved in new terms in (6)
X^F_k = X^F_{k+L} = X^F    subset of X_{k+L} with focused variables
X^U_k = X^U_{k+L} = X^U    subset of X_{k+L} with unfocused variables
IX^U                       subset of X^U with variables involved in new terms in (6)
¬IX^U                      subset of X^U with variables not involved in new terms in (6)

Fig. 6. Concept illustration of A's structure. Each column represents some variable from the state vector. Each row represents some factor from (6). Here, A represents a set of factors F = {f_1(x_{i-1}, x_i), f_2(x_i, l_j)}, where the motion model factor f_1, which involves the two poses x_i and x_{i-1}, has non-zero values only at the columns of x_i and x_{i-1}. The observation model factor f_2, which involves the variables x_i and l_j, has non-zero values only at the columns of x_i and l_j.
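The block-sparsity pattern in Figure 6, together with the quantities M(F) and D(F), can be illustrated with a small numerical sketch (our own hypothetical example, not from the paper; variable names, block sizes, and dimensions are arbitrary):

```python
import numpy as np

# Hypothetical mini-example: state X = [x1, x2, x3, l1], each variable a 2-D
# block, so n = 8. Factor set F = {f1(x2, x3), f2(x3, l1)} with residual
# dimensions 2 (motion model) and 1 (observation model).
rng = np.random.default_rng(3)
col = {"x1": 0, "x2": 2, "x3": 4, "l1": 6}   # column offset of each 2-D block
n = 8

def factor_block_row(res_dim, involved):
    """Jacobian block-row of one factor: non-zero only under involved variables."""
    row = np.zeros((res_dim, n))
    for v in involved:
        row[:, col[v]:col[v] + 2] = rng.normal(size=(res_dim, 2))
    return row

A = np.vstack([
    factor_block_row(2, ["x2", "x3"]),   # f1: motion factor between x2 and x3
    factor_block_row(1, ["x3", "l1"]),   # f2: observation of l1 from x3
])

M_F = A.shape[0]                                 # M(F): summed factor dimensions
D_F = int((np.abs(A).sum(axis=0) > 0).sum())     # D(F): dim of involved variables
print(M_F, D_F)                                  # M(F) = 3, D(F) = 6
```

Here the columns of x_1 stay identically zero, so M(F) = 3 and D(F) = 6 regardless of the state dimension n.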

These operations require calculating the determinant of a large n × n matrix (the posterior information matrix), with n being the dimensionality of the state X_{k+L}. State-of-the-art approaches typically perform these calculations from scratch for each candidate action.

In contrast, our approach contains a one-time calculation that depends on the state dimension and is re-used afterwards to calculate the impact of each candidate action (see Section 3.4). As we show below, the latter depends only on M(F(a)) and D(F(a)), while being independent of the state dimension.

Recalling the notation of the previous section, we would like to measure the amount of information gained after the graph augmentation G_{k+L}(a) = G_k ⊕ F(a), and we measure it through the IG as the utility function. It is not difficult to show that the IG from (15) can be written as

    J_IG(a) = (1/2)·ln( |Λ_k + A^T·A| / |Λ_k| ),

where A ∈ R^{m×n} is the Jacobian of the factors in F(a), weighted by their noise, with m = M(F(a)). Using the generalized matrix determinant lemma (Harville, 1998), this equation can be written as

    J_IG(a) = (1/2)·ln| I_m + A·Σ_k·A^T |,   Σ_k ≡ Λ_k^{-1},    (23)

as previously suggested in Ila et al. (2010) and Mu et al. (2015) in the context of compact pose-SLAM and focused active inference.

Equation (23) provides an exact and general solution for information-based decision making, where each action candidate can produce any number of new factors (non-myopic planning) and where the factors themselves can be of any motion or measurement model (unary, pairwise, etc.). In many problem domains, such as SLAM, inference is typically performed in the information space and, as such, the joint covariance matrix Σ_k is not readily available and needs to be calculated upon demand, which is expensive in general.

Consequently, only specific entries of the covariance matrix Σ_k are actually required, and sparse matrix techniques exist to calculate them efficiently (Golub and Plemmons, 1980; Kaess and Dellaert, 2009). More formally, denote by IX the set of all variables that are connected to the factors in F(a) (see Table 1), i.e. the variables involved in at least one of the new factors generated by the currently considered candidate action a; see (6). Clearly, the columns of A that correspond to the rest of the variables, ¬IX, are entirely filled with zeros (see Figure 6). Thus, equation (23) can be re-written as

    J_IG(a) = (1/2)·ln| I_m + IA·Σ_k^{M,IX}·(IA)^T |,    (24)

where IA is constructed from A by removing all zero columns, and Σ_k^{M,IX} is the prior joint marginal covariance of the variables in IX, which should be calculated from the (square root) information matrix Λ_k. Note that the dimension of IX is D(F(a)).

Intuitively, the posterior uncertainty reduction that corresponds to action a is a function of only the prior marginal covariance over the variables involved in F(a) (i.e. Σ_k^{M,IX}) and the new information introduced by F(a)'s Jacobian A, with the latter also involving the same variables IX. Moreover, from the above equation it can be seen that the uncertainty reduction in the posterior will be significant for large entries in A and high prior uncertainty over the variables IX.

In particular, in the case of myopic decision making with unary observation models (which involve only a single state variable), the calculation of IG(a) for different candidate actions only requires recovering the diagonal entries of Σ_k, regardless of the actual correlations between the states, as was recently shown in Indelman (2015b, 2016). However, while in the mentioned papers the per-action calculation takes O(n), here the calculation of IG(a) does not depend on n at all, as will be shown in Section 3.4.
While, at first sight, it might seem that the entire joint covariance matrix needs to be recovered, in practice this is not the case, due to the sparsity of the Jacobian matrix A mentioned above.

Given a prior marginal covariance Σ_k^{M,IX}, whose dimension is D(F(a)) × D(F(a)), the calculation in (24) is bounded by computing the determinant of an M(F(a)) × M(F(a)) matrix, which is, in general, O(M(F(a))³), where M(F(a)) is the number of constraints due to the new factors (for a given candidate action a). This calculation is performed for each candidate action in the set 𝒜. Furthermore, in many problems it is reasonable to assume that M(F(a)) ≪ n, as M(F(a)) depends mostly on the planning horizon L, which is typically predefined and constant, while n (the state dimensionality) can be huge and grows over time in real systems (e.g. SLAM). Consequently, given the prior covariance, our complexity for selecting the best action is O(|𝒜|), i.e. independent of the state dimensionality n.

To conclude this section, we showed that the calculation of a single candidate action's impact does not depend on n. While this result is interesting by itself in the context of active inference, in Section 3.4 we go a step further and present an approach to calculate the covariance entries required by all candidates with a one-time calculation that can be re-used afterwards.

3.2.2. Focused case. In this section, we present a novel approach to calculate the change in entropy of a focused set of variables after the factor graph augmentation G_{k+L}(a) = G_k ⊕ F(a), combining it with the ideas from the previous sections (the generalized matrix determinant lemma and the IG cost function) and showing that the impact of a single candidate action can be calculated independently of the state dimension n.

First we recall the definitions from Section 2 and introduce additional notation (see also Table 1): X^F_k ≡ X^F_{k+L} ∈ R^{n_F} denotes the set of focused variables (equal to X^F_{k+L} to remind us that the prior and posterior states are identical in the non-augmented case), and X^U_k = X_k \ X^F_k ∈ R^{n_U} is the set of the remaining, unfocused variables, with n = n_F + n_U. The n_F × n_F prior marginal covariance and information matrices of X^F_k are denoted, respectively, by Σ_k^{M,F} (suffix M for marginal) and Λ_k^{M,F} ≡ (Σ_k^{M,F})^{-1}. Furthermore, we partition the joint covariance and information matrices as

    Σ_k = [ Σ_k^{M,U}  Σ_k^{M,UF} ; (Σ_k^{M,UF})^T  Σ_k^{M,F} ],   Λ_k = [ Λ_k^U  Λ_k^{U,F} ; (Λ_k^{U,F})^T  Λ_k^F ],    (25)

where Λ_k^F ∈ R^{n_F×n_F} is constructed by retrieving from Λ_k only the rows and columns related to X^F_k (it is actually the conditional information matrix of X^F_k, conditioned on the rest of the variables X^U_k), Λ_k^U ∈ R^{n_U×n_U} is defined similarly for X^U_k, and Λ_k^{U,F} ∈ R^{n_U×n_F} contains the remaining blocks of Λ_k, as shown in (25).

The marginal information matrix of X^F_k, i.e. Λ_k^{M,F}, can be calculated via the Schur complement Λ_k^{M,F} = Λ_k^F − (Λ_k^{U,F})^T·(Λ_k^U)^{-1}·Λ_k^{U,F}. However, one of the Schur complement's properties (Ouellette, 1981) is |Λ_k| = |Λ_k^{M,F}|·|Λ_k^U|, from which we can conclude that

    | Σ_k^{M,F} | = 1 / | Λ_k^{M,F} | = | Λ_k^U | / | Λ_k |.    (26)

Therefore, the posterior entropy of X^F_{k+L} (see (17)) is a function of the posterior Λ_{k+L} and its partition Λ_{k+L}^U:

    J_H^F(a) = H( X^F_{k+L} ) = (n_F · γ)/2 − (1/2)·ln( | Λ_{k+L} | / | Λ_{k+L}^U | ).    (27)

From (9) one can observe that Λ_{k+L}^U = Λ_k^U + (A^U)^T·A^U, where A^U ∈ R^{m×n_U} is constructed from the Jacobian A by taking only the columns related to the variables in X^U_k.

The next step is to use IG instead of entropy, with the same motivation and benefits as in the unfocused case (Section 3.2.1). The optimal action a* = argmax_{a∈𝒜} J_IG^F(a) will maximize J_IG^F(a) = H( X^F_k ) − H( X^F_{k+L} ), and by combining (27) with the generalized matrix determinant lemma we can write

    J_IG^F(a) = (1/2)·ln| I_m + A·Σ_k·A^T | − (1/2)·ln| I_m + A^U·Σ_k^{U|F}·(A^U)^T |,    (28)

where Σ_k^{U|F} ∈ R^{n_U×n_U} is the prior covariance matrix of X^U_k conditioned on X^F_k, which is exactly the inverse of Λ_k^U.

Further, A^U can be partitioned into IA^U and ¬IA^U, representing the unfocused variables that are, respectively, involved (IX^U) or not involved (¬IX^U) (see also Table 1). Note that ¬IA^U contains only zeros, and it can be concluded that

    | I_m + A^U·Σ_k^{U|F}·(A^U)^T | = | I_m + IA^U·Σ_k^{IX^U|F}·(IA^U)^T |,    (29)

where Σ_k^{IX^U|F} is the prior covariance of IX^U conditioned on X^F_k.

Taking into account (24) and (29), J_IG^F(a) can be calculated through

    J_IG^F(a) = (1/2)·ln| I_m + IA·Σ_k^{M,IX}·(IA)^T | − (1/2)·ln| I_m + IA^U·Σ_k^{IX^U|F}·(IA^U)^T |.    (30)

We can see that the focused and unfocused IGs have a simple relation between them:

    J_IG^F(a) = J_IG(a) − (1/2)·ln| I_m + IA^U·Σ_k^{IX^U|F}·(IA^U)^T |.    (31)

The second term in (31) is negative and plays the role of a penalty, reducing the action's impact on the posterior entropy of X^F_{k+L}. In Section 3.5 we discuss the intuition behind this penalty term. Note that when all involved variables are focused, IX ⊆ X^F_{k+L}, the variable set IX^U is empty and the second term's matrix is an identity matrix I_m. In such a case, the second term becomes zero and we have J_IG^F(a) = J_IG(a).

Also here, given the prior covariances Σ_k^{M,IX} and Σ_k^{IX^U|F}, the calculation of the focused IG (30) is independent of the state dimensionality n, with complexity bounded by O(M(F(a))³). In Section 3.4 we show how the required covariances can be efficiently retrieved.
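The relation (31) can likewise be checked numerically. The sketch below (again our own construction, with an arbitrary choice of focused variables) computes the focused IG directly from the posterior focused marginal and via (31):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 10, 4
G = rng.normal(size=(n + 2, n))
Lam = G.T @ G                                 # prior information matrix Λ_k
Sig = np.linalg.inv(Lam)

A = np.zeros((m, n))
A[:, [1, 4, 7]] = rng.normal(size=(m, 3))     # new factors involve variables 1, 4, 7
F = [7, 8]                                    # focused variables X^F
U = [i for i in range(n) if i not in F]       # unfocused variables X^U

# direct: 0.5*ln|Σ^{M,F}_prior| - 0.5*ln|Σ^{M,F}_post| (the γ terms cancel)
S_post = np.linalg.inv(Lam + A.T @ A)
ig_f_direct = 0.5 * (np.linalg.slogdet(Sig[np.ix_(F, F)])[1]
                     - np.linalg.slogdet(S_post[np.ix_(F, F)])[1])

# eq. (31): unfocused IG minus the penalty term, with Σ_k^{U|F} = (Λ_k^U)^(-1)
j_ig = 0.5 * np.linalg.slogdet(np.eye(m) + A @ Sig @ A.T)[1]
Sig_U_given_F = np.linalg.inv(Lam[np.ix_(U, U)])
AU = A[:, U]
penalty = 0.5 * np.linalg.slogdet(np.eye(m) + AU @ Sig_U_given_F @ AU.T)[1]
ig_f_mdl = j_ig - penalty
```

As expected, the penalty term is non-negative, so the focused IG never exceeds the unfocused one.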

Table 2. Different partitions of state variables in augmented BSP

Notation        Description
X_k             state vector at time k
X_{k+L}         state vector at time k + L
X^F_{k+L}       subset of X_{k+L} with focused variables
X_old           subset of X_{k+L} with old variables, i.e. X_k
X_new           subset of X_{k+L} with new variables
IX_old          subset of X_old with variables involved in new terms in (6)
¬IX_old         subset of X_old with variables not involved in new terms in (6)

Focused augmented BSP (X^F_{k+L} ⊆ X_new), Section 3.3.3.1
X^F_new         subset of X_new with focused variables
X^U_new         subset of X_new with unfocused variables

Focused augmented BSP (X^F_{k+L} ⊆ X_old), Section 3.3.3.2
IX^F_old        subset of IX_old with focused variables
IX^U_old        subset of IX_old with unfocused variables
¬IX^F_old       subset of ¬IX_old with focused variables
¬IX^U_old       subset of ¬IX_old with unfocused variables

Fig. 7. Partitions of the Jacobians and the state vector X_{k+L} in the augmented BSP case, in the unfocused scenario. Note that the shown variable ordering is only for illustration, while the developed approach supports any arbitrary variable ordering. Also note that all white blocks consist of only zeros. Top: Jacobian A of the factor set F(a) = {F^conn(a), F^new(a)}. Bottom: Jacobians B and D of the factor sets F^conn(a) and F^new(a), respectively.

3.3. Augmented BSP via AMDL

3.3.1. AMDL. In order to simplify the calculation of IG within augmented BSP (16), one could resort, as in the previous sections, to the matrix determinant lemma. However, due to the zero-padding, the information matrix Λ^Aug_{k+L} is singular and, thus, the matrix determinant lemma cannot be applied directly. In this section we develop a variant of the matrix determinant lemma for the considered augmented case (referred to below as AMDL).

Specifically, we want to solve the following problem: recalling Λ⁺ = Λ^Aug + A^T·A (see also (10)), and dropping the time indices to avoid clutter, our objective is to express the determinant of Λ⁺ in terms of Λ and Σ = Λ^{-1}.

In addition, we can extend the AMDL to a specific structure of the matrix A. As explained in Section 3.1, in the case of augmented BSP the new factors can be separated into the two sets F^new(a) and F^conn(a). It is not difficult to see that A's structure in such a case is

    A = [ A_old  A_new ] = [ B_old  B_new ; D_old  D_new ] = [ B_old  B_new ; 0  D_new ],    (34)

where the rows of B represent the factors from F^conn(a) and the rows of D represent the factors from F^new(a) (see also Figure 7).
Note that D_old ≡ 0.

Lemma 1. The ratio of the determinants of Λ⁺ and Λ can be calculated through

    | Λ⁺ | / | Λ | = | Δ | · | A_new^T·Δ^{-1}·A_new |,    (32)

with Δ = I_m + A_old·Σ·A_old^T, where the matrices A_old ∈ R^{m×n} and A_new ∈ R^{m×n'} are constructed from A by retrieving the columns of only the old n variables (denoted by X_old) and only the new n' variables (denoted by X_new), respectively (see Figure 7 and Table 2).

The proof of Lemma 1 is given in Appendix A.1.

Remark 1. It is not difficult to show that the AMDL for a matrix update of the form Λ⁺ = Λ^Aug + Ã^T·Ψ^{-1}·Ã (see (10)) assumes the form

    | Λ⁺ | / | Λ | = | Ψ^{-1} | · | Δ̃ | · | Ã_new^T·Δ̃^{-1}·Ã_new |,    (33)

with Δ̃ = Ψ + Ã_old·Σ·Ã_old^T.

Lemma 2. The ratio of the determinants of Λ⁺ and Λ, where A has the structure from (34), can be calculated through

    | Λ⁺ | / | Λ | = | Δ₁ | · | B_new^T·Δ₁^{-1}·B_new + D_new^T·D_new |,    (35)

with Δ₁ = I_{m_conn} + B_old·Σ·B_old^T and m_conn = M(F^conn(a)), where the partitions of B and D are defined in (34) and can also be seen in Figure 7.

The proof of Lemma 2 is given in Appendix A.2.

We note that the above equations are general, standalone solutions for any augmented positive-definite symmetric matrix. To summarize, we developed two augmented matrix determinant lemmas, (32) and (35), with the latter exploiting additional knowledge about A's structure. The dimension of the matrix Δ from (32) is M(F(a)) × M(F(a)), whereas the dimension of Δ₁ from (35) is M(F^conn(a)) × M(F^conn(a)). Thus, the complexity of the calculation in (35) is lower than that of (32), since M(F^conn(a)) ≤ M(F(a)). In the following sections we use both lemmas to develop an efficient solution to the augmented BSP problem.
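Both lemmas admit a quick numerical check. The following sketch (our own, with arbitrary dimensions) builds a zero-padded augmented prior, applies an update whose Jacobian has the structure (34), and verifies that (32) and (35) reproduce the direct determinant ratio:

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_new = 8, 3                    # old and new state dimensions
m_conn, m_new = 4, 3               # rows of F^conn(a) and F^new(a)
m = m_conn + m_new

G = rng.normal(size=(n + 2, n))
Lam = G.T @ G                      # prior Λ (positive definite)
Sig = np.linalg.inv(Lam)           # Σ = Λ^(-1)

# Jacobian with the structure (34): D_old ≡ 0
B_old = rng.normal(size=(m_conn, n))
B_new = rng.normal(size=(m_conn, n_new))
D_new = rng.normal(size=(m_new, n_new))
A = np.block([[B_old, B_new],
              [np.zeros((m_new, n)), D_new]])
A_old, A_new = A[:, :n], A[:, n:]

# direct ratio: ln|Λ^Aug + AᵀA| - ln|Λ|, with Λ^Aug zero-padded
Lam_aug = np.zeros((n + n_new, n + n_new))
Lam_aug[:n, :n] = Lam
direct = np.linalg.slogdet(Lam_aug + A.T @ A)[1] - np.linalg.slogdet(Lam)[1]

# Lemma 1, eq. (32): an m x m matrix Δ
Delta = np.eye(m) + A_old @ Sig @ A_old.T
lemma1 = (np.linalg.slogdet(Delta)[1]
          + np.linalg.slogdet(A_new.T @ np.linalg.inv(Delta) @ A_new)[1])

# Lemma 2, eq. (35): exploits D_old = 0, works with an m_conn x m_conn matrix Δ₁
Delta1 = np.eye(m_conn) + B_old @ Sig @ B_old.T
lemma2 = (np.linalg.slogdet(Delta1)[1]
          + np.linalg.slogdet(B_new.T @ np.linalg.inv(Delta1) @ B_new
                              + D_new.T @ D_new)[1])
```

Both lemma forms match the direct computation while never factorizing the (n + n') × (n + n') posterior matrix.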
3.3.2. Unfocused augmented BSP through IG. Here we show how the AMDL from Section 3.3.1 can be used to efficiently calculate the unfocused IG defined in (16), i.e. the change in the system's entropy after the factor graph augmentation G_{k+L}(a) = G_k ⊕ G(a) ⊕ F^conn(a) (see Figure 5).

First we introduce different partitions of the joint state X_{k+L} and the corresponding sub-matrices of the Jacobian matrix A from (10) (see Table 2 and Figure 7). Recall the definitions of X_new and X_old (see Section 3.3.1) and let IX_old and ¬IX_old denote, respectively, the old state variables that are involved and not involved in the new terms in (6). We denote by IA_old and ¬IA_old the columns of the matrix A that correspond to the state variables IX_old and ¬IX_old, respectively (see Figure 7). Note that ¬IA_old ≡ 0.

Next, using the AMDL (Lemma 1), the determinant ratio between the posterior and prior information matrices is

    | Λ_{k+L} | / | Λ_k | = | C | · | A_new^T·C^{-1}·A_new |,    (36)

where C ≐ I_m + A_old·Σ_k·A_old^T. Consequently, the IG objective from (16) can be re-written as

    J_IG(a) = (n'·γ)/2 + (1/2)·ln|C| + (1/2)·ln| A_new^T·C^{-1}·A_new |.    (37)

Moreover, considering the above partitioning of A_old, we conclude that A_old·Σ_k·A_old^T = IA_old·Σ_k^{M,IX_old}·(IA_old)^T, where Σ_k^{M,IX_old} is the marginal prior covariance of IX_old. Thus, the matrix C can be rewritten as

    C = I_m + IA_old·Σ_k^{M,IX_old}·(IA_old)^T.    (38)

Observe that, given Σ_k^{M,IX_old}, all terms in (38) have relatively small dimensions, and the M(F(a)) × M(F(a)) matrix C can be computed efficiently for each candidate action, with time complexity no longer depending on the state dimension n, similarly to the non-augmented BSP approach of Section 3.2.1. The calculation of the inverse C^{-1}, required in (37), is O(M(F(a))³) and likewise does not depend on n. The overall calculation in (37) has complexity O(M(F(a))³ + n'³) and depends only on the number of new factor constraints M(F(a)) and the number of new variables n'. Both are functions of the planning horizon L and can be considered considerably smaller than the state dimension n. Moreover, higher ratios n/M(F(a)) lead to a bigger advantage of our approach over the alternatives (see Section 7).

It is worthwhile to mention the specific case where M(F(a)) = n', which happens, for example, in a SLAM application when the candidate action a introduces only motion (or odometry) factors between the new variables. In such a case, it is not difficult to show that (37) reduces to J_IG(a) = (n'·γ)/2 + ln|A_new|. In other words, the IG in this case depends only on the partition A_new of A (see Figure 7), the Jacobian entries related to the new variables, while the prior Σ_k is not involved in the calculations at all.

Remark 2. It is possible that the posterior state dimension N = n + n' will be different for different candidate actions (see e.g. Section 7). In such a case the entropy (or IG), being a function of the product of the posterior eigenvalues, will be of a different scale for each candidate and cannot be compared directly. Thus, dimension normalization of (37) may be required. Even though the term (n'·γ)/2 may already play the role of such a normalization, a detailed investigation of this aspect is outside the scope of this paper.

We can further enhance the presented approach by considering the structure of A from (34) (see also Figure 7). This allows us to slightly improve the complexity of calculating J_IG(a). By applying the AMDL (Lemma 2), we can show that the information gained from connecting G_k (with covariance matrix Σ_k) and G(a) (with information matrix Λ_a = D_new^T·D_new) through the factors F^conn(a) is

    J_IG(a) = (n'·γ)/2 + (1/2)·ln|C₁| + (1/2)·ln| B_new^T·C₁^{-1}·B_new + Λ_a |,    (39)

    C₁ = I_{m_conn} + B_old·Σ_k·B_old^T,    (40)

where the matrix B = [B_old  B_new] is the Jacobian of the factors in F^conn(a). Since B_old is sparse (see Figure 7), in the same way as the partition A_old in (36), C₁ can also be calculated efficiently:

    C₁ = I_{m_conn} + IB_old·Σ_k^{M,IX_old}·(IB_old)^T.    (41)

It is interesting to note that the terms of the above solution for the unfocused augmented BSP problem, (39) and (41), can be recognized as belonging to the different operands of the augmentation G_k ⊕ G(a) ⊕ F^conn(a): the prior marginal covariance Σ_k^{M,IX_old} represents the information coming from the prior factor graph G_k, the information matrix Λ_a provides the information of the action's factor graph G(a), and the various partitions of the matrix B introduce the information coming from the connecting factors F^conn(a).

Although the above solution, (39) and (41), looks somewhat more complicated, its matrix terms have slightly lower dimensions compared with those of the general solution presented in (37) and (38), with complexity O(M(F^conn(a))³ + n'³), and therefore can be calculated more quickly, as will be shown in our simulations below. Moreover, equations (39) and (41) have more independent terms that can be calculated in parallel, further improving time performance.
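A short sketch (our own, with arbitrary dimensions and a hypothetical involved-variable set) verifies that the log-determinant terms of the general solution (37)–(38) and of the structured solution (39)–(41) coincide, so that both yield the same J_IG(a):

```python
import numpy as np

rng = np.random.default_rng(4)
n, n_new = 9, 2                       # old state dim, new variables n'
m_conn, m_new = 3, 2
m = m_conn + m_new
involved = [0, 3, 4]                  # old variables IX_old touched by F^conn(a)

G = rng.normal(size=(n + 2, n))
Sig = np.linalg.inv(G.T @ G)          # prior covariance Σ_k

B_old = np.zeros((m_conn, n))
B_old[:, involved] = rng.normal(size=(m_conn, len(involved)))
B_new = rng.normal(size=(m_conn, n_new))
D_new = rng.normal(size=(m_new, n_new))
A_old = np.vstack([B_old, np.zeros((m_new, n))])   # D_old = 0, as in (34)
A_new = np.vstack([B_new, D_new])

# general solution, eqs. (37)-(38), using only the involved marginal covariance
Sig_marg = Sig[np.ix_(involved, involved)]         # Σ_k^{M,IX_old}
IA_old = A_old[:, involved]
C = np.eye(m) + IA_old @ Sig_marg @ IA_old.T       # eq. (38)
logdet_general = (np.linalg.slogdet(C)[1]
                  + np.linalg.slogdet(A_new.T @ np.linalg.inv(C) @ A_new)[1])

# structured solution, eqs. (39)-(41): a smaller m_conn x m_conn matrix C₁
IB_old = B_old[:, involved]
C1 = np.eye(m_conn) + IB_old @ Sig_marg @ IB_old.T # eq. (41)
Lam_a = D_new.T @ D_new                            # information of G(a)
logdet_struct = (np.linalg.slogdet(C1)[1]
                 + np.linalg.slogdet(B_new.T @ np.linalg.inv(C1) @ B_new
                                     + Lam_a)[1])
```

The two forms differ only in the size of the dense matrix that must be inverted (m versus m_conn); the (n'·γ)/2 constant and the factor 1/2 are common to both.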

U Anew, that correspond, respectively, to columns of new vari- ables that are focused and unfocused. Denote the former set F U of variables as Xnew and the latter as Xnew (see also Table 2). F ≡ F Note, Xnew Xk+L.

F Lemma 3. The posterior entropy of Xnew (equation (17)) is given by · γ F nF 1 U T −1 U JH(a) = + ln (A ) ·C · A 2 2 new new 1 − ln AT · C−1 · A , (42) 2 new new where C is defined in (38).

Fig. 8. Partitions of Jacobians and state vector X_{k+L} in the augmented BSP case, focused (X^F_{k+L} ⊆ X_new) scenario. Note that the shown variable ordering is only for illustration, while the developed approach supports any arbitrary variable ordering. Also note that all white blocks consist of only zeros. Top: Jacobian A of factor set F(a) = {F^conn(a), F^new(a)}. Bottom: Jacobians B and D of factor sets F^conn(a) and F^new(a), respectively.

3.3.3. Focused augmented BSP

The focused scenario in the augmented BSP setting, with the factor graph augmentation G_{k+L}(a) = G_k ⊕ G(a) ⊕ F^conn(a), can be separated into different cases. One such case is when the new set of focused variables X^F_{k+L} contains only new variables added during BSP augmentation, as illustrated in Figure 8, i.e. X^F_{k+L} ⊆ X_new are the variables coming from factor graph G(a). Such a case happens, for example, when we are interested in reducing the entropy of the robot's last pose within the planning horizon. Another case is when the focused variables X^F_{k+L} contain only old variables, as shown in Figure 9, i.e. X^F_{k+L} ⊆ X_old ≡ X_k are the variables coming from factor graph G_k. This, for example, could correspond to a scenario where reducing the entropy of already-mapped landmarks is of interest (e.g. to improve 3D reconstruction quality). The third option is for both new and old variables to be inside X^F_{k+L}. In the following we develop a solution for the first two cases; the third case can be handled in a similar manner.

Remark 3. In most cases, the actual variable ordering will be more sporadic than that depicted in Figures 7, 8, and 9. For example, iSAM (Kaess et al., 2012) determines variable ordering using COLAMD (Davis et al., 2004) to enhance the sparsity of the square root information matrix. We note that our approach applies to any arbitrary variable ordering, with the equations derived herein remaining unchanged.

3.3.3.1. Focused augmented BSP (X^F_{k+L} ⊆ X_new): focused variables belong to G(a)

First we define additional partitions of Jacobian A (see Figure 8). The sub-matrices A_old, A_new, IA_old, and ¬IA_old were already introduced in the sections above. We now further partition A_new into A^F_new and A^U_new.

The proof of Lemma 3 is given in Appendix A.3. We obtained an exact solution for J^F_H(a) that, given Σ_k^{M,IX_old}, can be calculated efficiently with complexity O(M(F(a))^3 + n'^3), similarly to unfocused augmented BSP in Section 3.3.2. In Section 3.4 we explain how the prior marginal covariance term Σ_k^{M,IX_old} can be efficiently retrieved, providing a fast solution for focused augmented BSP.

In addition, it is interesting to note that there is an efficient way to calculate the term ½ ln|(A^U_new)^T · C^{-1} · A^U_new| − ½ ln|A^T_new · C^{-1} · A_new| from (42). First, we calculate the matrix V ≐ A^T_new · C^{-1} · A_new. Note that each row/column of V represents one of the new variables in X_new. Next, we reorder the rows and columns of V to obtain a matrix V^{UF} where the rows and columns of X^U_new go first, followed by the rows and columns of X^F_new. Now, we can perform the Cholesky decomposition V^{UF} = L^T · L and retrieve the diagonal entries of L that belong to the variables X^F_new, denoted by r^F_{ii}. It is not difficult to show that

½ ln|(A^U_new)^T · C^{-1} · A^U_new| − ½ ln|A^T_new · C^{-1} · A_new| = − Σ_i log r^F_{ii}.  (43)

Further, as in Section 3.3.2, we additionally exploit the special structure of A from (34) (see also Figure 8). Similarly to unfocused augmented BSP, this will allow us to improve the complexity of calculating J^F_H(a).

Lemma 4. The posterior entropy of X^F_new (equation (17)), where A has the structure from (34), is given by

J^F_H(a) = (n'_F · γ)/2 + ½ ln|(B^U_new)^T · C_1^{-1} · B^U_new + Λ^{U|F}_a| − ½ ln|B^T_new · C_1^{-1} · B_new + Λ_a|,  (44)

where C_1 is defined in (41), Λ_a = D^T_new · D_new is the information matrix of the action's factor graph G(a), and Λ^{U|F}_a = (D^U_new)^T · D^U_new is the information matrix of the variables X^U_new conditioned on X^F_new, calculated from the distribution represented by G(a).
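The Cholesky shortcut in (43) can be checked numerically. The sketch below uses a random positive-definite matrix in place of V = A^T_new·C^{-1}·A_new, with illustrative dimensions and variable names of our own choosing (not the paper's); the key point is that an upper-triangular factorization V^{UF} = R^T·R exposes both log-determinants through the diagonal of R.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative): n_new = 5 new variables, ordered so that the
# first n_u = 3 are unfocused (X^U_new) and the last 2 are focused (X^F_new).
n_new, n_u = 5, 3
M = rng.standard_normal((8, n_new))
V = M.T @ M  # stands in for V^{UF} = A_new^T C^{-1} A_new (positive definite)

# Direct evaluation of the two log-determinants appearing in (42)/(43).
direct = 0.5 * np.linalg.slogdet(V[:n_u, :n_u])[1] \
       - 0.5 * np.linalg.slogdet(V)[1]

# Shortcut of (43): factor V = R^T R with R upper triangular; the leading
# principal block then factors as R[:n_u,:n_u]^T R[:n_u,:n_u], so the
# difference reduces to minus the log-diagonal over the focused block.
R = np.linalg.cholesky(V).T
shortcut = -np.sum(np.log(np.diag(R)[n_u:]))

print(np.isclose(direct, shortcut))
```

The same single factorization therefore serves both determinants, which is what makes the term in (43) cheap to evaluate.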

The proof of Lemma 4 is given in Appendix A.4. Also here, the matrix terms from the above solution (44) of the focused augmented BSP problem have lower dimensions compared with the matrix terms from the general solution presented in (42). Given the prior marginal covariance Σ_k^{M,IX_old}, its complexity is O(M(F^conn(a))^3 + n'^3). We demonstrate the runtime superiority of this solution in our simulations below.

It is important to mention that the information-based planning problem for a system that is propagated through the (extended) Kalman filter (Van Den Berg et al., 2012; Walls et al., 2015), where the objective is to reduce the uncertainty of only the marginal future state of the system, is an instance of the focused augmented BSP (X^F_{k+L} ⊆ X_new) problem. Thus, the solution provided in this section is applicable also to Kalman filter planning.

Fig. 9. Partitions of Jacobians and state vector X_{k+L} in the augmented BSP case, focused (X^F_{k+L} ⊆ X_old) scenario. Note that the shown variable ordering is only for illustration, while the developed approach supports any arbitrary variable ordering. Also note that all white blocks consist of only zeros. Top: Jacobian A of factor set F(a) = {F^conn(a), F^new(a)}. Bottom: Jacobians B and D of factor sets F^conn(a) and F^new(a), respectively.

3.3.3.2. Focused augmented BSP (X^F_{k+L} ⊆ X_old): focused variables belong to G_k

Similarly to the previous section, we first introduce additional partitions of Jacobian A for the considered case (see Figure 9). From the top part of the figure we can see that ¬IA_old can be further partitioned into ¬IA^U_old and ¬IA^F_old. In particular, ¬IA^U_old represents columns of old variables that are both not involved and unfocused, and ¬IA^F_old represents columns of old variables that are both not involved and focused. We denote the former group of variables by ¬IX^U_old and the latter by ¬IX^F_old (see Table 2). Likewise, IA_old can be partitioned into IA^U_old and IA^F_old, representing old involved variables that are, respectively, unfocused (IX^U_old) or focused (IX^F_old). Note that in this case the set of focused variables is X^F_{k+L} = X^F_k = {¬IX^F_old ∪ IX^F_old} and is contained in factor graph G_k.

Lemma 5. The focused IG of X^F_k is given by

J^F_IG(a) = ½ ( ln|C| + ln|A^T_new · C^{-1} · A_new| − ln|S| − ln|A^T_new · S^{-1} · A_new| ),  (45)

where C is defined in (38), and

S ≐ I_m + IA^U_old · Σ_k^{IX^U_old|F} · (IA^U_old)^T,  (46)

and where Σ_k^{IX^U_old|F} is the prior covariance of IX^U_old conditioned on X^F_k. The proof of Lemma 5 is given in Appendix A.5.

Similarly to the cases discussed above (Sections 3.3.2 and 3.3.3.1), given Σ_k^{M,IX_old} and Σ_k^{IX^U_old|F}, the calculation of J^F_IG(a) per each action a can be performed efficiently with complexity O(M(F(a))^3 + n'^3), independently of the state dimension n.

It is interesting to note the specific case where M(F(a)) = n'. In other words, the number of new measurements is equal to the number of new state variables, which can happen for example when only new robot poses and new motion model factors are added. In such a case, it is not difficult to show that (45) will always return zero. We can conclude that in this specific case (M(F(a)) = n') there is no new information about the old focused variables X^F_k.

In addition, similarly to previous sections, we use the special structure of A from (34) (see also Figure 9) in order to improve the complexity of calculating J^F_IG(a).

Lemma 6. The focused IG of X^F_k, where A has the structure from (34), is given by

J^F_IG(a) = ½ ( ln|C_1| + ln|B^T_new · C_1^{-1} · B_new + Λ_a| − ln|S_1| − ln|B^T_new · S_1^{-1} · B_new + Λ_a| ),  (47)

where C_1 is defined in (41), Λ_a = D^T_new · D_new is the information matrix of the action's factor graph G(a), and

S_1 = I_{m_conn} + IB^U_old · Σ_k^{IX^U_old|F} · (IB^U_old)^T,  (48)

and where Σ_k^{IX^U_old|F} is the prior covariance of IX^U_old conditioned on X^F_k.

The proof of Lemma 6 is given in Appendix A.6. The matrix terms from the above solution (47) and (48) have lower dimensions compared with the matrix terms from the general solution presented in (45) and (46), with complexity O(M(F^conn(a))^3 + n'^3) given the prior marginal covariance matrices Σ_k^{M,IX_old} and Σ_k^{IX^U_old|F}. The next section presents our approach to calculate the appropriate entries of the prior covariance only once and re-use the result whenever required.
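As a sanity check on Lemma 5, the focused IG of (45)-(46) can be compared with explicit belief propagation on a small random instance. The sketch below is ours (illustrative dimensions; for simplicity every old variable is treated as involved, so C uses the full prior covariance) and assumes whitened Jacobians, i.e. unit-covariance factor noise.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy augmented-BSP instance: old state of n = 5 scalars, the first
# n_f = 2 focused (X^F) and the rest unfocused (X^U_old); the action adds
# n_new = 2 new variables and m = 6 whitened factor rows.
n, n_f, n_new, m = 5, 2, 2, 6
T = rng.standard_normal((n + 2, n))
Lam_k = T.T @ T + np.eye(n)              # prior information matrix
Sig_k = np.linalg.inv(Lam_k)             # prior covariance

A_old = rng.standard_normal((m, n))      # new factors' old-variable columns
A_new = rng.standard_normal((m, n_new))  # new factors' new-variable columns

# Ground truth: focused IG = prior minus posterior entropy of X^F, via
# explicit propagation of the (augmented) information matrix.
A = np.hstack([A_old, A_new])
Lam_post = np.zeros((n + n_new, n + n_new))
Lam_post[:n, :n] = Lam_k
Lam_post += A.T @ A
Sig_post = np.linalg.inv(Lam_post)
ld = lambda M: np.linalg.slogdet(M)[1]
ig_direct = 0.5 * (ld(Sig_k[:n_f, :n_f]) - ld(Sig_post[:n_f, :n_f]))

# Lemma 5, equations (45)-(46): S uses the conditional covariance of
# X^U_old given X^F, i.e. the inverse of the unfocused block of Lam_k.
C = np.eye(m) + A_old @ Sig_k @ A_old.T
S = np.eye(m) + A_old[:, n_f:] @ np.linalg.inv(Lam_k[n_f:, n_f:]) @ A_old[:, n_f:].T
ig_lemma = 0.5 * (ld(C) + ld(A_new.T @ np.linalg.inv(C) @ A_new)
                  - ld(S) - ld(A_new.T @ np.linalg.inv(S) @ A_new))

print(np.isclose(ig_direct, ig_lemma))
```

Note that the lemma route never forms the (n + n')-dimensional posterior: all matrices have the dimension of the new factor set or of the new variables.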

3.4. Re-use calculations technique

As we have seen above, unfocused and focused (augmented) BSP problems require different prior covariance entries in order to use the developed expressions. The required entries for each problem are summarized in Table 3. Note that Σ_k^{M,IX} and Σ_k^{M,IX_old} both represent exactly the same thing, the prior marginal covariance of old variables involved in new terms in (6), and have slightly different notation due to the specifics of the augmented and non-augmented settings of BSP. The same goes for Σ_k^{IX^U|F} and Σ_k^{IX^U_old|F}, with both representing the prior covariance of unfocused and involved old variables IX^U_old conditioned on the focused variables X^F. In this section we use the notation of augmented BSP (Σ_k^{M,IX_old} and Σ_k^{IX^U_old|F}), considering the non-augmented BSP setting as its special case.

From Table 3 it is seen that all approaches require the prior marginal covariance of the involved old variables IX_old. In terms of factor graphs, in non-augmented BSP IX_old represents variables connected to factors from the set F(a) and has dimension D(F(a)), whereas in the augmented BSP scenario IX_old represents variables from the prior factor graph G_k connected to factors in the set F^conn(a) and has dimension D(F^conn(a)). Although each candidate action may induce a different set of involved variables, in practice these sets will often have many variables in common, as they are all related to the belief at the current time (e.g. about the robot pose) in one way or another. With this in mind, we perform a one-time calculation of the prior marginal covariance for all involved variables (due to at least one candidate action) and re-use it for efficiently calculating the IG and entropy of different candidate actions.

More specifically, denote by X_All ⊆ X_k the subset of variables that were involved in new terms in (6) for at least one candidate action. We can now perform a one-time calculation of the prior marginal covariance for this set, i.e. Σ_k^{M,X_All}. The complexity of such a calculation may be different for different applications. For example, when using an information filter, the system is represented by the information matrix Λ_k, and in general the inverse of the Schur complement of the X_All variables should be calculated. However, there are techniques that exploit the sparsity of the underlying matrices in SLAM problems in order to efficiently recover marginal covariances (Kaess and Dellaert, 2009) and, more recently, to keep and update them incrementally (Ila et al., 2015). In Section 7 we show that the calculation time of Σ_k^{M,X_All} while exploiting sparsity (Golub and Plemmons, 1980; Kaess and Dellaert, 2009) is relatively small compared with the total decision-making time of alternative approaches. A more detailed discussion of the complexity of covariance retrieval can be found in Kaess and Dellaert (2009) and Ila et al. (2015). The pseudo-code for BSP problems that require only the marginal prior covariances Σ_k^{M,IX_old} (see Table 3) can be found in Algorithm 1.

For the focused BSP (Section 3.2.2) and focused augmented BSP (X^F_{k+L} ⊆ X_old) (Section 3.3.3.2) cases, we also need the term Σ_k^{IX^U_old|F} (see (30) and (46)). This term can be computed using two different methods, as described below.

First method: calculate it through additional marginal covariance entries. First we calculate the prior marginal covariance Σ_k^{M,(IX^U_old,F)} for the set of variables {IX^U_old, X^F_k}, and then compute the Schur complement over the relevant partitions in Σ_k^{M,(IX^U_old,F)} (where the suffix M denotes marginal):

Σ_k^{IX^U_old|F} = Σ_k^{M,IX^U_old} − Σ_k^{M,IX^U_old F} · (Σ_k^{M,F})^{-1} · Σ_k^{M,F IX^U_old}.  (49)

Consequently, we can use a one-time calculation also for the focused BSP and focused augmented BSP (X^F_{k+L} ⊆ X_old) cases, as follows. Let us extend the set X_All to contain also all focused variables. Once Σ_k^{M,X_All} is calculated, Σ_k^{M,(IX^U_old,F)} will be just its partition and can easily be retrieved from it. As a result, the calculation of Σ_k^{IX^U_old|F} per candidate action becomes computationally cheap (through (49)). Furthermore, the term (Σ_k^{M,F})^{-1} can be calculated only once for all candidates. The pseudo-code of this approach can be found in Algorithm 2.

Second method: compute Σ_k^{IX^U_old|F} through information matrix partitioning. Recall X_k = {X^F_k, X^U_old} and consider the following partitioning of the prior information matrix:

Λ_k = [ Λ_k^F          Λ_k^{F,U}
        (Λ_k^{F,U})^T  Λ_k^{X^U_old} ],  (50)

where first go the rows/columns of the focused variables X^F_k, followed by the rows/columns of the old unfocused variables X^U_old. Note that the partition of the information matrix Λ_k that belongs to X^U_old, Λ_k^{X^U_old}, is the information matrix of the conditional distribution

P(X^U_old | X^F) = N^{-1}(×, Λ_k^{X^U_old}) = N(×, Σ_k^{X^U_old|F}),  (51)

where × is the information or mean vector of this distribution. Since Σ_k^{X^U_old|F} = (Λ_k^{X^U_old})^{-1} and recalling IX^U_old ⊆ X^U_old, it follows that Σ_k^{IX^U_old|F} is just a partition of Σ_k^{X^U_old|F} that belongs to the old unfocused involved variables IX^U_old. Therefore, we need to calculate specific entries of the inverse of Λ_k^{X^U_old}. To do so, our one-time calculation is as follows. We denote by X^U_All ⊆ X_k the subset of unfocused variables that were involved in new terms in (6) for at least one candidate action. Next, we calculate Σ_k^{X^U_All|F}, the entries of the inverse of Λ_k^{X^U_old} that belong to X^U_All (e.g. via the method from Kaess and Dellaert (2009)). Now, the required Σ_k^{IX^U_old|F} is just a partition of Σ_k^{X^U_All|F} and can be retrieved easily for each candidate action. The pseudo-code of this approach is summarized in Algorithm 3.
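The two routes to the conditional term agree by the standard Gaussian conditioning identity; the following sketch (our own toy dimensions and index choices) evaluates (49) from marginal covariance entries and compares it with the inverse-of-information-block route of (50)-(51).

```python
import numpy as np

rng = np.random.default_rng(2)

# Joint Gaussian over [X^F, X^U_old]: 2 focused and 4 unfocused scalar
# variables (illustrative sizes); two of the unfocused ones play the role
# of the involved set IX^U_old.
n_f, n_u = 2, 4
T = rng.standard_normal((n_f + n_u + 1, n_f + n_u))
Lam = T.T @ T + np.eye(n_f + n_u)   # prior information matrix, as in (50)
Sig = np.linalg.inv(Lam)            # prior covariance

inv_idx = [n_f + 1, n_f + 3]        # indices of IX^U_old within the state

# First method, equation (49): Schur complement over the joint marginal
# covariance of {IX^U_old, X^F}.
S_uu = Sig[np.ix_(inv_idx, inv_idx)]
S_uf = Sig[np.ix_(inv_idx, range(n_f))]
S_ff = Sig[:n_f, :n_f]
cond_schur = S_uu - S_uf @ np.linalg.inv(S_ff) @ S_uf.T

# Second method, equations (50)-(51): the conditional covariance of
# X^U_old given X^F is the inverse of the unfocused block of Lam; the
# required entries are a sub-block of that inverse.
cond_info = np.linalg.inv(Lam[n_f:, n_f:])
sub = [i - n_f for i in inv_idx]
cond_part = cond_info[np.ix_(sub, sub)]

print(np.allclose(cond_schur, cond_part))
```

In a real system the dense inverses above would of course be replaced by sparse covariance recovery; the sketch only illustrates that both methods target the same quantity.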

Table 3. Different problems and required entries of prior covariance: BSP denotes non-augmented belief space planning; Augmented BSP denotes augmented belief space planning.

Problem: Required covariance entries

Unfocused BSP, Section 3.2.1: Σ_k^{M,IX} (prior marginal covariance of variables involved in new terms in (6))

Focused BSP, Section 3.2.2: Σ_k^{M,IX} and Σ_k^{IX^U|F} (prior covariance of unfocused and involved variables IX^U conditioned on focused variables X^F)

Unfocused Augmented BSP, Section 3.3.2: Σ_k^{M,IX_old} (prior marginal covariance of old variables involved in new terms in (6))

Focused Augmented BSP (X^F_{k+L} ⊆ X_new), Section 3.3.3.1: Σ_k^{M,IX_old}

Focused Augmented BSP (X^F_{k+L} ⊆ X_old), Section 3.3.3.2: Σ_k^{M,IX_old} and Σ_k^{IX^U_old|F} (prior covariance of unfocused and involved old variables IX^U_old conditioned on focused variables X^F)

Algorithm 1: Pseudo-code for BSP problems requiring only prior marginal covariance entries.

1  Inputs:
2      A: candidate actions {a_1, a_2, ...}
3  Outputs:
4      a*: optimal action
5  begin:
6      X_All ← union of old involved variables IX_old from each candidate action a_i
7      Calculate prior marginal covariance Σ_k^{M,X_All} of variables in X_All (e.g. via the method from Kaess and Dellaert (2009))
8      for a_i ∈ A do
9          Calculate information impact (IG or posterior entropy), using the required prior marginal covariances from Σ_k^{M,X_All}
10     end
11     Select candidate a* with maximal IG or minimal posterior entropy
12 end
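A minimal executable sketch of Algorithm 1 for the unfocused non-augmented case is given below. All sizes, candidate names, and the dense covariance slicing are illustrative (a real system would use sparse recovery for line 7); per-candidate evaluation uses the matrix-determinant-lemma form ½ ln|I + A Σ^{M,IX} A^T|, whose cost is independent of the state dimension.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy unfocused BSP instance: state of n = 50 variables; three candidate
# actions, each a sparse whitened Jacobian touching a few involved indices.
n = 50
T = rng.standard_normal((n + 5, n))
Lam_k = T.T @ T + np.eye(n)
candidates = {
    "a1": ([3, 7, 12],   rng.standard_normal((4, 3))),
    "a2": ([7, 30],      rng.standard_normal((3, 2))),
    "a3": ([12, 30, 41], rng.standard_normal((5, 3))),
}

# One-time calculation (Algorithm 1, line 7): marginal covariance of the
# union of involved variables (dense slicing here, for illustration only).
X_all = sorted({i for idx, _ in candidates.values() for i in idx})
Sig_all = np.linalg.inv(Lam_k)[np.ix_(X_all, X_all)]
pos = {v: j for j, v in enumerate(X_all)}

def ig(idx, A_inv):
    # Per-candidate IG: 0.5*ln|I + A_I Sig^{M,I} A_I^T|, independent of n.
    sub = [pos[i] for i in idx]
    Sig_m = Sig_all[np.ix_(sub, sub)]
    return 0.5 * np.linalg.slogdet(np.eye(A_inv.shape[0]) + A_inv @ Sig_m @ A_inv.T)[1]

scores = {a: ig(idx, A) for a, (idx, A) in candidates.items()}
best = max(scores, key=scores.get)

# Sanity check against explicit belief propagation for one candidate.
idx, A_inv = candidates["a1"]
A_full = np.zeros((A_inv.shape[0], n)); A_full[:, idx] = A_inv
direct = 0.5 * (np.linalg.slogdet(Lam_k + A_full.T @ A_full)[1]
                - np.linalg.slogdet(Lam_k)[1])
print(np.isclose(scores["a1"], direct), best)
```

The one-time slice Sig_all is shared by all candidates; each evaluation then touches only matrices of the candidate's own (small) dimensions.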

The first method is a good option when the dimension of X^F_k is relatively small. In such a case, equation (49) can be calculated very quickly. When this is not the case, i.e. when the number of focused variables is large, the second technique becomes much faster and is thus preferable over the first.

Remark 4. As we show in Section 4.2, there are cases where the set IX^U_old is identical between all candidate actions. In such cases Σ_k^{IX^U_old|F} can be calculated only once and further re-used by each candidate action.

To summarize this section, the presented technique performs the time-consuming calculations in one computational effort; the results are then used for efficiently evaluating the impact of each candidate action. This concept thus preserves expensive CPU resources of any given autonomous system.

3.5. Connection to the MI approach and theoretical meaning of IG

Mutual information I(a|b) is one additional metric from information theory that is frequently used in the field of information-based decision making. Basically, it encodes the quantity of information about a set of variables a that we would obtain if the values of the variables in the other set b were revealed to us. For example, this metric was used in Davison (2005) and Kaess and Dellaert (2009) to determine the most informative measurements in a measurement selection problem, and more recently in Bai et al. (2016) for information-based active exploration, with both problems being very similar. In addition, it was used in Carlevaris-Bianco et al. (2014) to create a sparse approximation of the true marginalization using a Chow–Liu tree.

Algorithm 2: Pseudo-code for BSP problems requiring both prior marginal and conditional covariance entries: first method.

1  Inputs:
2      A: candidate actions {a_1, a_2, ...}
3      X^F: focused old variables
4  Outputs:
5      a*: optimal action
6  begin:
7      X_All ← union of old involved variables IX_old from each candidate action a_i
8      X_All = X_All ∪ X^F
9      Calculate prior marginal covariance Σ_k^{M,X_All} of variables in X_All (e.g. via the method from Kaess and Dellaert (2009))
10     Calculate (Σ_k^{M,F})^{-1} explicitly, by retrieving Σ_k^{M,F} from Σ_k^{M,X_All} and inverting it
11     for a_i ∈ A do
12         Calculate Σ_k^{IX^U_old|F} via (49), by retrieving the appropriate Σ_k^{M,IX^U_old} and Σ_k^{M,IX^U_old F} from Σ_k^{M,X_All}
13         Calculate IG of X^F using Σ_k^{IX^U_old|F} and the required prior marginal covariances from Σ_k^{M,X_All}
14     end
15     Select candidate a* with maximal IG
16 end

Algorithm 3: Pseudo-code for BSP problems requiring both prior marginal and conditional covariance entries: second method.

1  Inputs:
2      A: candidate actions {a_1, a_2, ...}
3      X^F: focused old variables
4  Outputs:
5      a*: optimal action
6  begin:
7      X_All ← union of old involved variables IX_old from each candidate action a_i
8      Calculate prior marginal covariance Σ_k^{M,X_All} of variables in X_All (e.g. via the method from Kaess and Dellaert (2009))
9      X^U_All ← union of old involved unfocused variables IX^U_old from each candidate action a_i
10     Λ_k^{X^U_old} ← partition of prior information matrix Λ_k that belongs to old unfocused variables X^U_old = X_k \ X^F
11     Σ_k^{X^U_All|F} ← entries of the inverse of Λ_k^{X^U_old} that belong to X^U_All (e.g. via the method from Kaess and Dellaert (2009))
12     for a_i ∈ A do
13         Calculate IG of X^F, by retrieving Σ_k^{M,IX^U_old} from Σ_k^{M,X_All} and Σ_k^{IX^U_old|F} from Σ_k^{X^U_All|F}
14     end
15     Select candidate a* with maximal IG
16 end

In this section we explore the connection between our BSP approach that uses IG (see Section 3.2) and the MI approach that is applied in Davison (2005) and Kaess and Dellaert (2009); we show that the objective functions of both are mathematically identical and calculate exactly the same metric, even though the calculations in our approach are made in a much more efficient way. Moreover, we also present the theoretical meaning of IG, which provides better intuition for equations (23) and (28).

In a MI approach we would like to select the most informative measurements from the available set {z_1, z_2, ...} and also to account for possible measurement correlation. Each candidate measurement has a specific measurement model z_i = h_i(X_k) + υ_i with υ_i ∼ N(0, Ψ_i). The candidate measurements are a priori unknown and can be viewed as random variables whose statistical properties are fully defined by the random state vector X_k and the random noises υ_i, through the measurement models. Combining the candidate measurements with the state vector, we have

W = (X_k, z_1, z_2, ...)^T,  (52)

and similarly to the mentioned papers, it can be shown that the covariance matrix of W is

Σ_W = [ Σ_k        Σ_k·Ã_1^T            Σ_k·Ã_2^T            ···
        Ã_1·Σ_k    Ã_1·Σ_k·Ã_1^T + Ψ_1  Ã_1·Σ_k·Ã_2^T        ···
        Ã_2·Σ_k    Ã_2·Σ_k·Ã_1^T        Ã_2·Σ_k·Ã_2^T + Ψ_2  ···
        ⋮          ⋮                    ⋮                    ⋱ ],  (53)

where Ã_i is the Jacobian of the measurement model function h_i(X_k), not yet combined with the model noise Ψ_i, similarly to Ã defined in (9). The MI approach (Davison, 2005; Kaess and Dellaert, 2009) calculates I(X_k|z_i) for each candidate z_i from Σ_W and selects the candidates with the highest MI.

Now we show that the objective I(X_k|z_i) is mathematically identical to our J_IG(z_i) from Section 3.2.1 (see also (23)). First, note that Σ_W^{X_k|z_i} = (Λ_k + Ã_i^T · Ψ_i^{-1} · Ã_i)^{-1} (easy to check by using the Schur complement from left and from right). Further, the MI for Gaussian distributions can be calculated through covariance matrices as

I(X_k|z_i) = H(X_k) − H(X_k|z_i)
           = ½ ln ( |Σ_W^{M,X_k}| / |Σ_W^{X_k|z_i}| )
           = ½ ln ( |Σ_W^{M,X_k}| / |Σ_W^{M,X_k} − Σ_W^{M,X_k z_i} · (Σ_W^{M,z_i})^{-1} · Σ_W^{M,z_i X_k}| )
           = ½ ln ( |Σ_k| / |Σ_k − Σ_k·Ã_i^T·(Ã_i·Σ_k·Ã_i^T + Ψ_i)^{-1}·Ã_i·Σ_k| ),  (54)

which can be further reduced to I(X_k|z_i) = ½ ln ( |Λ_k + Ã_i^T·Ψ_i^{-1}·Ã_i| / |Λ_k| ). This is exactly the unfocused IG from (23) for the case when the candidate action a_i ≡ z_i introduces a single factor into the factor graph.

While both approaches are obviously calculating the same metric, the computational complexity is not the same. In both Davison (2005) and Kaess and Dellaert (2009), the objective was calculated through (54) and its complexity depended on the dimension of X_k. In contrast, our approach rAMDL does so independently of the state dimension through (24), as shown above, making it more efficient than the MI technique. In addition, Kaess and Dellaert (2009) presented an approach to sequentially select informative measurements that accounts for measurement correlation and redundancy, but without the need to update the state estimation during each decision. In Section 4.1 we present our algorithm Sequential rAMDL, where we combine a similar idea with the rAMDL technique in order to eliminate the need for the marginal covariance calculation at each decision.

Most importantly, from the above equations we can see a conceptually very interesting meaning of the metric that is calculated (IG or MI). Without omitting the noise matrix Ψ from our formulation, we can show that the unfocused IG of a future measurement z is

J_IG(z) = ½ ln |I_m + A·Σ_k·A^T| = ½ ln ( |Ψ + Ã·Σ_k·Ã^T| / |Ψ| ).  (55)

Further, from (53) we see that Σ^z = Ψ + Ã·Σ_k·Ã^T is the covariance matrix of the random z. Thus, we can see that

J_IG(z) = ½ ln |Σ^z| − ½ ln |Ψ| = H(z) − H(υ),  (56)

where υ is the random noise from z's measurement model, with υ ∼ N(0, Ψ). From (56) we see that the IG is exactly the difference between the entropies of the future measurement and of its noise. It can be explained in the following way: as was mentioned previously, the random variable z is fully defined by the random variables X and υ through the measurement model. When z's value is revealed, it obviously provides information about both the state and the noise. The information about the state (the IG) will then be the whole received information (the entropy of the random variable z) minus the information about the noise υ.

From the above we can see that in order for a measurement z to be notably informative, three conditions should apply. First, its noise should have small entropy H(υ), which also follows from general knowledge about measurement estimation. In addition, z should have large entropy H(z), from which we can conclude the second and third conditions: the involved variables IX from the measurement model should have high prior uncertainty (high prior entropy), and their Jacobian IÃ (the Jacobian of the measurement model at the linearization point of IX) should contain high absolute values (the sign does not matter because of the quadratic term of Ã in (55)).

In the same way we can review the equation for the focused IG (28). The first term ½ ln|I_m + A·Σ_k·A^T| measures the amount of information about the whole state X_k, while the second term

½ ln |I_m + A^U · Σ_k^{U|F} · (A^U)^T| = ½ ln ( |Ψ + Ã^U·Σ_k^{U|F}·(Ã^U)^T| / |Ψ| ) = H(z|X^F_k) − H(υ)  (57)

measures the information given that X^F_k was provided, meaning the information about only the unfocused variables. The difference between the total information and the information about only the unfocused variables provides the information about the focused set X^F_k.

Such an interpretation of IG's meaning through the entropy of the future measurement and of its noise can be considered not only for the measurement selection problem, but also for the more general formulation from Section 2, thus constituting a possible direction for future research.
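The chain of identities relating MI (54), the whitened IG form (55), and the entropy difference (56) is easy to verify numerically. The sketch below is ours (illustrative dimensions); the whitened Jacobian is taken as A = L^{-1}·Ã with Ψ = L·L^T.

```python
import numpy as np

rng = np.random.default_rng(4)

# State X_k with prior covariance Sig_k and one candidate measurement
# z = h(X_k) + v, v ~ N(0, Psi); all dimensions are illustrative.
n, m = 4, 2
T = rng.standard_normal((n + 1, n))
Sig_k = np.linalg.inv(T.T @ T + np.eye(n))
A_tilde = rng.standard_normal((m, n))        # unwhitened Jacobian of h
Psi = np.diag(rng.uniform(0.5, 1.5, m))      # measurement noise covariance

ld = lambda M: np.linalg.slogdet(M)[1]

# MI route (53)-(54): build the relevant blocks of the joint covariance
# of W = (X_k, z) and condition X_k on z.
Sig_z = A_tilde @ Sig_k @ A_tilde.T + Psi
Sig_xz = Sig_k @ A_tilde.T
Sig_cond = Sig_k - Sig_xz @ np.linalg.inv(Sig_z) @ Sig_xz.T
mi = 0.5 * (ld(Sig_k) - ld(Sig_cond))

# IG route (55): whiten the Jacobian and evaluate 0.5*ln|I_m + A Sig_k A^T|;
# no conditioning of the full state is needed.
A = np.linalg.inv(np.linalg.cholesky(Psi)) @ A_tilde
ig = 0.5 * ld(np.eye(m) + A @ Sig_k @ A.T)

# Entropy interpretation (56): IG = H(z) - H(v), the Gaussian constants
# cancelling in the difference.
ent_diff = 0.5 * (ld(Sig_z) - ld(Psi))

print(np.isclose(mi, ig), np.isclose(ig, ent_diff))
```

The MI route manipulates matrices of the state dimension, whereas the IG route only touches m-by-m matrices once the relevant marginal covariance entries are available, which is the complexity gap discussed above.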

4. Application to different problem domains is astronomical due to its combinatorial nature. Therefore, typically the problem is solved in a greedy way. In Section 3, we provided an efficient solution for a general We propose a sub-optimal approach where a sequence BSP problem, considering both non-augmented and aug- of decisions must be made instead of one decision. Dur- mented cases. In this section, we discuss various problem . ing each decision we are looking for subset S0, |S0| = c0, domains of (augmented) BSP and show how our approach with c0 locations chosen from locations that were not yet can be applied for each case. More concretely, we focus selected. The optimal S0 is the one that maximizes X’s esti- on sensor deployment (Section 4.1), active SLAM (Sec- mation accuracy. The algorithm ends when the overall set tion 4.2), and graph reduction (Section 4.3), as specific of locations S = {S0 , S0 , . . .} grows to cardinality of c. non-augmented and augmented BSP applications. For the 1 2 Note that the number of locations in each subset, c0, should former, we develop a more computationally efficient variant  be such that the number of S0 candidates, n , is small of our approach. For each case, we first briefly formulate the c0 enough to be evaluated in a realistic time period. Thus, c0 problem and then describe our solution. is scenario-dependent and should be selected manually. More specifically, we assume that until time tk the dis- 4.1. Sensor deployment 0 0 joint subsets {S1, . . . , Sk} of locations were selected, where 0 1 c0 Sensor deployment is one of the most frequently researched each location subset Sj = {lj , . . . , lj } provided measure- 1 c0 problems of decision making. The basic idea is to measure ments Zj = {zj , . . . , zj }. Given these measurements, the a specific metric in domain space such as, e.g., temperature joint pdf at time tk is within a building space. 
The goal is to find the best locations Yk Yc0 for available sensors in order to estimate the metric in the P(X|Z ) ∝ P ( X) P( zi|xi) , (60) entire domain in the most accurate way. 1:k 0 j j j=1 i=1 Typically discretization of the domain space is made due to computation complexity considerations. Thus, we have n where observation model P( zi|xi) is defined in (58). . j j available locations in the space, L ={l , . . . , l }, where sen- MAP estimation of X according to information in (60) 1 n . sors can be deployed. The metric’s values in these locations will provide current state belief bk[X] = P( X|Z1:k) = N (X ∗, 6 ), and following (7) the information matrix of can be modeled as random variables and combined into a k k P P − k c0 state vector: X ={x1, . . . , xn}. 1 i T −1 i bk[X] is 3k = 6k = 30 + j=1 i=1(Hj ) ·(6υ,j,i) ·Hj Putting a sensor at location li will allow us to take mea- i . i where Hj = 5xhj are the Jacobian matrices of observation surement zi at that location, which will provide information model (58) for all measurement terms in (60), linearized about the metric at place xi. Assume that the measurement ∗ about the current estimate X . Note that the belief bk[X] model of a sensor is known and is k can be naturally represented by a factor graph Gk as was explained in Section 3.1 (see also Figure 10). z = h (x ) +υ , υ ∼ N (0, 6 ) . (58) i i i i i υ,i The next decision requires us to select next candidate 0 In addition, correlation between different locations may action a: a location subset Sk+1 that will minimize pos- be known a priori. Such prior can be presented as X’s joint terior uncertainty. Therefore, candidate space contains all 0 0 0 0 0 distribution, P ( X). Assuming that it is Gaussian, it may be subsets of the form S ⊆ L \{S1 ∪···∪ Sk} and |S | = c . 0 0 represented as (Indelman, 2016; Krause et al., 2008; Zhu Each such candidate subset a ≡ S0 = {l1, . . . 
, lc } will pro- 0 and Stein, 2006; Zimmerman, 2006) vide future measurements Z0 ={z1, . . . , zc } and thus future belief bk+1[X] and its information matrix will be X ∼ P ( X) = N ( µ, 6 ) = N −1( η, 3 ) . (59) 0 0 0 Yc0 P 0 P i i Note that, in practice, in typical sensor deployment prob- bk+1[X] = ( X|Z1:k, Z ) ∝ bk[X] ( z |x ), i=1 lems 30 is not actually available and 60 is used instead. 0 Nevertheless, in further formulation we assume that 3 was Xc 0 i T −1 i −1 3k+1 = 3k + (H ) ·(6υ,i) ·H . (61) calculated a priori (as 60 ) and therefore is available to us. Finding the best sensor locations in order to estimate i=1 0 the metric in the most accurate way is another instance of Thus, the candidate S introduces to Gk the factor set F(a), information-based non-augmented BSP and therefore can which contains exactly c0 factors. Each of the factors is con- be viewed through a prism of factor graphs (see Figure 10) nected to one variable: the xi that represents location of as we show below. factor’s sensor (see Figure 10). Conceptually, the space of candidate actions in a sensor Similarly to the general formulation in Section 2, stack- deployment setting contains all subsets of possible sensor ing all new Jacobians in the above equation together into locations S ⊆ L with the usual constraint on cardinality a single matrix and combining all noise matrices into of subset S, |S| ≤ c, to represent that the number of sen- a block-diagonal one will lead to (9). Hence, the opti- sors is limited. However, considering all subsets of size cis mal candidate subset S0 will be the one that maximizes n usually unrealistic as the number of all possible subsets c IG from (15). Dmitry Kopitkov et al. 21

Fig. 10. Illustration of belief propagation in factor graph representation: sensor deployment scenario. The space is discretized through a grid of locations L ≐ {l_1, ..., l_9}. Factor f_0 within the prior factor graph G_k represents our prior belief about the state vector, P_0(X). Factors f_1–f_3 represent measurements taken from sensors deployed at locations l_1, l_2, and l_4. Two actions a_i = {l_3, l_6} and a_j = {l_7, l_8} are considered, introducing into the graph two new factor sets F(a_i) and F(a_j), respectively (colored in green). In this example the value of c' is 2.

Note that the block-columns of the Jacobian matrix A ∈ R^{m×n} from (9) represent all possible sensor locations and the block-rows represent the new c' measurement factors from (61). As was mentioned before, only involved variables will have non-zero values in their block-columns. It is not difficult to show that in the sensor deployment problem the involved variables are the x_i that belong to locations in subset S'; the block-columns of all other variables in A will be zeros.

The rest of the problem definition (objective functions, unfocused and focused settings) for the sensor deployment problem is identical to the general formulation. In particular, in the unfocused setting the optimal S'_{k+1} will be found through

S'_{k+1} = argmax_{S' ⊆ X\{S'_1,...,S'_k}, |S'|=c'}  J_IG(S') = ½ ln |I_m + A_{S'} · Σ_k · A_{S'}^T|,  (62)

where A_{S'} is the Jacobian matrix of candidate S'.

Solution: Sequential rAMDL. The above problem can be straightforwardly solved using the rAMDL approach, through (24) and (30). However, for each sequential decision the marginal covariance should be calculated for the set of variables involved in any of the candidate actions, and it is not difficult to show that this set will contain all as-yet unoccupied locations. In scenarios with a high number of possible sensor locations, this can negatively affect the overall time performance.

Here we present an enhanced approach, Sequential rAMDL, that performs the same sub-optimal sequence of decisions as described above, but uses only the prior covariance matrix Σ_0, without recalculating covariance entries after each decision. Such an approach gives an approximated solution (compared with the sub-optimal sequence of decisions described above), but without paying computational resources for expensive manipulation of high-dimensional matrices.

The first decision will be performed in exactly the same way: we will look for the best subset S'_1 of size c' that maximizes the IG (62), for the unfocused case. However, upon finding such a subset, the estimation solution of the system will not be updated with the measurements from the new sensors. Instead, in each subsequent decision we will look for a subset S'_{k+1} that maximizes the following objective:

S'_{k+1} = argmax_{S' ⊆ X\{S'_1,...,S'_k}, |S'|=c'}  J_IG(S̃) = ½ ln |I_m + Ã_S · Σ_0 · Ã_S^T|,
Ã_S = [ A_{S'_1} ; ⋮ ; A_{S'_k} ; A_{S'} ],  (63)

where S̃ ≐ {S'_1, ..., S'_k, S'}, and Ã_S is a matrix with all the appropriate Jacobians stacked together.

Note that sequential decision making through (63) will yield an exact solution, compared with sequential decision making through (62), if the Jacobian matrices H^i (equation (61)) do not change after acquiring measurements from the newly deployed sensors. This is the case, for instance, when the linearization point X*_k remains the same or when the measurement model (58) is linear with respect to x_i (i.e. z_i = x_i + ν_i). Otherwise, equation (63) will merely be an approximation of the above approach.
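A minimal sketch of the Sequential rAMDL idea for a linear measurement model is given below. The kernel prior, grid, and c' = 1 per decision are our illustrative choices; only Σ_0 is ever accessed, and for the linear case the stacked objective (63), after subtracting the IG already banked by past decisions, coincides with objective (62) evaluated with the explicitly updated posterior covariance.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy sensor-deployment instance (illustrative): n candidate locations
# with a smooth Gaussian-kernel prior, direct linear measurements
# z_i = x_i + v_i with unit noise, and c' = 1 location per decision.
n, rounds = 12, 3
pts = np.sort(rng.uniform(0.0, 1.0, n))
Sig_0 = np.exp(-10.0 * (pts[:, None] - pts[None, :]) ** 2) + 1e-6 * np.eye(n)

ld = lambda M: np.linalg.slogdet(M)[1]

def stacked_ig(past, cand):
    # Objective (63): 0.5*ln|I + A_S Sig_0 A_S^T| with the unit-row
    # Jacobians of all previous decisions stacked above the candidate's;
    # only the prior covariance Sig_0 is touched.
    idx = past + [cand]
    return 0.5 * ld(np.eye(len(idx)) + Sig_0[np.ix_(idx, idx)])

selected = []
for _ in range(rounds):
    remaining = [i for i in range(n) if i not in selected]
    selected.append(max(remaining, key=lambda c: stacked_ig(selected, c)))

# Exactness check for the linear model: stacked IG minus the banked IG of
# past decisions equals 0.5*ln(1 + Sig_k[c,c]) with the explicitly
# updated posterior covariance Sig_k, i.e. objective (62).
idx = selected
Sig_k = Sig_0 - Sig_0[:, idx] @ np.linalg.inv(
    np.eye(len(idx)) + Sig_0[np.ix_(idx, idx)]) @ Sig_0[idx, :]
banked = 0.5 * ld(np.eye(len(idx)) + Sig_0[np.ix_(idx, idx)])
for c in range(n):
    if c not in selected:
        assert np.isclose(stacked_ig(selected, c) - banked,
                          0.5 * np.log(1.0 + Sig_k[c, c]))
print(selected)
```

The per-candidate determinants here grow with the number of past decisions, which is the modest price paid for never recomputing covariance entries.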

j j j j After looking into (63) one can see that matrix inside is in the form of a general factor model ri = hi(Xi ) +υi from actually (3) by moving the left side to the right:   VS0 YS0 ,S0 ··· YS0 ,S0 ¯  1 1 2 1  0=f (xi, ui)−xi+1 +ωi =f (xi, xi+1) +ωi , ωi ∼ N (0, 6ω,i). YS0 ,S0 VS0 ··· YS0 ,S0 T  2 1 2 2  (67) Ie + Ae · 6 · A =   , m S 0 eS  . . . .  . . .. . The joint pdf for the SLAM problem at time tk (or current belief ) is then Y 0 0 Y 0 0 ··· VS0 S ,S1 S ,S2 . T . T Yk V 0 = I + A 0 · 6 ·(A 0 ) , Y 0 0 = A 0 · 6 · A 0 , S m S 0 S Si,Sj Si 0 S j b[Xk] = P( Xk|Z0:k, u0:k−1) ∝ P( x0) (64) i=1   Yni where V 0 and Y 0 0 can be efficiently calculated (indepen- P( x |x , u ) P( z |x , l ) , (68) S Si,Sj i i−1 i−1 i,j i j dently of state dimension) due to the sparsity of Jacobians. j=1 Moreover, after 60 is calculated (or given) at the beginning where P( x0) is a prior on the robot’s first pose, Zi = of the algorithm, all its required entries are freely accessible {zi,1, ... , zi,n } represents all observations at time ti, with through the entire runtime of the algorithm. i ni being the number of such observations. The motion It can be seen that all diagonal matrices VS0 were already and observation models P( xi|xi−1, ui−1) and P( zi,j|xi, lj) are calculated during the first decision and can be kept and re- defined by (65) and (66). A factor graph representation, used. In addition, all the correlation matrices YS0,S0 (except i j considering for simplicity only two landmarks l1 and l2, for Y 0 0 ) were calculated in previous decisions. The only Sk ,S is shown in Figure 11. Performing MAP inference over 0 ∗ required calculation in every decision for each candidate S the belief b[Xk], one can write b[Xk] = N (Xk , 6k), with ∗ is the matrix Y 0 0 and determinant of the combined matrix. appropriate mean vector X and covariance matrix 6 . 
Sk ,S k k Our unfocused Sequential rAMDL approach can be The space of candidate actions in SLAM setting contains seen as providing a little increase in the per-candidate cal- all control sequences uk+1:k+L−1, where L is the planning culation in order to escape the necessity of prior covariance horizon and can vary between different candidates. Typi- calculation for each decision, similarly to the method of cally a finite set of candidates is pooled from this infinite sequential informative measurements selection presented in space according to their relevance to robot’s current desti- Kaess and Dellaert (2009). This approach can be a good nation or to loop-closure maneuver, for example through alternative to the rAMDL technique when one-time calcula- simulation (Stachniss et al., 2005) and sampling (Agha- tion part of rAMDL (Section 3.4) is more time-consuming Mohammadi et al., 2014; Prentice and Roy, 2009). Similar . than the part of candidates evaluation, as will be shown to (6), future belief b[Xk+L] = P( Xk+L|Z0:k+L, u0:k+L−1) for in our simulations. The focused Sequential rAMDL particular candidate action a = uk+1:k+L−1 can be explicitly approach is also possible, by following similar derivations. written as   Moreover, the same idea is applicable to other sequential kY+L Ynl domains such as the measurement selection problem. b[Xk+L] ∝ b[Xk] P( xl|xl−1, ul−1) P( zl,j|xl, lj) , l=k+1 j=1 4.2. Augmented BSP in unknown environments (69) where Xk+L is the state vector at the Lth lookahead step. It In this section, we discuss a specific case of the augmented contains all variables from the current state vector Xk and BSP problem from Section 2, considering a SLAM setting. is augmented by new robot poses Xnew = {xk+1, ... , xk+L}. Such a specification provides the reader with an illustrative Also note that in (69) we consider only new observations example of the augmented BSP problem for better intuition. of landmarks that were already mapped until time tk. 
It is Let us refine the definition. In the smoothing formulation also possible to reason about observing not as-yet mapped of visual SLAM the state vector Xk represents robot poses landmarks (Indelman, 2015a), but it is outside the scope of per each time step, {x , ... , x }, and landmarks mapped . 0 k this paper. until now, L = {l , ... , l }. Further, we model robot k 1 nk Following the model from Section 3.1, the candi- motion dynamics and sensor observations through new date’s factor graph G(a) =( F (a), Xnew, Enew) will con- tain all new robot poses connected by motion model fac- xi+1 = f (xi, ui) +ωi, ωi ∼ N (0, 6ω,i) (65) new M M tors F (a) = {fk+1, ... , fk+L−1} with appropriate motion ¯ ¯ N models {f (xk+1, xk+2), ... , f (xk+L−1, xk+L) }, whereas fac- zi,j = h(xi, lj) +υi,j, υi,j ∼ (0, 6υ,i,j) , (66) conn tors from F (a), which connect old variables Xk and where ui is control at time ti, zi,j represents observation of new variables Xnew, will contain one motion model fac- M ¯ landmark lj by a robot from position xi at time ti, and where tor fk (with motion model f (xk, xk+1)) and all of obser- ωi and υi,j are the motion and measurement noises, respec- vation model factors connecting new poses with observed tively. Note that the motion model can be easily presented landmarks (see Figure 11). Dmitry Kopitkov et al. 23
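The block re-use scheme of (64) from Section 4.1 above can be illustrated with a small numpy sketch: the matrices V and Y are cached once, and each subsequent candidate evaluation only assembles cached blocks and takes one determinant. All names and dimensions here are illustrative, not taken from the paper's implementation.

```python
import numpy as np

def cache_blocks(A_list, Sigma0):
    """One-time caching of V_i = I + A_i Sigma0 A_i^T and
    Y_ij = A_i Sigma0 A_j^T, as in (64)."""
    V = [np.eye(A.shape[0]) + A @ Sigma0 @ A.T for A in A_list]
    Y = {(i, j): A_list[i] @ Sigma0 @ A_list[j].T
         for i in range(len(A_list)) for j in range(len(A_list)) if i != j}
    return V, Y

def combined_logdet(idx, V, Y):
    """Assemble the block matrix of (64) for the chosen subset `idx`
    from cached blocks and return its log-determinant, without
    touching Sigma0 again."""
    rows = [np.hstack([V[i] if i == j else Y[(i, j)] for j in idx])
            for i in idx]
    return np.linalg.slogdet(np.vstack(rows))[1]
```

For a subset S' = {S'_1, S'_2}, the result equals ln|I_m + A_{S'} Σ_0 A_{S'}^T| computed directly, but the per-candidate cost no longer depends on the state dimension.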

Fig. 11. Illustration of belief propagation in factor graph representation: SLAM scenario. Nodes x_i represent robot poses, while nodes l_i represent landmarks. Factor f_0 is the prior on the robot's initial position x_1; factors between robot poses represent the motion model (65); factors between a pose and a landmark represent the observation model (66). Two actions a_i and a_j are considered, performing loop closures to re-observe landmarks l_1 and l_2, respectively. Both actions introduce their own factor graphs G(a_i) and G(a_j) (colored in pink), which are connected to the prior graph G_k through the factor sets F^conn(a_i) and F^conn(a_j) (colored in green), respectively.
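The partition of a candidate's factors that the figure depicts, F^new(a) versus F^conn(a), amounts to a simple membership test: a factor belongs to F^conn(a) iff it involves at least one old variable. A toy sketch (the variable and factor names are illustrative only):

```python
# A toy partition of a candidate's factors into F^new(a) and F^conn(a):
# a factor is "connecting" iff it involves at least one old variable.
old_vars = {"x_k", "l1", "l2"}
factors = {
    "f_k^M":   {"x_k", "x_k+1"},     # motion factor linking old and new poses
    "f_k+1^M": {"x_k+1", "x_k+2"},   # motion factor among new poses only
    "h_1":     {"x_k+1", "l1"},      # observation of an already-mapped landmark
    "h_2":     {"x_k+2", "l2"},
}
F_conn = {f for f, vs in factors.items() if vs & old_vars}
F_new = set(factors) - F_conn
```

Here F_conn comes out as {f_k^M, h_1, h_2} and F_new as {f_{k+1}^M}, matching the structure described in the text.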

Following the general formulation, the posterior information matrix of the belief b[X_{k+L}], i.e. Λ_{k+L}, can be constructed by first augmenting the current information matrix Λ_k ≡ Σ_k^{−1} with L zero block-rows and block-columns, each block having the dimension n_p of a robot pose variable, to obtain Λ^Aug_{k+L} ∈ R^{N×N} with N = n + L·n_p, and thereafter adding new information to it, as illustrated in Figure 1 (see e.g. Indelman et al., 2015):

    Λ_{k+L} = Λ^Aug_{k+L} + ∑_{l=k+1}^{k+L} [ F_l^T · Σ_{ω,l}^{−1} · F_l + ∑_{j=1}^{n_l} H_j^T · Σ_{υ,l,j}^{−1} · H_j ],    (70)

where F_l ≐ ∇_x f and H_j ≐ ∇_x h are augmented Jacobian matrices of all new factors in (69) (motion and observation terms all together), linearized about the current estimate of X_k and about the initial values of the newly introduced robot poses. Again, after stacking together all new Jacobians in the above equation and combining all noise matrices into a block-diagonal matrix, we obtain the same posterior information expression as in (10).

Note that the block-columns of the matrix A ∈ R^{m×N} from (10) represent all old robot poses, the landmarks mapped until now, and the new robot poses of the L-step horizon, while A's block-rows represent the new motion and observation factors from (69). As mentioned before, only involved variables have non-zero values in their block-columns. It is not difficult to see that in SLAM the involved variables are: all new robot poses, the current robot pose x_k, and all landmarks that will be observed under the current candidate action. The block-columns of all other variables in A are zero.

The rest of the problem definition (objective functions, unfocused and focused settings) for the active SLAM problem is identical to the general formulation in Section 2.

Solution: rAMDL applied to SLAM. The augmented BSP problem for the SLAM case, described in the previous section, can be naturally solved by our general approach from Section 3.3. However, we go one step further and provide a solution tailored specifically to the SLAM domain, as an example of applying rAMDL to a real problem and in order to expose the underlying structure of the SLAM solution.

First, let us model the informative partitions of the Jacobian matrices B and D from (34), namely ᴵB_old, B_new, and D_new (see also Figure 7), for one candidate action a. As mentioned above, the factors of the action's factor graph G(a), F^new(a), contain all new motion model factors from (69) except for factor f^M_k. Therefore, D_new has the following form, with block-rows corresponding to the factors f̄(x_{k+1}, x_{k+2}), ..., f̄(x_{k+L−1}, x_{k+L}) and block-columns to the new poses x_{k+1}, ..., x_{k+L}:

    D_new = Ψ_new^{−1/2} ·
        [ F_{k+1}   −I        0         ···        0  ]
        [ 0         F_{k+2}   −I        ···        0  ]
        [ ⋮                   ⋱         ⋱          ⋮  ]    = Ψ_new^{−1/2} · D̃_new,    (71)
        [ 0         0         ···   F_{k+L−1}     −I  ]

where F_{k+l} ≐ ∇_x f |_{x=x_{k+l}} is the Jacobian of the motion model function f from (65) with respect to x_{k+l}, and −I is the Jacobian of f̄ from (67) with respect to the second pose, which is simply an identity matrix of dimension equal to that of the robot pose. The matrix Ψ_new is block-diagonal, combining all noise matrices of the F^new(a) factors. In addition, we denote by D̃_new the Jacobian entries of D_new not weighted by the factors' noise Ψ_new.

Assume that, following a's controls, the set of landmarks L_a ⊆ L_k will be observed. In addition, define the set of new observation factors F^obs(a) as

    F^obs(a) = { factor f_i with observation model h_i(x, l) : x ∈ X_new, l ∈ L_a, 1 ≤ i ≤ n_o },    (72)

where n_o is the number of such factors. Thus, the connecting factors are F^conn(a) = {f^M_k, F^obs(a)}, and the involved old variables are ᴵX_old = {L_a, x_k}, with x_k included because of the first factor's motion model f̄(x_k, x_{k+1}). Therefore, ᴵB_old and B_new are, with block-columns ordered as (L_a, x_k, x_{k+1}, ..., x_{k+L}) and block-rows as (f̄(x_k, x_{k+1}), h_1, ..., h_{n_o}):

    [ ᴵB_old  B_new ] = Ψ_conn^{−1/2} ·
        [ 0               F_k   −I                  0    ···   0                  ]
        [ H^{L_a}_1       0     H^{x_{k+1}}_1            ···   H^{x_{k+L}}_1     ]
        [ ⋮               ⋮     ⋮                              ⋮                  ]    (73)
        [ H^{L_a}_{n_o}   0     H^{x_{k+1}}_{n_o}        ···   H^{x_{k+L}}_{n_o} ]

    ᴵB_old = Ψ_conn^{−1/2} ·
        [ 0               F_k ]
        [ H^{L_a}_1       0   ]    = Ψ_conn^{−1/2} · [ 0        F_k ] ,    H^{L_a} = [ H^{L_a}_1 ]
        [ ⋮               ⋮   ]                      [ H^{L_a}  0   ]                [ ⋮         ]    (74)
        [ H^{L_a}_{n_o}   0   ]                                                      [ H^{L_a}_{n_o} ]

    B_new = Ψ_conn^{−1/2} ·
        [ −I                 0     ···   0                  ]
        [ H^{x_{k+1}}_1            ···   H^{x_{k+L}}_1      ]    = Ψ_conn^{−1/2} · [ F̄         ] ,
        [ ⋮                              ⋮                  ]                      [ H^{X_new}  ]
        [ H^{x_{k+1}}_{n_o}        ···   H^{x_{k+L}}_{n_o}  ]

    F̄ = [ −I  0  ···  0 ],    H^{X_new} = [ H^{x_{k+1}}_1     ···   H^{x_{k+L}}_1     ]
                                          [ ⋮                 ⋱     ⋮                 ]    (75)
                                          [ H^{x_{k+1}}_{n_o} ···   H^{x_{k+L}}_{n_o} ]

where H^{L_a}_i ≐ ∇_{L_a} h_i is the Jacobian of the ith observation factor h_i from (66) with respect to the variables L_a, and thus only one of its block-columns, the one corresponding to the observed landmark, is non-zero. Here H^{x_{k+l}}_i ≐ ∇_{x_{k+l}} h_i is the Jacobian of h_i with respect to x_{k+l}, and therefore is non-zero only if the factor's observation was taken from pose x_{k+l}. The matrix Ψ_conn is block-diagonal, combining all noise matrices of the F^conn(a) factors.

As can be seen from the above, the Jacobian matrices ᴵB_old, B_new, and D_new are sparse and can be efficiently manipulated. More specifically, the information matrix of the factor graph G(a), Λ_a = D_new^T · D_new, which is required in our approach, can be calculated quickly as a product of sparse matrices, Λ_a = D̃_new^T · Ψ_new^{−1} · D̃_new, due to the formulation in (71); in addition, it can be shown to be singular and block-tridiagonal.

The matrix C_1 from (41) can also be reduced to the following form:

    C_1 = I_{m_conn} + Ψ_conn^{−1/2} · [ F_k · Σ_k^{M,x_k} · F_k^T                F_k · Σ_k^{M,{x_k/L_a}} · (H^{L_a})^T ]
                                       [ H^{L_a} · Σ_k^{M,{L_a/x_k}} · F_k^T     H^{L_a} · Σ_k^{M,L_a} · (H^{L_a})^T   ] · Ψ_conn^{−1/2}

        = Ψ_conn^{−1/2} · [ Ψ_conn + ( F_k · Σ_k^{M,x_k} · F_k^T                F_k · Σ_k^{M,{x_k/L_a}} · (H^{L_a})^T ) ] · Ψ_conn^{−1/2}
                          [           ( H^{L_a} · Σ_k^{M,{L_a/x_k}} · F_k^T     H^{L_a} · Σ_k^{M,L_a} · (H^{L_a})^T   ) ]

        = Ψ_conn^{−1/2} · C_2 · Ψ_conn^{−1/2},    (76)

    C_2 = Ψ_conn + [ F_k · Σ_k^{M,x_k} · F_k^T                F_k · Σ_k^{M,{x_k/L_a}} · (H^{L_a})^T ]
                   [ H^{L_a} · Σ_k^{M,{L_a/x_k}} · F_k^T     H^{L_a} · Σ_k^{M,L_a} · (H^{L_a})^T   ],    (77)

where Σ_k^{M,{x_k/L_a}} is the prior cross-covariance between the variables x_k and L_a.
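The structural claims above, namely that Λ_a = D̃_new^T · Ψ_new^{−1} · D̃_new is singular and block-tridiagonal, are easy to verify numerically on a toy horizon. The pose dimension, horizon length, and random Jacobians below are placeholders, not values from the paper:

```python
import numpy as np

np.random.seed(0)
p, L = 3, 4                                  # pose dimension, horizon (toy values)
# D_tilde: one block-row [F_{k+l}, -I] per motion factor, as in (71)
D = np.zeros(((L - 1) * p, L * p))
for l in range(L - 1):
    D[l*p:(l+1)*p, l*p:(l+1)*p] = np.random.randn(p, p)   # stand-in for F_{k+l}
    D[l*p:(l+1)*p, (l+1)*p:(l+2)*p] = -np.eye(p)
Psi_inv = np.linalg.inv(0.5 * np.eye((L - 1) * p))        # block-diagonal noise, inverted
Lambda_a = D.T @ Psi_inv @ D                              # information matrix of G(a)
```

Λ_a has rank at most (L−1)·p < L·p, hence it is singular, and all blocks (i, j) with |i − j| > 1 vanish, confirming the block-tridiagonal structure.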

In addition, C_1's determinant and its inverse can be calculated through

    |C_1| = |C_2| / |Ψ_conn|,    C_1^{−1} = Ψ_conn^{1/2} · C_2^{−1} · Ψ_conn^{1/2}.    (78)

Next, we can calculate the term B_new^T · C_1^{−1} · B_new from (39) as

    B_new^T · C_1^{−1} · B_new
        = [ F̄^T  (H^{X_new})^T ] · Ψ_conn^{−1/2} · Ψ_conn^{1/2} · C_2^{−1} · Ψ_conn^{1/2} · Ψ_conn^{−1/2} · [ F̄ ; H^{X_new} ]
        = [ F̄^T  (H^{X_new})^T ] · C_2^{−1} · [ F̄ ; H^{X_new} ] = B̃_new^T · C_2^{−1} · B̃_new,    (79)

where B̃_new = [ F̄ ; H^{X_new} ] contains the Jacobian entries of B_new not weighted by the factors' noise Ψ_conn. Then, the unfocused IG objective from (16) in the SLAM setting is given by

    J_IG(a) = (n'·γ)/2 − (1/2) ln|Ψ_conn| + (1/2) ln|C_2| + (1/2) ln| B̃_new^T · C_2^{−1} · B̃_new + D̃_new^T · Ψ_new^{−1} · D̃_new |.    (80)

Above we have shown in detail how our rAMDL approach can be applied to information-based SLAM planning problems. The derived equation (80) is very similar to the general solution (39) and has exactly the same runtime complexity. However, within both (80) and (77) we can see a clear separation between the noise of the factor models and the actual Jacobian entries. Such a separation can provide further theoretical insight into how the different terms of the SLAM problem affect the information impact of a candidate action a = u_{k:k+L−1}. Moreover, it can provide a good starting point for deriving J_IG(a)'s gradient with respect to u_{k:k+L−1}, which, in turn, can be used in gradient-descent algorithms that search for locally optimal controls (Indelman et al., 2015; Van Den Berg et al., 2012). Note that the variable ordering in the above equations serves only for visualization; the derivation remains valid for an arbitrary variable ordering.

In addition, for the sake of completeness, we also provide a SLAM-specific solution for focused cases, where we consider reducing the entropy of either the last pose (X^F_{k+L} ≡ x_{k+L}) or all the mapped landmarks (X^F_k ≡ L_k). The corresponding derivations can be found in Appendices A.7 and A.8.

4.3. Graph reduction

It is a known fact that in long-term SLAM applications, the state dimension of smoothing techniques can grow unboundedly. In such cases, even the most efficient state-of-the-art estimation algorithms, such as iSAM2 (Kaess et al., 2012), can become slow and will not support online operation. Approaches such as graph reduction and graph sparsification tackle the problem by reducing the number of variables (Ila et al., 2010; Kretzschmar and Stachniss, 2012; Paull et al., 2016) and by sparsifying entries of the information matrix (Carlevaris-Bianco et al., 2014; Huang et al., 2012; Mazuran et al., 2014; Vial et al., 2011), respectively.

Graph reduction requires us to first select nodes to expel. Having a state vector X with variables {x_1, ..., x_n}, it would be logical to remove the most uncertain node, say x_i, without which the rest of the variables X_i = X \ {x_i} would have the smallest entropy H(X_i). In this section we outline a new approach for such a selection, which is closely related to our rAMDL technique.

Similarly to the focused objective function from (17), the best choice of expelled variable x*_i among the state variables minimizes the following objective function:

    x*_i = arg min_{x_i ∈ X} J_GR(x_i),    J_GR(x_i) = H(X_i) = ((n − n_x)·γ)/2 − (1/2) ln|Λ^{M,X_i}|,    (81)

where H(X_i) is the entropy of the state variables without x_i, and n_x is x_i's dimension.

Using Equation (26) from our approach in order to calculate |Λ^{M,X_i}|, we can reduce our objective function to

    J_GR(x_i) = ((n − n_x)·γ)/2 − (1/2) ln|Λ| + (1/2) ln|Λ^{x_i}|,    (82)

where Λ is the information matrix of the whole X, and Λ^{x_i} is its partition related to variable x_i.

Given that all x_i variables have the same dimension n_x, we can conclude that the optimal x*_i also minimizes

    x*_i = arg min_{x_i ∈ X} ln|Λ^{x_i}|,    (83)

which practically implies calculating the determinant of every partition Λ^{x_i} and choosing the state variable x_i with the minimal determinant value. In cases where all x_i are scalars, Λ^{x_i} is just a value on the diagonal of the information matrix Λ. In cases where x_i's dimension is n_x, we have to calculate the determinants of n matrices, each of dimension n_x × n_x. Taking into account that n_x is usually small (e.g. a 3D pose has six dimensions), the overall calculation is very fast, just O(n).

5. Alternative approaches

We compare the presented rAMDL approach with two alternatives, namely the From-Scratch and iSAM techniques. In From-Scratch, the posterior information matrix Λ_{k+L} is computed by adding the new information A^T · A, followed by the calculation of its determinant. In the focused scenario, the marginal information matrix of X^F_{k+L} is retrieved through the Schur complement performed on Λ_{k+L}, and its determinant is then computed.

The second alternative uses the iSAM algorithm (Kaess et al., 2012) to incrementally update the posterior. Here the

(linearized) system is represented by a square root information matrix R_k, which is encoded, while exploiting sparsity, by the Bayes tree data structure. The posterior matrix R_{k+L} is acquired (e.g. via Givens rotations (Kaess et al., 2012) or another incremental factorization update method), and then the determinant is calculated as |Λ_{k+L}| = ∏_{i=1}^{N} r_ii^2, with r_ii being the ith entry on the diagonal of the triangular R_{k+L}. For the focused case, the marginal covariance matrix of X^F_{k+L} is computed by recursive per-entry covariance equations (Kaess and Dellaert, 2009) that exploit the sparsity of the matrix R_{k+L}.

While the iSAM technique outperforms the batch From-Scratch, it still requires calculating R_{k+L} for each action, which can be expensive, particularly at loop closures, and requires a copy/clone of the original matrix R_k. In contrast, in rAMDL the per-candidate action calculation (e.g. in (37)) has constant complexity in general, given the prior marginal covariance terms that are calculated only once.

6. Computational complexity analysis

In this section, we analyze the computational complexity of the family of rAMDL algorithms developed herein, and of the alternative approaches from Section 5. We summarize the runtime complexity of the different approaches in Table 4.

The computational complexity of the rAMDL algorithms consists of two parts. First, there is a one-time computation of the prior covariance entries Σ_k^{M,ᴵX_old} (and sometimes Σ_k^{ᴵX_old|F}, see Section 3.4). Second, there is a per-candidate computation of an approach-specific objective function. Therefore, the overall computational complexity is O(one-time complexity) + O(per-action complexity × number of candidate actions).

The complexity of both the focused and unfocused From-Scratch approaches is governed by the term O(N^3), with N being the posterior state dimension. In the unfocused case, the From-Scratch method calculates a determinant of dimension N for each candidate action. In the focused case, the From-Scratch method calculates a Schur complement per candidate, which is typically more computationally expensive than the computation of a matrix determinant.

The iSAM technique propagates the posterior belief per candidate action and then evaluates the appropriate posterior entropy (unfocused or focused). This belief propagation is done efficiently through a Bayes tree, yet its runtime complexity is difficult to analyze. It is stated in Kaess et al. (2012) that typically belief propagation takes O(N^1.5). In the unfocused case, |Λ_{k+L}| is calculated in O(N) through the diagonal entries of R_{k+L}, yielding a final per-candidate complexity of O(N^1.5) for the iSAM Unfocused approach. In the focused case, the marginal covariance matrix of the focused variables X^F_{k+L} is computed via the method from Kaess and Dellaert (2009), which is O(N_nz^2 · N), where N is the posterior state dimension and N_nz is the number of non-zero entries in the posterior square root information matrix R_{k+L}. Further, the determinant of the marginal covariance matrix is computed, which takes O(|X^F|^3). In total, the iSAM Focused approach has per-candidate complexity O(N^1.5 + N_nz^2 · N + |X^F|^3).

From Table 4 we can see that the per-candidate complexity of rAMDL does not depend on the state dimension, while the complexity of both iSAM and From-Scratch does. For example, in the unfocused case, rAMDL requires O(M(F(a))^3), while iSAM and From-Scratch need O(N^1.5) and O(N^3), respectively. Since M(F(a)) represents the total dimension of the factors newly introduced by candidate action a, it is typically considerably smaller than the posterior state dimension N, i.e. M(F(a)) ≪ N. This difference makes the rAMDL technique significantly faster, as will be shown in Section 7.

The prior covariance Σ_k^{M,ᴵX_old} is calculated from the square root information matrix R_k using a recursive method, as described in Golub and Plemmons (1980) and Kaess and Dellaert (2009). Its complexity is bounded by O(n_nz^2 · n), where n is the state dimension and n_nz is the number of non-zero entries in R_k. However, typically the complexity depends on the variable ordering within R_k and is much lower. For a more detailed complexity analysis of this covariance recovery method, see Kaess and Dellaert (2009) and Ila et al. (2015). For cases where Σ_k^{ᴵX_old|F} is also required, we consider the method described in Algorithm 2, which calculates the prior conditional covariance entries through the Schur complement. Its complexity is bounded by the inverse calculation (Σ_k^{M,F})^{−1}, which is O(|X^F|^3), with |X^F| being the dimension of the focused variables.

The per-action complexity of the rAMDL algorithms was analyzed in Sections 3.2–3.3, next to the equation definitions of each approach, and is summarized in Table 4.

Note that we do not analyze the BSP complexity in terms of the horizon lag of a candidate action. Instead, we use the dimensions of the different components of the prior and posterior factor graphs, since these give better insight into the real complexity of the factor-graph-based algorithms presented herein.

7. Results

In this section, we evaluate the performance of the proposed approach and compare it with alternative approaches, considering unfocused and focused instantiations of several fundamental problems: sensor deployment, measurement selection, and autonomous navigation in unknown environments. In sensor deployment, each candidate action represents a set of possible locations for deploying a sensor, with a

Table 4. The presented approaches and their alternatives, along with the corresponding runtime complexity. Used symbols are: n is the dimension of the prior state vector X_k; n_nz is the number of non-zero entries in the prior square root information matrix R_k; n' is the dimension of the variables X_new newly introduced by candidate action a; N = n + n' is the dimension of the posterior state vector X_{k+L}; N_nz is the number of non-zero entries in the posterior square root information matrix R_{k+L}; F(a) represents the factors newly introduced by candidate action a; M(F(a)) is the total dimension of the newly introduced factors F(a); F^conn(a) represents the subset of factors from F(a) that involve at least one old variable from X_k; M(F^conn(a)) is the total dimension of the factors in F^conn(a). Further details can be found in Section 6.

Approach | Per-action complexity | One-time complexity

Non-augmented BSP rAMDL approaches
rAMDL Unfocused, equation (24) | O(M(F(a))^3) | O(n_nz^2 · n)
rAMDL Focused, equation (30) | O(M(F(a))^3) | O(n_nz^2 · n + |X^F|^3)

Augmented BSP rAMDL approaches
rAMDL Unfocused, equations (37), (38) | O(M(F(a))^3) | O(n_nz^2 · n)
rAMDL-Extended Unfocused, equations (39), (41) | O(M(F^conn(a))^3 + n'^3) | O(n_nz^2 · n)
rAMDL Focused New, equations (42), (38) | O(M(F(a))^3 + n'^3) | O(n_nz^2 · n)
rAMDL-Extended Focused New, equations (44), (41) | O(M(F^conn(a))^3 + n'^3) | O(n_nz^2 · n)
rAMDL Focused Old, equations (45), (46) | O(M(F(a))^3 + n'^3) | O(n_nz^2 · n + |X^F|^3)
rAMDL-Extended Focused Old, equations (47), (48) | O(M(F^conn(a))^3 + n'^3) | O(n_nz^2 · n + |X^F|^3)

Alternative approaches
From-Scratch Unfocused & Focused, Section 5 | O(N^3) | -
iSAM Unfocused, Section 5 | O(N^1.5) | -
iSAM Focused, Section 5 | O(N^1.5 + N_nz^2 · N + |X^F|^3) | -
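For intuition on the O(N) determinant evaluation used by the iSAM Unfocused baseline in the table, here is a toy numpy sketch; an ordinary Cholesky factor stands in for the incrementally updated Bayes-tree factor R_{k+L}:

```python
import numpy as np

np.random.seed(1)
N = 6
G = np.random.randn(N, N)
Lambda_post = G @ G.T + N * np.eye(N)     # toy posterior information matrix (SPD)
R = np.linalg.cholesky(Lambda_post).T     # upper triangular, Lambda = R^T R
# |Lambda_{k+L}| = prod_i r_ii^2, so the log-determinant needs only the
# diagonal of R: an O(N) operation once the factorization is available.
logdet_fast = 2.0 * np.sum(np.log(np.diag(R)))
```

The value agrees with a direct log-determinant of Λ_{k+L}; the cost of the baseline is therefore dominated by obtaining R_{k+L} itself, not by this final step.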

single sensor deployment corresponding to a unary factor. We consider a non-myopic setting and let each candidate action represent two sensor locations. In the measurement selection problem, we consider a greedy decision-making paradigm in the context of aerial visual SLAM with pairwise factors. Further, we present simulation results of applying our approach to autonomous navigation in unknown environments (both unfocused and focused cases) on synthetic and real-world datasets. The robot has to visit a sequence of goals while minimizing an objective function comprising two terms (to be defined in the sequel): distance to goal, and an uncertainty metric. Candidate actions are non-myopic and involve multiple new and old state variables.

In all cases, the presented simulations reflect the computational performance of the different approaches developed within this paper and of the alternative methods described in Section 5. In Table 5 we summarize the considered approaches for each of the above problems, and refer to the appropriate equations for each case. Moreover, we emphasize that all techniques presented herein, rAMDL and its alternatives, are mathematically identical in the sense that they determine identical optimal actions given a set of candidate actions.

The code is implemented in Matlab; for measurement selection and autonomous navigation we use the GTSAM library (Dellaert, 2012; Kaess et al., 2012). All scenarios were executed on a Linux machine with an i7 2.40 GHz processor and 32 GB of memory.

7.1. Sensor deployment (focused and unfocused)

In this section, we apply our rAMDL approach to the sensor deployment problem, considering both focused and unfocused instantiations of this problem (see Section 4.1 for a detailed formulation). The prior of the sensor field is represented by the information matrix Λ, which is dense, as is usual in the sensor deployment problem.

We compare our rAMDL approach against the batch From-Scratch technique described in Section 5, and also against the Sequential rAMDL described in Section 4.1, which does not require marginal covariance computation at each decision.

While decision making involves evaluating the impact of an action for all candidate actions A, we first analyze the action impact calculation (J_IG(a)) for a single candidate a ∈ A, comparing rAMDL with the From-Scratch approach for the unfocused case. Figure 12 shows these timing results as a function of the state dimension n (Figure 12a) and as a function of the Jacobian A's height m (Figure 12b). As expected, n affects the running time of both the From-Scratch technique and the calculation of Σ_k (the inverse of Λ_k, which is dense in the case of sensor deployment), while m only affects the calculation of the IG objective of rAMDL (red line).

One might think, based on Figures 12a and 12b, that the proposed approach is slower than the From-Scratch alternative because of the time needed for the inverse calculation to obtain Σ_k. Yet, it is exactly here that our calculation re-use paradigm comes into play (see Section 3.4): this calculation is performed only once for all candidate actions A,

Table 5. Considered approaches in the different problems from Section 7, along with their appropriate equations.

Problem | Approach | Equations/Section

Sensor deployment, Section 7.1 | rAMDL Unfocused | Equation (24)
 | rAMDL Focused | Equation (30)
 | Sequential rAMDL | Equations (63), (64)
 | Partitions | Givens rotations & Equation (27)
 | From-Scratch, Unfocused & Focused | Section 5
Measurement selection, Section 7.2 | rAMDL Unfocused | Equation (24)
 | iSAM Unfocused | Section 5
Autonomous navigation, Section 7.3 | rAMDL Unfocused | Equations (37), (38)
 | rAMDL-Extended Unfocused | Equations (39), (41)
 | rAMDL Focused New | Equations (42), (38)
 | rAMDL-Extended Focused New | Equations (44), (41)
 | rAMDL Focused Old | Equations (45), (46)
 | rAMDL-Extended Focused Old | Equations (47), (48)
 | From-Scratch, Unfocused & Focused | Section 5
 | iSAM, Unfocused & Focused | Section 5
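The computational pattern behind the rAMDL Unfocused entries in the table, one inversion of the prior information matrix shared by all candidates and then only a small m × m determinant per candidate, can be sketched as follows. This is a generic illustration of the underlying generalized matrix determinant lemma; equation (24) itself is not reproduced in this excerpt, and all names here are ours:

```python
import numpy as np

np.random.seed(2)
n, m = 50, 4                              # state dim, total dim of a candidate's factors
G = np.random.randn(n, n)
Lambda_k = G @ G.T + n * np.eye(n)        # dense prior information matrix (toy)
Sigma_k = np.linalg.inv(Lambda_k)         # one-time calculation, shared by all candidates

def ig(A, Sigma):
    """Per-candidate evaluation: only an m x m determinant,
    independent of the state dimension n."""
    m = A.shape[0]
    return 0.5 * np.linalg.slogdet(np.eye(m) + A @ Sigma @ A.T)[1]

A_a = np.random.randn(m, n)               # noise-weighted Jacobian of candidate a
# the generalized matrix determinant lemma guarantees this equals
# 0.5 * (ln|Lambda_k + A_a^T A_a| - ln|Lambda_k|), the From-Scratch value
```

Evaluating many candidates then amortizes the single O(n^3) inversion, which is the re-use effect discussed throughout Section 7.1.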

while, given Σ_k, calculating the IG for each action is no longer a function of n.

The substantial reduction in the running time of our approach, compared with the From-Scratch approach, can be clearly seen in Figure 12c, which considers the entire decision-making problem, i.e. evaluation of all candidate actions A. The figure shows the running time for sequential decision making, where at each time instant we choose the best locations of two sensors, with around |A| = 10^5 candidate actions. The number of all sensor locations is n = 625 in this example. Overall, 15 sequential decisions were made. As seen, decision making using our approach requires only about 5 seconds, while the From-Scratch approach requires about 400 seconds.

The Sequential rAMDL technique is not always faster than rAMDL, as can be seen in Figure 12c. As described in Section 4.1, this technique is superior in cases where the covariance calculation makes up a significant part of the whole decision calculation. We can see that this is the case in Figure 12f, where the number of candidates is limited to 100 and the covariance calculation time is the biggest part of the decision making of the rAMDL approach. There, Sequential rAMDL provides better performance than all other alternatives.

We now consider the focused version of the sensor deployment problem (17). In other words, the goal is to find sensor locations that maximally reduce the uncertainty of the chosen focused variables X^F. We have 54 such variables, which are shown in Figure 13c, while the rest of the problem setup remains identical to the unfocused case.

In Figure 13 we show the corresponding results of rAMDL, compared with From-Scratch. The latter first calculates, for each candidate action, the posterior Λ^+ = Λ + A^T A, followed by calculation of the Schur complement Λ^{M,F} of the focused set X^F and of its determinant |Λ^{M,F}|, in order to obtain J_H(a) (17). We also compare with an additional approach, termed Partitions, which uses Givens rotations to compute R^+ and, instead of performing the Schur complement, calculates the posterior entropy of the focused set via (27). This equation is one of our main contributions, being an essential step in the derivation of our approach, and we show here that, compared with the From-Scratch technique, the Partitions approach is considerably faster. Our focused approach applies the matrix determinant lemma, transforming (27) into (30), which, together with the re-use concept (Section 3.4), makes it possible to drastically reduce the running time, as shown in Figure 13a (10 seconds versus about 1000 seconds for Partitions and 1300 seconds for From-Scratch).

7.2. Measurement selection in SLAM

In this section, we consider a measurement selection problem (see Section 3.5) within a visual aerial SLAM framework, where one has to choose the most informative image feature observations from the numerous image features typically calculated for each incoming new image. We demonstrate the application of our approach in this problem, which, in contrast to the sensor selection problem, involves pairwise factors of the type p(z_{i,j} | x_i, l_j), relating an image observation z_{i,j}, a camera pose x_i, and a landmark l_j.

A top view of the considered aerial scenario is shown in Figure 14a: an aerial vehicle performs visual SLAM, mapping the environment and at the same time localizing itself. The figure shows the landmarks and the estimated trajectory, along with the uncertainty covariance for each time instant. One can clearly see the impact of loop-closure observations on the latter. In the considered scenario there are about 25,000 landmarks and roughly 500 image features in each view.

The number of image features that correspond to previously seen landmarks is relatively small (around 30–50, see Figure 14b), which corresponds to a much smaller set of

Fig. 12. Unfocused sensor deployment scenario. Running time, for simplicity termed "Time" within graphs, for calculating the impact of a single action as a function of state dimension n (a) and as a function of Jacobian A's height m (b). In (a) m = 2 and in (b) n = 625. rAMDL Unfocused Objective represents only the calculation time of candidates' impacts (IG objective for all actions), without the one-time calculation of the prior covariance; Covariance Inverse represents the time it took to calculate the covariance matrix Σ_k from the dense information matrix Λ_k, Σ_k = Λ_k^{−1}. (c) Running time for sequential decision making, i.e. evaluating the impact of all candidate actions, each representing candidate locations of two sensors. (d) Prior and final uncertainty of the field, with red dots marking selected locations; note the locations with negligible uncertainty are the observed locations where sensors were deployed. (e) Number of action candidates per decision. (f) Running time for sequential decision making, with the number of candidates limited to 100.

actions A compared with the sensor deployment problem (Section 7.1), where the cardinality of A was huge (10^5). Such a dataset was chosen on purpose in order to show the behavior of the proposed algorithm in domains with a small number of candidates. In addition, in this scenario the actions are myopic, since the measurements are greedily selected.

Moreover, as opposed to the sensor deployment problem, in the current problem the state dimensionality n grows with time as more poses and landmarks are added into the inference (see Figure 14c), and the information matrix is sparse.

Figure 14d shows the timing results for choosing the 10 most informative image observations, comparing the proposed

Fig. 13. Focused sensor deployment scenario. (a) Overall time it took to make a decision with different approaches; rAMDL Focused Objective represents only the calculation of the candidates' impacts (IG objective for all actions), while rAMDL Focused represents both the one-time calculation of the prior covariance Σ_k and the candidates' evaluation. (b) Final uncertainty of the field, with red dots marking selected locations; note that locations with negligible uncertainty are the observed locations where sensors were deployed. (c) Focused set of variables (green circles) and locations selected by the algorithm (red dots). (d) Overall system entropy (top) and entropy of the focused set (bottom) after each decision, with the blue line representing the unfocused algorithm and the red line representing the focused algorithm. Note that all unfocused methods make exactly the same decisions, differing only in their runtime complexity. The same is also true for all focused methods.

rAMDL with the iSAM approach (computing the posterior square root information matrix using iSAM, and then calculating the determinant; see Section 5). This BSP problem is solved sequentially, each time a new image is acquired. As seen, our approach rAMDL is substantially faster than iSAM, while providing identical results (the same decisions). In particular, the running time of the iSAM approach for the last time index, with state dimensionality n = 10,000, is around 7 seconds. In contrast, rAMDL takes about 0.05 seconds: the calculation time of action impacts via calculation re-use is negligible (red line), while the one-time calculation of the marginal covariance Σ_k^{M,X_All} (yellow line) is performed efficiently, in the current implementation, via sparse factorization techniques using GTSAM (Dellaert, 2012; Kaess et al., 2012).

7.3. Autonomous navigation in an unknown environment

In this section we present simulation results of applying our approach to autonomous navigation in unknown environments (both unfocused and focused cases) on synthetic and real-world datasets.

In the synthetic scenario (Figure 15c), the robot's task is to visit a predefined set of goals G = {G_1, . . . , G_14} in an unknown environment while reducing an uncertainty metric. More specifically, the state vector X_k contains all robot poses and landmarks mapped until time t_k (see Section 4.2). At each point of time, the robot autonomously selects an optimal non-myopic action a = u_{k:k+L−1}, performs its first control u_k, and subsequently observes landmarks within a radius of 900 meters from its new position. The

Fig. 14. Measurement selection scenario. (a) Simulated trajectory of a robot; black dots are the landmarks, blue marks and surrounding ellipses are the estimated trajectory along with the uncertainty covariance for each time instant, and the red mark is the robot's initial pose. (b) Number of measurement candidates per decision. (c) State dimension n per decision. (d) Overall time it took to evaluate the impacts of all poses' measurements, with different approaches; rAMDL Unfocused Objective represents only the calculation of the candidates' impacts (IG objective for all actions), while rAMDL Unfocused represents both the one-time calculation of the marginal covariance Σ_k^{M,X_All} and the candidates' evaluation.

landmarks can be either old (seen before) or new (seen for the first time). Next, a SLAM solution is calculated given these new observations and a motion model. To that end, the factor graph from the previous inference time is updated with the new observation and motion model factors, and new variable nodes, representing the current robot pose and new landmarks, are added (see Section 4.2). Afterwards, the next action is chosen and executed, and so on.

The set of candidate actions A contains one action that navigates the robot from its current pose x_k to the current goal G_i from a predefined set G (see Figure 15c); it also contains a set of "loop-closure" actions that are generated in the following way. We start by taking all mapped landmarks within a radius of 1000 meters from the robot's current pose. We cluster these landmarks, similarly to Kim and Eustice (2014), and obtain a set of landmark clusters. Each cluster's center g_cl represents a "loop-closure" goal and contributes a "loop-closure" action a_cl = u_{k:k+L−1} that navigates the robot from x_k to g_cl.

Each action in A, taking the robot from x_k to a location g, is constructed by first discretizing the map into a grid and thereafter searching for an optimal trajectory from the current position to g using an A* search algorithm, similarly to Kim and Eustice (2014) and Indelman et al. (2015). The optimal candidate action is chosen by evaluating an objective that has the following two terms: distance to the current goal G_i and a term of uncertainty:

$$J(a) = d(x_{k+L}, G_i) + J^{F}_{H/IG}(a). \tag{84}$$

In the scenarios from Figures 15, 16, 17, and 19 we consider as the term of uncertainty the entropy J^F_H(a) of the last pose x_{k+L} in the planning segment (Section 3.3.3.1), while in the scenario from Figure 18 we instead use the IG of the landmarks mapped until now, J^F_IG(a) (Section 3.3.3.2). Note that the running time presented in the figures refers

Fig. 15. Focused BSP scenario with focused robot's last pose. (a) Dimensions of the BSP problem (state dimension n, average number of new factor terms m, average number of new variables n′, average number of old involved variables l) at each time. (b) Number of action candidates at each time. (c) Final robot trajectory. Blue dots are mapped landmarks, the red line with small ellipses is the estimated trajectory with pose covariances, the blue line is the real trajectory, and the red pluses with numbers beside them are the robot's goals. The green mark is the robot's start position. (d) Enlarged view of the robot's trajectory near goal 12.

only to the uncertainty term, since it is the focus of this paper and because the calculation complexity of the first term (the Euclidean distance d(x_{k+L}, G_i)) is relatively insignificant. As can be seen from the above, we consider a non-myopic setting and let each candidate action represent trajectories of various lengths. Limiting the clustering process to a specific radius is done in order to bound the horizon lag of candidate actions.

In parallel, in the scenarios from Figures 16 and 17, an unfocused uncertainty objective J_IG(a) is calculated (Section 3.3.2), mainly for the purpose of performance comparison between the focused and unfocused cases. The robot's motion is controlled only by the focused objective function.

Four techniques were applied to solve the planning problem: the more common techniques From-Scratch and iSAM (Section 5), and the proposed techniques, our general approach rAMDL and its extension rAMDL-Extended, which exploits the Jacobian inner structure from (34) (see Table 5, and Sections 3.3.2 and 3.3.3.1). The calculated values of the objective function were numerically compared to validate that all four approaches calculate exactly the same metric, thus yielding the same decisions and differing only in running time.

In Figures 16 and 17 it can be clearly seen that, while iSAM is faster than From-Scratch, the running time of both techniques increases with state dimensionality, as was mentioned previously. On the other hand, the running time of the rAMDL approach is shown to be bounded, due to the horizon lag of all candidate actions being limited (see Figure 15a). The number of candidate actions in our scenario is around 20 at each planning phase (Figure 15b). Even with such a relatively small candidate set, the rAMDL approach is faster than its alternatives iSAM and From-Scratch, while the rAMDL-Extended approach is the fastest of all. This trend appears to hold for both the focused and unfocused objective functions, though for the latter, iSAM comes very close to the rAMDL technique.

When comparing the running times of From-Scratch and iSAM for the focused and unfocused objective functions, it is easy to see that the unfocused case is evaluated much faster. The reason for this is that the focused calculations contain computation of the marginal covariance of

Fig. 16. Focused BSP scenario with focused robot's last pose. (a) Running time of planning, i.e. evaluating the impact of all candidate actions, each representing a possible trajectory. Results are shown both for the focused and unfocused cases. (b) Enlarged view of the fastest approaches from (a). (c) Focused approaches from (b). Note that iSAM Focused is not depicted because, as seen in (a), it is much slower compared with the other focused techniques. (d) Unfocused approaches from (b). The lowest line, labeled Marginal Cov, represents the time it took to calculate the prior marginal covariance Σ_k^{M,X_All} in the rAMDL approach (see Section 3.4). As can be seen, while the rAMDL technique (Unfocused and Focused) is faster than From-Scratch and iSAM, the rAMDL-Extended technique gives even better performance. Further, it is interesting to note that the performance of Unfocused and Focused rAMDL is almost the same, as is the performance of Unfocused and Focused rAMDL-Extended.

the focused variable (the last pose x_{k+L}) for each candidate action, which requires marginalization over the posterior information matrix Λ_{k+L}. Although this can be performed efficiently by exploiting the sparsity of the matrix Λ_{k+L} (Kaess and Dellaert, 2009), the time complexity is significantly affected by the variable elimination ordering of the iSAM algorithm (Kaess et al., 2012). While in our simulation we did not modify the default ordering of iSAM (the COLAMD heuristic), different ordering strategies can be a point for future investigation.

In contrast, for the rAMDL approach both the unfocused and focused objective functions (equations (37) and (42)) have a similar complexity, which is supported by the shown times. The same is correct for the rAMDL-Extended approach (equations (39) and (44)).

Next, we repeated our autonomous navigation scenario, but this time X^F_{k+L} contained only the landmarks seen by time k (see Figure 18). The IG of such a focused set X^F_{k+L} can be used as an objective function, for example, in the case when we want to improve 3D reconstruction quality. As can be seen in Figure 18, this focused set causes both the From-Scratch and iSAM techniques to be much slower compared with their performance in the first scenario, where X^F_{k+L} contained only x_{k+L}. The reason for this is that X^F_{k+L}'s

Fig. 17. Focused BSP scenario with focused robot's last pose. Running times from Figure 16 normalized by the number of candidates.

dimension is much higher here, representing the dimensions of all landmarks, and computation of its marginal covariance is significantly more expensive. In contrast, the performance of rAMDL has barely changed, thanks to the re-use of calculations (see Section 3.4). Moreover, rAMDL-Extended performs even better than rAMDL, with candidate action impact evaluation being insignificant compared with the one-time calculation of the marginal covariance, as can be seen in Figures 18e and 18f.

We also performed a hybrid simulation where part of the real-world Victoria Park dataset (Guivant et al., 2012) was used for offline planning (see Figure 19). At each timestep we collected candidate actions by clustering the landmarks seen until that time, just as was done in the first simulation. Further, we considered a focused objective function for each candidate, with X^F_{k+L} containing only x_{k+L}. After evaluating all candidates, the robot was moved to the next pose according to the dataset. Recalling that our main contribution is to reduce time complexity, such an evaluation allowed us to compare the time performance of all of the considered techniques, despite not actually using the calculated actions in the hybrid simulation. As can be seen, here rAMDL and rAMDL-Extended also outperform both of the alternatives, From-Scratch and iSAM, keeping the same trends that were observed in the previous simulations.

8. Conclusions

We have developed a computationally efficient and exact approach for non-myopic focused and unfocused BSP, in both augmented and non-augmented settings, in high-dimensional state spaces. As a key contribution, we have developed an augmented version of the well-known general matrix determinant lemma and used both of them

Fig. 18. Focused BSP scenario with focused landmarks. (a) Number of action candidates at each time. (b) Final robot trajectory. (c) Running time of planning, i.e. evaluating the impact of all candidate actions, each representing a possible trajectory. (d) Running time from (c) normalized by the number of candidates. (e) Enlarged view of the fastest approaches from (c). (f) Enlarged view of the fastest approaches from (d). The lowest line, labeled Marginal Cov, represents the time it took to calculate the prior marginal covariance Σ_k^{M,X_All} in the rAMDL approach (see Section 3.4).

to efficiently evaluate the impact of each candidate action on the posterior entropy, without explicitly calculating the posterior information (or covariance) matrices. The second ingredient of our approach is the re-use of calculations, which exploits the fact that many calculations are shared among different candidate actions. Our approach drastically reduces running time compared with the state of the art, especially when the set of candidate actions is large, with the running time being independent of the state dimensionality, which increases over time in many BSP domains. The approach

Fig. 19. Focused BSP scenario with focused robot's last pose, using the Victoria Park dataset. (a) Number of action candidates at each time. (b) Final robot trajectory. (c) Running time of planning, i.e. evaluating the impact of all candidate actions, each representing a possible trajectory. (d) Running time from (c) normalized by the number of candidates. (e) Enlarged view of the fastest approaches from (c). (f) Enlarged view of the fastest approaches from (d). The lowest line, labeled Marginal Cov, represents the time it took to calculate the prior marginal covariance Σ_k^{M,X_All} in the rAMDL approach (see Section 3.4).

has been examined in three problems: sensor deployment, measurement selection in visual SLAM, and autonomous navigation in unknown environments, using both simulated and real-world datasets, exhibiting in each case superior performance compared with the state of the art, and reducing running time by several orders of magnitude (e.g. 5 versus 400 seconds in sensor deployment).

Funding

This work was supported by the Israel Science Foundation.

References

Agha-Mohammadi AA, Chakravorty S and Amato NM (2014) FIRM: Sampling-based feedback motion planning under motion uncertainty and imperfect measurements. The International Journal of Robotics Research 33(2): 268–304.
Bai S, Wang J, Chen F and Englot B (2016) Information-theoretic exploration with Bayesian optimization. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, pp. 1816–1822.
Bai Z, Fahey G and Golub G (1996) Some large-scale matrix computation problems. Journal of Computational and Applied Mathematics 74(1): 71–89.
Carlevaris-Bianco N, Kaess M and Eustice RM (2014) Generic node removal for factor-graph SLAM. IEEE Transactions on Robotics 30(6): 1371–1385.
Carlone L, Censi A and Dellaert F (2014) Selecting good measurements via L1 relaxation: A convex approach for robust estimation over graphs. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, pp. 2667–2674.
Chaves SM, Walls JM, Galceran E and Eustice RM (2015) Risk aversion in belief-space planning under measurement acquisition uncertainty. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, pp. 2079–2086.
Chli M and Davison AJ (2009) Active matching for visual tracking. Robotics and Autonomous Systems 57(12): 1173–1187.
Davis T, Gilbert J, Larimore S and Ng E (2004) A column approximate minimum degree ordering algorithm. ACM Transactions on Mathematical Software 30(3): 353–376.
Davison A (2005) Active search for real-time vision. In: International Conference on Computer Vision (ICCV), pp. 66–73.
Dellaert F (2012) Factor graphs and GTSAM: A hands-on introduction. Technical Report GT-RIM-CP&R-2012-002, Georgia Institute of Technology.
Golub G and Plemmons R (1980) Large-scale geodetic least-squares adjustment by dissection and orthogonal decomposition. Linear Algebra and Its Applications 34: 3–28.
Guivant J, Nieto J and Nebot E (2012) Victoria park dataset. Available at: http://www-personal.acfr.usyd.edu.au/nebot/victoria_park.htm
Harville DA (1998) Matrix algebra from a statistician's perspective. Technometrics 40(2): 164–164.
Huang G, Kaess M and Leonard J (2012) Consistent sparsification for graph optimization. In: Proceedings of the European Conference on Mobile Robots (ECMR), pp. 150–157.
Huang S, Kwok N, Dissanayake G, Ha Q and Fang G (2005) Multi-step look-ahead trajectory planning in SLAM: Possibility and necessity. In: IEEE International Conference on Robotics and Automation (ICRA). IEEE, pp. 1091–1096.
Ila V, Polok L, Solony M, Smrz P and Zemcik P (2015) Fast covariance recovery in incremental nonlinear least square solvers. In: IEEE International Conference on Robotics and Automation (ICRA). IEEE, pp. 4636–4643.
Ila V, Porta JM and Andrade-Cetto J (2010) Information-based compact Pose SLAM. IEEE Transactions on Robotics 26(1): 78–93.
Indelman V (2015a) Towards cooperative multi-robot belief space planning in unknown environments. In: Proceedings of the International Symposium on Robotics Research (ISRR).
Indelman V (2015b) Towards information-theoretic decision making in a conservative information space. In: American Control Conference, pp. 2420–2426.
Indelman V (2016) No correlations involved: Decision making under uncertainty in a conservative sparse information space. IEEE Robotics and Automation Letters 1(1): 407–414.
Indelman V, Carlone L and Dellaert F (2015) Planning in the continuous domain: a generalized belief space approach for autonomous navigation in unknown environments. The International Journal of Robotics Research 34(7): 849–882.
Kaelbling LP, Littman ML and Cassandra AR (1998) Planning and acting in partially observable stochastic domains. Artificial Intelligence 101(1): 99–134.
Kaess M and Dellaert F (2009) Covariance recovery from a square root information matrix for data association. Robotics and Autonomous Systems 57(12): 1198–1210.
Kaess M, Johannsson H, Roberts R, Ila V, Leonard J and Dellaert F (2012) iSAM2: Incremental smoothing and mapping using the Bayes tree. The International Journal of Robotics Research 31: 217–236.
Kaess M, Ranganathan A and Dellaert F (2008) iSAM: Incremental smoothing and mapping. IEEE Transactions on Robotics 24(6): 1365–1378.
Kim A and Eustice RM (2014) Active visual SLAM for robotic area coverage: Theory and experiment. The International Journal of Robotics Research 34(4–5): 457–475.
Kopitkov D and Indelman V (2016) Computationally efficient decision making under uncertainty in high-dimensional state spaces. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, pp. 1793–1800.
Kopitkov D and Indelman V (2017) Computationally efficient belief space planning via augmented matrix determinant lemma and re-use of calculations. IEEE International Conference on Robotics and Automation (ICRA) and IEEE Robotics and Automation Letters (RA-L), Mutual Submission.
Krause A, Singh A and Guestrin C (2008) Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies. Journal of Machine Learning Research 9: 235–284.
Kretzschmar H and Stachniss C (2012) Information-theoretic compression of pose graphs for laser-based SLAM. The International Journal of Robotics Research 31(11): 1219–1230.
Kschischang F, Frey B and Loeliger HA (2001) Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory 47(2): 498–519.
Levine D and How JP (2013) Sensor selection in high-dimensional Gaussian trees with nuisances. In: Advances in Neural Information Processing Systems (NIPS), pp. 2211–2219.
Mazuran M, Tipaldi GD, Spinello L and Burgard W (2014) Nonlinear graph sparsification for SLAM. In: Robotics: Science and Systems (RSS), pp. 1–8.
Mu B, Agha-mohammadi Aa, Paull L, Graham M, How J and Leonard J (2015) Two-stage focused inference for resource-constrained collision-free navigation. In: Robotics: Science and Systems (RSS).
Ouellette DV (1981) Schur complements and statistics. Linear Algebra and Its Applications 36: 187–295.
Patil S, Kahn G, Laskey M, Schulman J, Goldberg K and Abbeel P (2014) Scaling up Gaussian belief space planning through covariance-free trajectory optimization and automatic differentiation. In: International Workshop on the Algorithmic Foundations of Robotics (WAFR), pp. 515–533.
Paull L, Huang G and Leonard JJ (2016) A unified resource-constrained framework for graph SLAM. In: IEEE International Conference on Robotics and Automation (ICRA). IEEE, pp. 1–8.
Pineau J, Gordon GJ and Thrun S (2006) Anytime point-based approximations for large POMDPs. Journal of Artificial Intelligence Research 27: 335–380.
Platt R, Tedrake R, Kaelbling L and Lozano-Pérez T (2010) Belief space planning assuming maximum likelihood observations. In: Robotics: Science and Systems (RSS), Zaragoza, Spain, pp. 587–593.
Prentice S and Roy N (2009) The belief roadmap: Efficient planning in belief space by factoring the covariance. The International Journal of Robotics Research 28(11–12): 1448–1465.
Stachniss C, Grisetti G and Burgard W (2005) Information gain-based exploration using Rao–Blackwellized particle filters. In: Robotics: Science and Systems (RSS), pp. 65–72.
Valencia R, Morta M, Andrade-Cetto J and Porta J (2013) Planning reliable paths with pose SLAM. IEEE Transactions on Robotics 29(4): 1050–1059.
Van Den Berg J, Patil S and Alterovitz R (2012) Motion planning under uncertainty using iterative local optimization in belief space. The International Journal of Robotics Research 31(11): 1263–1278.
Vial J, Durrant-Whyte H and Bailey T (2011) Conservative sparsification for efficient and consistent approximate estimation. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, pp. 886–893.
Walls JM, Chaves SM, Galceran E and Eustice RM (2015) Belief space planning for underwater cooperative localization. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, pp. 2264–2271.
Zhu Z and Stein ML (2006) Spatial sampling design for prediction with estimated parameters. Journal of Agricultural, Biological, and Environmental Statistics 11(1): 24–44.
Zimmerman DL (2006) Optimal network design for spatial prediction, covariance parameter estimation, and empirical prediction. Environmetrics 17(6): 635–652.

Appendix

A.1. Proof of Lemma 1

Problem definition: given a positive-definite and symmetric matrix Λ ∈ R^{n×n} (e.g. a prior information matrix) and its inverse Σ (the prior covariance matrix), first Λ is augmented by k zero rows and columns and the result is stored in Λ^Aug. Then we take a matrix A ∈ R^{m×(n+k)} and calculate Λ⁺ = Λ^Aug + AᵀA (see Figure 1). We would like to express the determinant of Λ⁺ in terms of Λ and Σ.

We start by modeling the matrix Λ^Aug through Σ. By introducing k new variables, before adding any new constraints involving these variables, we can say that the new variables are uncorrelated with the old variables and that their uncertainty is infinite (nothing yet is known about them). The appropriate covariance matrix after augmentation, Σ^Aug, can then be created by adding k zero rows and columns to Σ and setting the new diagonal entries with a parameter θ, noting that θ → ∞:

$$\Sigma^{Aug} = \begin{bmatrix} \Sigma & 0 \\ 0 & \theta \cdot I \end{bmatrix}. \tag{85}$$

Next, note that the inverse of Σ^Aug is given by the following expression:

$$(\Sigma^{Aug})^{-1} = \begin{bmatrix} \Lambda & 0 \\ 0 & \epsilon \cdot I \end{bmatrix}, \tag{86}$$

where ε ≐ 1/θ. Taking the limit ε → 0 into account, we can see that the above expression converges to Λ^Aug as was defined above. Then, in the limit, we have that (Λ^Aug)⁻¹ = Σ^Aug. In addition, note that although ε → 0, it never actually becomes zero, ε ≠ 0; thus, if needed, we can divide by ε without worry.

Taking into account the limit of ε, expressing Λ^Aug through (86) will not change the problem definition. However, such a model allows us to take the inverse of Λ^Aug:

$$(\Lambda^{Aug})^{-1} = \Sigma^{Aug} = \begin{bmatrix} \Sigma & 0 \\ 0 & \theta \cdot I \end{bmatrix}, \tag{87}$$

and therefore to use the generalized matrix determinant lemma (Harville, 1998):

$$\left|\Lambda^{+}\right| = \left|\Lambda^{Aug}\right| \cdot \left|I_m + A\,\Sigma^{Aug} A^{T}\right| = \left|\Lambda\right| \cdot \epsilon^{k} \cdot \left|I_m + A_{old}\,\Sigma\,A_{old}^{T} + \theta \cdot A_{new} A_{new}^{T}\right|, \tag{88}$$

where the matrices A_old ∈ R^{m×n} and A_new ∈ R^{m×k} are constructed from A by retrieving the columns of only the n old variables and of only the k new variables, respectively (see Figure 7).

Using the matrix determinant lemma once more, we obtain

$$\left|\Lambda^{+}\right| = \left|\Lambda\right| \cdot \epsilon^{k} \cdot \left|\Delta\right| \cdot \left|I_k + \theta \cdot A_{new}^{T} \Delta^{-1} A_{new}\right|, \tag{89}$$

where Δ ≐ I_m + A_old Σ A_oldᵀ. Moving ε inside the last determinant term, we have

$$\left|\Lambda^{+}\right| = \left|\Lambda\right| \cdot \left|\Delta\right| \cdot \left|\epsilon \cdot I_k + \epsilon\,\theta \cdot A_{new}^{T} \Delta^{-1} A_{new}\right|. \tag{90}$$

Recalling that ε → 0 and ε · θ = 1, we arrive at

$$\left|\Lambda^{+}\right| = \left|\Lambda\right| \cdot \left|\Delta\right| \cdot \left|A_{new}^{T} \Delta^{-1} A_{new}\right|. \tag{91}$$

The augmented determinant ratio will then be

$$\frac{\left|\Lambda^{+}\right|}{\left|\Lambda\right|} = \left|I_m + A_{old}\,\Sigma\,A_{old}^{T}\right| \cdot \left|A_{new}^{T}\left(I_m + A_{old}\,\Sigma\,A_{old}^{T}\right)^{-1} A_{new}\right| = \left|\Delta\right| \cdot \left|A_{new}^{T} \Delta^{-1} A_{new}\right|. \tag{92}$$

∎
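The resulting identity (91)/(92) is easy to check numerically. Below is a small self-contained numpy sketch; the dimensions and variable names are illustrative, not part of the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, m = 20, 3, 6                  # old state dim, new variables, factor rows (m >= k)

L0 = rng.standard_normal((n, n))
Lam = L0 @ L0.T + n * np.eye(n)     # prior information matrix Λ (positive definite)
Sig = np.linalg.inv(Lam)            # prior covariance Σ

# Augment Λ with k zero rows/columns, then add new factors: Λ⁺ = Λ^Aug + AᵀA.
A = rng.standard_normal((m, n + k))
A_old, A_new = A[:, :n], A[:, n:]
Lam_aug = np.zeros((n + k, n + k))
Lam_aug[:n, :n] = Lam
Lam_plus = Lam_aug + A.T @ A

# Augmented matrix determinant lemma (91): |Λ⁺| = |Λ|·|Δ|·|A_newᵀ Δ⁻¹ A_new|,
# with Δ = I_m + A_old Σ A_oldᵀ.
Delta = np.eye(m) + A_old @ Sig @ A_old.T
lhs = np.linalg.slogdet(Lam_plus)[1]
rhs = (np.linalg.slogdet(Lam)[1] + np.linalg.slogdet(Delta)[1]
       + np.linalg.slogdet(A_new.T @ np.linalg.solve(Delta, A_new))[1])
assert np.isclose(lhs, rhs)
```

The point of the lemma is visible in the shapes: the right-hand side touches only the m×m matrix Δ and a k×k determinant, never the (n+k)-dimensional posterior information matrix.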

A.2. Proof of Lemma 2

For A's structure given in (34), Δ from (32) will be

$$\Delta = I_m + \begin{bmatrix} B_{old} \\ 0 \end{bmatrix} \cdot \Sigma \cdot \begin{bmatrix} B_{old}^{T} & 0 \end{bmatrix} = \begin{bmatrix} \Delta_1 & 0 \\ 0 & I_{m_{new}} \end{bmatrix}, \tag{93}$$

where Δ₁ = I_{m_conn} + B_old · Σ · B_oldᵀ, m_conn = M(F^conn(a)), and m_new = M(F^new(a)). Then we can conclude that

$$\left|\Delta\right| = \left|\Delta_1\right| \tag{94}$$

and that

$$\Delta^{-1} = \begin{bmatrix} \Delta_1^{-1} & 0 \\ 0 & I_{m_{new}} \end{bmatrix}. \tag{95}$$

Now, by exploiting the structure of A_new, we obtain

$$A_{new}^{T}\,\Delta^{-1} A_{new} = \begin{bmatrix} B_{new}^{T} & D_{new}^{T} \end{bmatrix} \cdot \begin{bmatrix} \Delta_1^{-1} & 0 \\ 0 & I_{m_{new}} \end{bmatrix} \cdot \begin{bmatrix} B_{new} \\ D_{new} \end{bmatrix} = B_{new}^{T}\,\Delta_1^{-1} B_{new} + D_{new}^{T} D_{new}. \tag{96}$$

Then we can conclude that the augmented determinant lemma becomes

$$\frac{\left|\Lambda^{+}\right|}{\left|\Lambda\right|} = \left|\Delta_1\right| \cdot \left|B_{new}^{T}\,\Delta_1^{-1} B_{new} + D_{new}^{T} D_{new}\right|. \tag{97}$$

∎

A.3. Proof of Lemma 3

Consider the scenario of focused augmented BSP where the focused set X^F_{k+L} contains only newly added variables, as defined in Section 3.3.3.1, with an appropriate illustration shown in Figure 8.

First, let us give an overview of the various partitions of the Jacobian A that are relevant to our current problem (Figure 8). Here A_old, A_new, ᴵA_old, and ¬ᴵA_old have been introduced in previous sections. Further, we can partition A_new into A^F_new, the columns of new variables that are focused, X^F_new ≡ X^F_{k+L} ∈ R^{n_F}, and A^U_new, the columns of new unfocused variables X^U_new. Considering the figure, the set of all unfocused variables in X_{k+L} will be X^R_{k+L} ≐ {X_old ∪ X^U_new} ∈ R^{n_R}, such that N = n_F + n_R, providing another partition of A, A_R = [A_old, A^U_new].

Next, we partition the posterior information matrix Λ_{k+L} with respect to the sets X^F_{k+L} and X^R_{k+L} defined above as

$$\Lambda_{k+L} = \begin{bmatrix} \Lambda^{R}_{k+L} & \Lambda^{R,F}_{k+L} \\ (\Lambda^{R,F}_{k+L})^{T} & \Lambda^{F}_{k+L} \end{bmatrix}. \tag{98}$$

As was shown in (26), the determinant of the marginal covariance of X^F_{k+L} can be calculated through

$$\left|\Sigma^{M,F}_{k+L}\right| = \frac{\left|\Lambda^{R}_{k+L}\right|}{\left|\Lambda_{k+L}\right|}. \tag{99}$$

Now let us focus on the |Λ^R_{k+L}| term from the right-hand side. From (10) we can see that the partition Λ^R_{k+L} of the posterior information matrix can be calculated as

$$\Lambda^{R}_{k+L} = \Lambda^{Aug,R}_{k} + A_{R}^{T} A_{R}, \tag{100}$$

where Λ^{Aug,R}_k can be constructed by augmenting Λ_k with zero rows and columns in the number of X^U_new's dimension (see Figure 8). The above equation has an augmented determinant form as defined in Section 3.3.1, and so the augmented determinant lemma can be applied to it. Using (32) we have

$$\frac{\left|\Lambda^{R}_{k+L}\right|}{\left|\Lambda_{k}\right|} = \left|C\right| \cdot \left|(A^{U}_{new})^{T} C^{-1} A^{U}_{new}\right|, \tag{101}$$

where C is defined in (38). Next, dividing (101) by (36), we obtain

$$\left|\Sigma^{M,F}_{k+L}\right| = \frac{\left|\Lambda^{R}_{k+L}\right|}{\left|\Lambda_{k+L}\right|} = \frac{\left|\Lambda^{R}_{k+L}\right|}{\left|\Lambda_{k}\right|} \cdot \frac{\left|\Lambda_{k}\right|}{\left|\Lambda_{k+L}\right|} = \frac{\left|(A^{U}_{new})^{T} C^{-1} A^{U}_{new}\right|}{\left|A_{new}^{T} C^{-1} A_{new}\right|}, \tag{102}$$

and the posterior entropy of X^F_{k+L} is given by

$$J^{F}_{H}(a) = \frac{n_F \cdot \gamma}{2} + \frac{1}{2} \ln \left|(A^{U}_{new})^{T} C^{-1} A^{U}_{new}\right| - \frac{1}{2} \ln \left|A_{new}^{T} C^{-1} A_{new}\right|. \tag{103}$$

Note that the variables inside the information matrices do not have to be ordered in any particular way, and the proof provided above is correct for any ordering whatsoever. ∎

A.4. Proof of Lemma 4

For A's structure given in (34), the term A_newᵀ C⁻¹ A_new from (103), similarly to (96), will be

$$A_{new}^{T} C^{-1} A_{new} = B_{new}^{T} C_1^{-1} B_{new} + D_{new}^{T} D_{new}, \tag{104}$$

where C₁ is defined in (41). In the same way, we can conclude (see Figure 8) that

$$(A^{U}_{new})^{T} C^{-1} A^{U}_{new} = (B^{U}_{new})^{T} C_1^{-1} B^{U}_{new} + (D^{U}_{new})^{T} D^{U}_{new}. \tag{105}$$

Therefore, the posterior entropy of X^F_{k+L} from (103) is given by

$$J^{F}_{H}(a) = \frac{n_F \cdot \gamma}{2} + \frac{1}{2} \ln \left|(B^{U}_{new})^{T} C_1^{-1} B^{U}_{new} + \Lambda^{U|F}_{a}\right| - \frac{1}{2} \ln \left|B_{new}^{T} C_1^{-1} B_{new} + \Lambda_{a}\right|, \tag{106}$$

where Λ_a = D_newᵀ D_new is the information matrix of an action's factor graph G(a), and where Λ^{U|F}_a = (D^U_new)ᵀ D^U_new is the information matrix of the variables X^U_new conditioned on X^F_new ≡ X^F_{k+L} and calculated from the distribution represented by G(a).

Note that the variables inside the information matrices do not have to be ordered in any particular way, and the proof provided above is correct for any ordering whatsoever. ∎
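Lemma 2's payoff is that, under the structured Jacobian of (34), only the m_conn × m_conn block Δ₁ is ever inverted, while the m_new rows involving only new variables enter through D_newᵀD_new. A numpy sketch checking (97) against the direct augmented determinant ratio (dimensions and names illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 15, 3                         # old state dim, newly added variables
m_conn, m_new = 5, 4                 # factor rows touching old vars / only new vars

L0 = rng.standard_normal((n, n))
Lam = L0 @ L0.T + n * np.eye(n)      # prior information Λ
Sig = np.linalg.inv(Lam)             # prior covariance Σ

# Jacobian structured as in (34): A = [[B_old, B_new], [0, D_new]].
B_old = rng.standard_normal((m_conn, n))
B_new = rng.standard_normal((m_conn, k))
D_new = rng.standard_normal((m_new, k))
A = np.block([[B_old, B_new], [np.zeros((m_new, n)), D_new]])

# Direct augmented determinant ratio |Λ⁺| / |Λ| (what the lemma avoids computing).
Lam_aug = np.zeros((n + k, n + k))
Lam_aug[:n, :n] = Lam
ratio_direct = np.linalg.slogdet(Lam_aug + A.T @ A)[1] - np.linalg.slogdet(Lam)[1]

# Lemma 2, equation (97): only Δ₁ = I + B_old Σ B_oldᵀ is inverted.
Delta1 = np.eye(m_conn) + B_old @ Sig @ B_old.T
ratio_lemma = (np.linalg.slogdet(Delta1)[1]
               + np.linalg.slogdet(B_new.T @ np.linalg.solve(Delta1, B_new)
                                   + D_new.T @ D_new)[1])
assert np.isclose(ratio_direct, ratio_lemma)
```

Since motion and new-landmark priors typically touch only new variables, m_conn is usually much smaller than m, which is where the structured form saves work per candidate.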

A.5. Proof of Lemma 5

Consider the scenario of focused augmented BSP where the focused set $X_{k+L}^{F}$ contains only old variables, with the appropriate illustration shown in Figure 9 and with the various partitions of the Jacobian $A$ defined in Section 3.3.3.2.

First, let us look again at the relevant partitions of the Jacobian $A$ (Figure 9). The partitions $A_{old}$, $A_{new}$, ${}^{I}A_{old}$, and ${}^{\neg I}A_{old}$ were already introduced in previous sections. From the figure we can see that ${}^{\neg I}A_{old}$ can be further separated into ${}^{\neg I}A_{old}^{U}$, the columns of old variables that are both not involved and unfocused (${}^{\neg I}X_{old}^{U}$), and ${}^{\neg I}A_{old}^{F}$, the columns of old variables that are both not involved and focused (${}^{\neg I}X_{old}^{F}$). In addition, ${}^{I}A_{old}$ can be partitioned into ${}^{I}A_{old}^{U}$, the columns of old variables that are both involved and unfocused (${}^{I}X_{old}^{U}$), and ${}^{I}A_{old}^{F}$, the columns of old variables that are both involved and focused (${}^{I}X_{old}^{F}$) (see Table 2). The set of focused variables is then $X_{k+L}^{F} = \{{}^{\neg I}X_{old}^{F} \cup {}^{I}X_{old}^{F}\} \in \mathbb{R}^{n^{F}}$, containing both involved and not involved variables. We use the notation $X_{k+L}^{F} \doteq X_{k}^{F}$ to remind us that the focused set of variables is part of both $X_{k+L}$ and $X_{k}$.

Likewise, the set of all remaining, unfocused variables is $X_{k+L}^{R} = \{{}^{\neg I}X_{old}^{U} \cup {}^{I}X_{old}^{U} \cup X_{new}\} \in \mathbb{R}^{n^{R}}$, containing all new variables and some of the old ones (which can be involved or not involved), and providing the partition $A^{R} = [{}^{\neg I}A_{old}^{U}, {}^{I}A_{old}^{U}, A_{new}]$ of $A$. Moreover, to simplify the coming equations we denote the set of old variables inside $X_{k+L}^{R}$ by $X_{old}^{R}$, having that $X_{old}^{R} = \{{}^{\neg I}X_{old}^{U} \cup {}^{I}X_{old}^{U}\}$, with the appropriate Jacobian partition $A_{old}^{R} = [{}^{\neg I}A_{old}^{U}, {}^{I}A_{old}^{U}]$.

Next, noting that $X_{k} = \{X_{k}^{F} \cup X_{old}^{R}\}$, we can partition the prior information matrix $\Lambda_{k}$ accordingly:

$$\Lambda_{k} = \begin{bmatrix} \Lambda_{k}^{F} & \Lambda_{k}^{F,R_{old}} \\ (\Lambda_{k}^{F,R_{old}})^{T} & \Lambda_{k}^{R_{old}} \end{bmatrix}. \qquad (107)$$

Similarly, due to $X_{k+L} = \{X_{k}^{F} \cup X_{old}^{R} \cup X_{new}\}$ and $X_{k+L}^{R} = \{X_{old}^{R} \cup X_{new}\}$, the posterior information matrix $\Lambda_{k+L}$ can be partitioned into the following two forms:

$$\Lambda_{k+L} = \begin{bmatrix} \Lambda_{k+L}^{F} & \Lambda_{k+L}^{F,R_{old}} & \Lambda_{k+L}^{F,X_{new}} \\ (\Lambda_{k+L}^{F,R_{old}})^{T} & \Lambda_{k+L}^{R_{old}} & \Lambda_{k+L}^{R_{old},X_{new}} \\ (\Lambda_{k+L}^{F,X_{new}})^{T} & (\Lambda_{k+L}^{R_{old},X_{new}})^{T} & \Lambda_{k+L}^{X_{new}} \end{bmatrix} = \begin{bmatrix} \Lambda_{k+L}^{F} & \Lambda_{k+L}^{F,R} \\ (\Lambda_{k+L}^{F,R})^{T} & \Lambda_{k+L}^{R} \end{bmatrix}, \qquad (108)$$

with

$$\Lambda_{k+L}^{R} = \begin{bmatrix} \Lambda_{k+L}^{R_{old}} & \Lambda_{k+L}^{R_{old},X_{new}} \\ (\Lambda_{k+L}^{R_{old},X_{new}})^{T} & \Lambda_{k+L}^{X_{new}} \end{bmatrix}. \qquad (109)$$

The latter can be written as

$$\Lambda_{k+L}^{R} = \Lambda_{k}^{Aug,R_{old}} + (A^{R})^{T} \cdot A^{R}, \qquad (110)$$

where $\Lambda_{k}^{Aug,R_{old}}$ can be constructed by first taking the partition of the prior information matrix $\Lambda_{k}$ related to $X_{old}^{R}$, $\Lambda_{k}^{R_{old}}$, and augmenting it with $n'$ zero rows and columns (see Figure 9), where $n'$ is just the number of newly introduced variables. The above equation has an augmented determinant form as defined in Section 3.3.1, and so the augmented determinant lemma can be applied here as well. Using (32) we have

$$\frac{|\Lambda_{k+L}^{R}|}{|\Lambda_{k}^{R_{old}}|} = |S| \cdot |A_{new}^{T} \cdot S^{-1} \cdot A_{new}|, \qquad (111)$$

$$S = I_{m} + A_{old}^{R} \cdot (\Lambda_{k}^{R_{old}})^{-1} \cdot (A_{old}^{R})^{T}. \qquad (112)$$

Then, by combining (99), (36), and the above equations, we can see that

$$\frac{|\Sigma_{k+L}^{M,F}|}{|\Sigma_{k}^{M,F}|} = \frac{|\Lambda_{k+L}^{R}|}{|\Lambda_{k+L}|} \cdot \frac{|\Lambda_{k}|}{|\Lambda_{k}^{R_{old}}|} = \frac{|S| \cdot |A_{new}^{T} \cdot S^{-1} \cdot A_{new}|}{|C| \cdot |A_{new}^{T} \cdot C^{-1} \cdot A_{new}|}, \qquad (113)$$

where $C$ is defined in (38). Consequently, the IG of $X_{k+L}^{F}$ can be calculated as

$$J_{IG}(a) = \mathcal{H}(X_{k}^{F}) - \mathcal{H}(X_{k+L}^{F}) = \frac{1}{2}\ln|\Sigma_{k}^{M,F}| - \frac{1}{2}\ln|\Sigma_{k+L}^{M,F}| = \frac{1}{2}\big( \ln|C| + \ln|A_{new}^{T} \cdot C^{-1} \cdot A_{new}| - \ln|S| - \ln|A_{new}^{T} \cdot S^{-1} \cdot A_{new}| \big). \qquad (114)$$

Next, the term $S$ can be further reduced. It is clear that $(\Lambda_{k}^{R_{old}})^{-1} = \Sigma_{k}^{R_{old}|F}$, namely the prior conditional covariance matrix of $X_{old}^{R}$ conditioned on $X_{k}^{F}$. Moreover, due to the sparsity of $A_{old}^{R}$ (its sub-block ${}^{\neg I}A_{old}^{U}$ contains only zeros), we will actually need only those entries of $\Sigma_{k}^{R_{old}|F}$ that belong to variables involved in the new terms of (6) (see Figure 9), and can conclude that

$$S = I_{m} + A_{old}^{R} \cdot \Sigma_{k}^{R_{old}|F} \cdot (A_{old}^{R})^{T} = I_{m} + {}^{I}A_{old}^{U} \cdot \Sigma_{k}^{{}^{I}X_{old}^{U}|F} \cdot ({}^{I}A_{old}^{U})^{T}. \qquad (115)$$

Note that the variables inside the information matrices do not have to be ordered in any particular way, and that the proof provided above is correct for any ordering whatsoever. ∎

A.6. Proof of Lemma 6

For $A$'s structure given in (34), the term $S$ from (115) becomes

$$S = I_{m} + A_{old}^{R} \cdot \Sigma_{k}^{R_{old}|F} \cdot (A_{old}^{R})^{T} = I_{m} + \begin{bmatrix} B_{old}^{R} \\ 0 \end{bmatrix} \cdot \Sigma_{k}^{R_{old}|F} \cdot \begin{bmatrix} (B_{old}^{R})^{T} & 0 \end{bmatrix} = \begin{bmatrix} S_{1} & 0 \\ 0 & I_{m_{new}} \end{bmatrix}, \qquad (116)$$

where $S_{1} = I_{m_{conn}} + B_{old}^{R} \cdot \Sigma_{k}^{R_{old}|F} \cdot (B_{old}^{R})^{T}$, $m_{conn} = M(\mathcal{F}^{conn}(a))$, and $m_{new} = M(\mathcal{F}^{new}(a))$. Then we can conclude that

$$|S| = |S_{1}| \qquad (117)$$

and

$$S^{-1} = \begin{bmatrix} S_{1}^{-1} & 0 \\ 0 & I_{m_{new}} \end{bmatrix}, \qquad (118)$$

and, similarly to (115) (see also Figure 9), we have that

$$S_{1} = I_{m_{conn}} + B_{old}^{R} \cdot \Sigma_{k}^{R_{old}|F} \cdot (B_{old}^{R})^{T} = I_{m_{conn}} + {}^{I}B_{old}^{U} \cdot \Sigma_{k}^{{}^{I}X_{old}^{U}|F} \cdot ({}^{I}B_{old}^{U})^{T}. \qquad (119)$$

Next, the term $|A_{new}^{T} \cdot S^{-1} \cdot A_{new}|$ from (114), similarly to (96), becomes

$$A_{new}^{T} \cdot S^{-1} \cdot A_{new} = B_{new}^{T} \cdot S_{1}^{-1} \cdot B_{new} + D_{new}^{T} \cdot D_{new}, \qquad (120)$$

with $S_{1}$ defined in (119).

Then, by applying equations (104), (117), and (120), together with the relation $C = C_{1}$, the IG of $X_{k+L}^{F} \subseteq X_{old}$ from (114) can be calculated as

$$J_{IG}^{F}(a) = \frac{1}{2}\big( \ln|C_{1}| + \ln|B_{new}^{T} \cdot C_{1}^{-1} \cdot B_{new} + D_{new}^{T} \cdot D_{new}| - \ln|S_{1}| - \ln|B_{new}^{T} \cdot S_{1}^{-1} \cdot B_{new} + D_{new}^{T} \cdot D_{new}| \big) = \frac{1}{2}\big( \ln|C_{1}| + \ln|B_{new}^{T} \cdot C_{1}^{-1} \cdot B_{new} + \Lambda_{a}| - \ln|S_{1}| - \ln|B_{new}^{T} \cdot S_{1}^{-1} \cdot B_{new} + \Lambda_{a}| \big), \qquad (121)$$

where $C_{1}$ is defined in (41), and where $\Lambda_{a} = D_{new}^{T} \cdot D_{new}$ is the information matrix of an action's factor graph $G(a)$.

Note that the variables inside the information matrices do not have to be ordered in any particular way, and that the proof provided above is correct for any ordering whatsoever. ∎

A.7. SLAM solution: focus on the last pose $X_{k+L}^{F} \equiv x_{k+L}$

For $X_{k+L}^{F} \equiv x_{k+L}$ the focused entropy objective in the SLAM setting is given by (44). Here, we exploit the inner structure of the Jacobian partitions in the SLAM scenario (see (71)–(75)) in order to provide a solution tailored specifically to the SLAM domain. It provides an illustrative example of applying rAMDL to a real problem.

From (75) we can see that $B_{new}^{U}$ has the following form (with block columns corresponding to $x_{k+1}, \ldots, x_{k+L-1}$):

$$B_{new}^{U} = \Psi_{conn}^{-\frac{1}{2}} \cdot \begin{bmatrix} -I & \cdots & 0 \\ H_{1}^{x_{k+1}} & \cdots & H_{1}^{x_{k+L-1}} \\ \vdots & \ddots & \vdots \\ H_{n_{o}}^{x_{k+1}} & \cdots & H_{n_{o}}^{x_{k+L-1}} \end{bmatrix} = \Psi_{conn}^{-\frac{1}{2}} \cdot \begin{bmatrix} F^{U} \\ H_{X_{new}}^{U} \end{bmatrix},$$

$$F^{U} = \begin{bmatrix} -I & \cdots & 0 \end{bmatrix}, \qquad H_{X_{new}}^{U} = \begin{bmatrix} H_{1}^{x_{k+1}} & \cdots & H_{1}^{x_{k+L-1}} \\ \vdots & \ddots & \vdots \\ H_{n_{o}}^{x_{k+1}} & \cdots & H_{n_{o}}^{x_{k+L-1}} \end{bmatrix}, \qquad (122)$$

where $H_{i}^{x_{k+l}} \doteq \nabla_{x_{k+l}} h_{i}$ is the Jacobian of $h_{i}$ with respect to $x_{k+l}$, and is therefore non-zero only if the factor's observation was taken from pose $x_{k+l}$. Note that in the SLAM case $X_{new}^{U}$ (all new and unfocused variables) is $\{x_{k+1}, \ldots, x_{k+L-1}\}$.

Similarly to (79), the term $(B_{new}^{U})^{T} \cdot C_{1}^{-1} \cdot B_{new}^{U}$ from (44) can be calculated as

$$(B_{new}^{U})^{T} \cdot C_{1}^{-1} \cdot B_{new}^{U} = \begin{bmatrix} (F^{U})^{T} & (H_{X_{new}}^{U})^{T} \end{bmatrix} \cdot \Psi_{conn}^{-\frac{1}{2}} \cdot \Psi_{conn}^{\frac{1}{2}} \cdot C_{2}^{-1} \cdot \Psi_{conn}^{\frac{1}{2}} \cdot \Psi_{conn}^{-\frac{1}{2}} \cdot \begin{bmatrix} F^{U} \\ H_{X_{new}}^{U} \end{bmatrix} = \begin{bmatrix} (F^{U})^{T} & (H_{X_{new}}^{U})^{T} \end{bmatrix} \cdot C_{2}^{-1} \cdot \begin{bmatrix} F^{U} \\ H_{X_{new}}^{U} \end{bmatrix} = (\widetilde{B}_{new}^{U})^{T} \cdot C_{2}^{-1} \cdot \widetilde{B}_{new}^{U}, \qquad (123)$$

where

$$\widetilde{B}_{new}^{U} = \begin{bmatrix} -I & \cdots & 0 \\ H_{1}^{x_{k+1}} & \cdots & H_{1}^{x_{k+L-1}} \\ \vdots & \ddots & \vdots \\ H_{n_{o}}^{x_{k+1}} & \cdots & H_{n_{o}}^{x_{k+L-1}} \end{bmatrix} = \begin{bmatrix} F^{U} \\ H_{X_{new}}^{U} \end{bmatrix} \qquad (124)$$

contains the Jacobian entries of $B_{new}^{U}$ not weighted by the factors' noise $\Psi_{conn}$.
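The determinant identities at the heart of the two lemma proofs above lend themselves to a quick numerical sanity check. The following sketch (NumPy; the toy dimensions and random matrices are illustrative assumptions, not part of the paper) verifies the augmented determinant lemma form (111)–(112) as well as the block reduction (116)–(120):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: n0 old unfocused variables, n_new new variables,
# m rows of the Jacobian A^R.
n0, n_new, m = 5, 3, 8

# Prior information matrix of the old unfocused variables (SPD by construction).
M = rng.standard_normal((n0, n0))
Lam0 = M @ M.T + n0 * np.eye(n0)

# Jacobian partition A^R = [A_old^R, A_new]; eq. (110) augments Lam0 with
# n_new zero rows/columns before adding (A^R)^T A^R.
A_old = rng.standard_normal((m, n0))
A_new = rng.standard_normal((m, n_new))
Lam_aug = np.block([[Lam0, np.zeros((n0, n_new))],
                    [np.zeros((n_new, n0)), np.zeros((n_new, n_new))]])
A_R = np.hstack([A_old, A_new])
Lam_post = Lam_aug + A_R.T @ A_R              # posterior Lambda_{k+L}^R

# Augmented determinant lemma, eqs. (111)-(112).
S = np.eye(m) + A_old @ np.linalg.inv(Lam0) @ A_old.T
lhs = np.linalg.det(Lam_post) / np.linalg.det(Lam0)
rhs = np.linalg.det(S) * np.linalg.det(A_new.T @ np.linalg.inv(S) @ A_new)
assert np.isclose(lhs, rhs)

# Block reduction of Lemma 6, eqs. (116)-(120): when A_old^R = [B_old; 0]
# and A_new = [B_new; D_new], S becomes block-diagonal and the quadratic
# term splits into "connecting" and "new" parts.
m_conn, m_new = 5, 3                          # m = m_conn + m_new
B_old = A_old[:m_conn]                        # rows of factors involving old vars
A_old2 = np.vstack([B_old, np.zeros((m_new, n0))])
S2 = np.eye(m) + A_old2 @ np.linalg.inv(Lam0) @ A_old2.T
S1 = np.eye(m_conn) + B_old @ np.linalg.inv(Lam0) @ B_old.T
assert np.isclose(np.linalg.det(S2), np.linalg.det(S1))       # eq. (117)
B_new, D_new = A_new[:m_conn], A_new[m_conn:]
lhs2 = A_new.T @ np.linalg.inv(S2) @ A_new
rhs2 = B_new.T @ np.linalg.inv(S1) @ B_new + D_new.T @ D_new  # eq. (120)
assert np.allclose(lhs2, rhs2)
print("augmented determinant lemma identities verified")
```

Since $S_1$ is only $m_{conn} \times m_{conn}$, working with it instead of the full $m \times m$ matrix $S$ is precisely what removes the dependence on the number of rows contributed by the new factors.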

In addition, from (71) we can derive the structure of $D_{new}^{U}$, which is also used in (44) (with block columns corresponding to $x_{k+1}, \ldots, x_{k+L-1}$):

$$D_{new}^{U} = \Psi_{new}^{-\frac{1}{2}} \cdot \begin{bmatrix} F_{k+1} & -I & 0 & 0 & \cdots & 0 & 0 \\ 0 & F_{k+2} & -I & 0 & \cdots & 0 & 0 \\ 0 & 0 & F_{k+3} & -I & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & F_{k+L-2} & -I \\ 0 & 0 & 0 & 0 & \cdots & 0 & F_{k+L-1} \end{bmatrix} = \Psi_{new}^{-\frac{1}{2}} \cdot \widetilde{D}_{new}^{U}, \qquad (125)$$

a form that, due to its sparsity, allows fast calculation of $\Lambda_{a}^{U|F}$:

$$\Lambda_{a}^{U|F} = (D_{new}^{U})^{T} \cdot D_{new}^{U} = (\widetilde{D}_{new}^{U})^{T} \cdot \Psi_{new}^{-1} \cdot \widetilde{D}_{new}^{U}. \qquad (126)$$

Finally, placing all the derived notation into (44), we obtain the SLAM-specific solution for the entropy of the robot's last pose:

$$J_{\mathcal{H}}(a) = \frac{n^{F} \cdot \gamma}{2} + \frac{1}{2}\ln\big|(\widetilde{B}_{new}^{U})^{T} \cdot C_{2}^{-1} \cdot \widetilde{B}_{new}^{U} + (\widetilde{D}_{new}^{U})^{T} \cdot \Psi_{new}^{-1} \cdot \widetilde{D}_{new}^{U}\big| - \frac{1}{2}\ln\big|\widetilde{B}_{new}^{T} \cdot C_{2}^{-1} \cdot \widetilde{B}_{new} + \widetilde{D}_{new}^{T} \cdot \Psi_{new}^{-1} \cdot \widetilde{D}_{new}\big|, \qquad (127)$$

where $C_{2}$ is defined in (77).

Note that the variables inside the information matrices do not have to be ordered in any particular way, and that the proof provided above is correct for any ordering whatsoever. ∎

A.8. SLAM solution: focus on mapped landmarks $X_{k+L}^{F} \equiv L_{k}$

The focused IG of $X_{k}^{F} \equiv L_{k}$ in the SLAM setting is given by (47). Here, we again exploit the inner structure of the Jacobian partitions in the SLAM scenario (see (71)–(75)) in order to provide a solution tailored specifically to the SLAM domain. It provides an illustrative example of applying rAMDL to a real problem.

First, note that the old variables that are both involved and unfocused, ${}^{I}X_{old}^{U}$, contain only the current robot pose $x_{k}$. Thus, from (74) we can see that the relevant partition of the Jacobian $B$, the ${}^{I}B_{old}^{U}$ used in (48), has the following inner structure:

$${}^{I}B_{old}^{U} = \Psi_{conn}^{-\frac{1}{2}} \cdot \begin{bmatrix} F_{k} \\ 0 \\ \vdots \\ 0 \end{bmatrix}. \qquad (128)$$

Using the above identity, the matrix $S_{1}$ from (48) can be reduced to the following form:

$$S_{1} = I_{m_{conn}} + {}^{I}B_{old}^{U} \cdot \Sigma_{k}^{{}^{I}X_{old}^{U}|F} \cdot ({}^{I}B_{old}^{U})^{T} = I_{m_{conn}} + \Psi_{conn}^{-\frac{1}{2}} \cdot \begin{bmatrix} F_{k} \cdot \Sigma_{k}^{M,x_{k}} \cdot F_{k}^{T} & 0 \\ 0 & 0 \end{bmatrix} \cdot \Psi_{conn}^{-\frac{1}{2}} = \Psi_{conn}^{-\frac{1}{2}} \cdot \left( \Psi_{conn} + \begin{bmatrix} F_{k} \cdot \Sigma_{k}^{M,x_{k}} \cdot F_{k}^{T} & 0 \\ 0 & 0 \end{bmatrix} \right) \cdot \Psi_{conn}^{-\frac{1}{2}} = \Psi_{conn}^{-\frac{1}{2}} \cdot S_{2} \cdot \Psi_{conn}^{-\frac{1}{2}}, \qquad (129)$$

$$S_{2} \doteq \Psi_{conn} + \begin{bmatrix} F_{k} \cdot \Sigma_{k}^{M,x_{k}} \cdot F_{k}^{T} & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} \Sigma_{\omega,k} + F_{k} \cdot \Sigma_{k}^{M,x_{k}} \cdot F_{k}^{T} & 0 \\ 0 & \Psi_{obs} \end{bmatrix}, \qquad (130)$$

where $\Sigma_{\omega,k}$ is the noise matrix from the motion model (65), and the matrix $\Psi_{obs}$ is block-diagonal, combining all the noise matrices of the $\mathcal{F}^{obs}(a)$ factors:

$$\Psi_{conn} = \begin{bmatrix} \Sigma_{\omega,k} & 0 \\ 0 & \Psi_{obs} \end{bmatrix}. \qquad (131)$$

Further, let us define the matrix $S_{3}$:

$$S_{3} \doteq \Sigma_{\omega,k} + F_{k} \cdot \Sigma_{k}^{M,x_{k}} \cdot F_{k}^{T}. \qquad (132)$$

Now we can see that $S_{1}$'s determinant and inverse can be calculated through

$$|S_{1}| = \frac{|S_{2}|}{|\Psi_{conn}|} = \frac{|S_{3}| \cdot |\Psi_{obs}|}{|\Psi_{conn}|} = \frac{|S_{3}|}{|\Sigma_{\omega,k}|}, \qquad (133)$$

$$S_{1}^{-1} = \Psi_{conn}^{\frac{1}{2}} \cdot S_{2}^{-1} \cdot \Psi_{conn}^{\frac{1}{2}} = \Psi_{conn}^{\frac{1}{2}} \cdot \begin{bmatrix} S_{3}^{-1} & 0 \\ 0 & \Psi_{obs}^{-1} \end{bmatrix} \cdot \Psi_{conn}^{\frac{1}{2}}. \qquad (134)$$

Similarly to (79), the term $B_{new}^{T} \cdot S_{1}^{-1} \cdot B_{new}$ from (47) can be calculated as

$$B_{new}^{T} \cdot S_{1}^{-1} \cdot B_{new} = \begin{bmatrix} F^{T} & (H^{X_{new}})^{T} \end{bmatrix} \cdot \Psi_{conn}^{-\frac{1}{2}} \cdot \Psi_{conn}^{\frac{1}{2}} \cdot S_{2}^{-1} \cdot \Psi_{conn}^{\frac{1}{2}} \cdot \Psi_{conn}^{-\frac{1}{2}} \cdot \begin{bmatrix} F \\ H^{X_{new}} \end{bmatrix} = \begin{bmatrix} F^{T} & (H^{X_{new}})^{T} \end{bmatrix} \cdot \begin{bmatrix} S_{3}^{-1} & 0 \\ 0 & \Psi_{obs}^{-1} \end{bmatrix} \cdot \begin{bmatrix} F \\ H^{X_{new}} \end{bmatrix} = F^{T} \cdot S_{3}^{-1} \cdot F + (H^{X_{new}})^{T} \cdot \Psi_{obs}^{-1} \cdot H^{X_{new}}, \qquad (135)$$

where $F$ and $H^{X_{new}}$ are defined in (75) as

$$F = \begin{bmatrix} -I & \cdots & 0 \end{bmatrix}, \qquad H^{X_{new}} = \begin{bmatrix} H_{1}^{x_{k+1}} & \cdots & H_{1}^{x_{k+L}} \\ \vdots & \ddots & \vdots \\ H_{n_{o}}^{x_{k+1}} & \cdots & H_{n_{o}}^{x_{k+L}} \end{bmatrix}. \qquad (136)$$
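The chain (129)–(134) can likewise be validated numerically. The sketch below (NumPy; the helper functions, toy sizes, and random SPD matrices are assumptions for illustration) constructs $\Psi_{conn}$ as in (131) and checks the determinant and inverse identities (133)–(134):

```python
import numpy as np

rng = np.random.default_rng(1)

def spd(n, s):
    # Random symmetric positive-definite matrix of size n x n.
    A = s.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

def inv_sqrt(P):
    # Symmetric P^{-1/2} for SPD P, via eigendecomposition.
    w, V = np.linalg.eigh(P)
    return V @ np.diag(w ** -0.5) @ V.T

# Hypothetical toy sizes: pose dimension n_p, observation rows m_obs.
n_p, m_obs = 3, 4
Sigma_w = spd(n_p, rng)                     # motion-model noise Sigma_{omega,k}
Psi_obs = spd(m_obs, rng)                   # stacked observation noises
Psi_conn = np.block([[Sigma_w, np.zeros((n_p, m_obs))],
                     [np.zeros((m_obs, n_p)), Psi_obs]])   # eq. (131)

F_k = rng.standard_normal((n_p, n_p))       # motion Jacobian at x_k
Sigma_Mxk = spd(n_p, rng)                   # marginal covariance of x_k

# I B_old^U = Psi_conn^{-1/2} [F_k; 0]  (eq. (128)), then S1 as in (129).
B_old = inv_sqrt(Psi_conn) @ np.vstack([F_k, np.zeros((m_obs, n_p))])
S1 = np.eye(n_p + m_obs) + B_old @ Sigma_Mxk @ B_old.T
S3 = Sigma_w + F_k @ Sigma_Mxk @ F_k.T      # eq. (132)

# eq. (133): |S1| = |S3| / |Sigma_w|
assert np.isclose(np.linalg.det(S1),
                  np.linalg.det(S3) / np.linalg.det(Sigma_w))

# eq. (134): S1^{-1} = Psi^{1/2} blkdiag(S3^{-1}, Psi_obs^{-1}) Psi^{1/2}
sq = np.linalg.inv(inv_sqrt(Psi_conn))      # Psi_conn^{1/2}
blk = np.block([[np.linalg.inv(S3), np.zeros((n_p, m_obs))],
                [np.zeros((m_obs, n_p)), np.linalg.inv(Psi_obs)]])
assert np.allclose(np.linalg.inv(S1), sq @ blk @ sq)
print("SLAM S1 reduction identities verified")
```

Here the symmetric square root produced by `inv_sqrt` is block-diagonal whenever its argument is, which is what makes the block structure of $S_2$ carry over to $S_1^{-1}$.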

Thus, we can see that $F^{T} \cdot S_{3}^{-1} \cdot F$ from (135) is an $L \cdot n_{p} \times L \cdot n_{p}$ matrix ($n_{p}$ is the dimension of the robot's pose and $L$ is the horizon length) that has non-zero entries only in its top-left $n_{p} \times n_{p}$ corner:

$$F^{T} \cdot S_{3}^{-1} \cdot F = \begin{bmatrix} S_{3}^{-1} & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}. \qquad (137)$$

Finally, placing all the derived notation into (47), we arrive at the SLAM-specific solution for the IG of the already mapped landmarks $L_{k}$:

$$J_{IG}^{F}(a) = \frac{1}{2}\Big( \ln|C_{2}| - \ln|\Psi_{conn}| + \ln\big|\widetilde{B}_{new}^{T} \cdot C_{2}^{-1} \cdot \widetilde{B}_{new} + \widetilde{D}_{new}^{T} \cdot \Psi_{new}^{-1} \cdot \widetilde{D}_{new}\big| - \ln|S_{3}| + \ln|\Sigma_{\omega,k}| - \ln\big|F^{T} \cdot S_{3}^{-1} \cdot F + (H^{X_{new}})^{T} \cdot \Psi_{obs}^{-1} \cdot H^{X_{new}} + \widetilde{D}_{new}^{T} \cdot \Psi_{new}^{-1} \cdot \widetilde{D}_{new}\big| \Big)$$

$$= \frac{1}{2}\Big( \ln|C_{2}| + \ln\big|\widetilde{B}_{new}^{T} \cdot C_{2}^{-1} \cdot \widetilde{B}_{new} + \widetilde{D}_{new}^{T} \cdot \Psi_{new}^{-1} \cdot \widetilde{D}_{new}\big| - \ln|S_{3}| - \ln\big|F^{T} \cdot S_{3}^{-1} \cdot F + (H^{X_{new}})^{T} \cdot \Psi_{obs}^{-1} \cdot H^{X_{new}} + \widetilde{D}_{new}^{T} \cdot \Psi_{new}^{-1} \cdot \widetilde{D}_{new}\big| - \ln|\Psi_{obs}| \Big), \qquad (138)$$

where $C_{2}$ is defined in (77). Note that the matrix $S_{3}$ will be the same for all candidates. Therefore, the terms $S_{3}$, $\ln|S_{3}|$, and $F^{T} \cdot S_{3}^{-1} \cdot F$ can be calculated only once and shared between the candidates thereafter. In addition, the terms $C_{2}$, $\widetilde{B}_{new}^{T} \cdot C_{2}^{-1} \cdot \widetilde{B}_{new}$, $\widetilde{D}_{new}^{T} \cdot \Psi_{new}^{-1} \cdot \widetilde{D}_{new}$, and $(H^{X_{new}})^{T} \cdot \Psi_{obs}^{-1} \cdot H^{X_{new}}$ can be calculated efficiently through sparse matrix operators, since we know the exact inner structure of all the involved matrix operands. The overall complexity of the above SLAM solution is the same as that of (47), $O(M(\mathcal{F}^{conn}(a))^{3} + n'^{3})$.

Note that the variables inside the information matrices do not have to be ordered in any particular way, and that the proof provided above is correct for any ordering whatsoever. ∎
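To illustrate the calculation re-use described above, the following sketch (NumPy; the dimensions and random matrices are illustrative assumptions) verifies the quadratic-form split (135) and the sparsity pattern (137) of the candidate-independent term $F^{T} \cdot S_{3}^{-1} \cdot F$:

```python
import numpy as np

rng = np.random.default_rng(2)

def spd(n, s):
    # Random symmetric positive-definite matrix of size n x n.
    A = s.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

def inv_sqrt(P):
    # Symmetric P^{-1/2} for SPD P, via eigendecomposition.
    w, V = np.linalg.eigh(P)
    return V @ np.diag(w ** -0.5) @ V.T

n_p, m_obs, L = 3, 4, 4                      # pose dim, obs rows, horizon length
Sigma_w, Psi_obs = spd(n_p, rng), spd(m_obs, rng)
Psi_conn = np.block([[Sigma_w, np.zeros((n_p, m_obs))],
                     [np.zeros((m_obs, n_p)), Psi_obs]])

# F = [-I, 0, ..., 0] (eq. (136)); H spans all L new poses.
F = np.hstack([-np.eye(n_p), np.zeros((n_p, (L - 1) * n_p))])
H = rng.standard_normal((m_obs, L * n_p))
B_new = inv_sqrt(Psi_conn) @ np.vstack([F, H])

F_k = rng.standard_normal((n_p, n_p))
Sigma_Mxk = spd(n_p, rng)
B_old = inv_sqrt(Psi_conn) @ np.vstack([F_k, np.zeros((m_obs, n_p))])
S1 = np.eye(n_p + m_obs) + B_old @ Sigma_Mxk @ B_old.T
S3 = Sigma_w + F_k @ Sigma_Mxk @ F_k.T       # candidate-independent term

# eq. (135): B_new^T S1^{-1} B_new = F^T S3^{-1} F + H^T Psi_obs^{-1} H
lhs = B_new.T @ np.linalg.inv(S1) @ B_new
rhs = F.T @ np.linalg.inv(S3) @ F + H.T @ np.linalg.inv(Psi_obs) @ H
assert np.allclose(lhs, rhs)

# eq. (137): F^T S3^{-1} F is non-zero only in its top-left n_p x n_p block,
# so it can be computed once and re-used for every candidate action.
G = F.T @ np.linalg.inv(S3) @ F
assert np.allclose(G[:n_p, :n_p], np.linalg.inv(S3))
assert np.allclose(G[n_p:, :], 0) and np.allclose(G[:, n_p:], 0)
print("SLAM landmark-IG identities verified")
```

In a planning loop, only the candidate-dependent factors (here `H` and the new-factor terms) would be rebuilt per action, while `S3` and its derived quantities are computed once.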