IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. XX, NO. XX, XXXX 2021 1

Efficient Pairwise Neuroimage Analysis using the Soft Jaccard Index and 3D Keypoint Sets Laurent Chauvin, Kuldeep Kumar, Christian Desrosiers, William Wells III and Matthew Toews

Abstract— We propose a novel pairwise distance mea- relationships, and also characteristics shared across the popu- sure between image keypoint sets, for the purpose of large- lation including racial ancestry, sex, hereditary disease status, scale medical image indexing. Our generalizes the etc. [2], [3]. The brain is the center of cognition and a Jaccard index to account for soft set equivalence (SSE) be- tween keypoint elements, via an adaptive kernel framework complex organ, tightly coupled to the genetic evolution of modeling uncertainty in keypoint appearance and geom- animals and in particular that of the human species. To what etry. A new kernel is proposed to quantify the variability degree is the human brain phenotype coupled to the underlying of keypoint geometry in location and scale. Our distance genotype? How does the brain image manifold vary locally 2 measure may be estimated between O(N ) image pairs in with genotype, i.e. immediate relatives sharing 25-100% of O(N log N) operations via keypoint indexing. Experiments report the first results for the task of predicting family their polymorphic genes, or between broad groups defined by relationships from medical images, using 1010 T1-weighted subtle genetic factors such as racial ancestry or sex? MRI brain volumes of 434 families including monozygotic Large, publicly available databases allow us to investigate and dizygotic twins, siblings and half-siblings sharing these questions from aggregated MRI and genetic data, e.g. 100%-25% of their polymorphic genes. Soft set equivalence 1000+ subjects [4]. The shape of the brain has been modeled and the keypoint geometry kernel improve upon standard hard set equivalence (HSE) and appearance kernels alone as lying on a smooth manifold in high dimensional MRI data in predicting family relationships. Monozygotic twin iden- space [5], [6], where phenotype can be described as a smooth tification is near 100%, and three subjects with uncertain deformation conditioned on developmental factors including genotyping are automatically paired with their self-reported the environment. However the brain is also naturally described families, the first reported practical application of image- as a rich collection of spatially localized neuroanatomical based family identification. Our distance measure can also be used to predict group categories, sex is predicted with structures, including common structures such as the basal an AUC=0.97. Software is provided for efficient fine-grained ganglia shared across the population, but also highly specific curation of large, generic image datasets. patterns such as cortical folds that may only be observable in specific individuals or close family members [7]. Index Terms— neuroimage analysis, invariant keypoints, MRI, family prediction. The keypoint representation is an intuitively appealing means of modeling specific, localized phenomena, i.e. a set of descriptors automatically identified at salient image locations I.INTRODUCTION as shown in Figure 1. A keypoint set can be viewed as EALTH treatment is increasingly personalized, where an element of a high-dimensional manifold, and medical H treatment decisions are conditioned on patient-specific imaging applications such as regression or classification can information in addition to knowledge regarding the general be formulated in terms of a suitable geodesic distance between population [1]. Modern genetic testing allows us, based on sets. As keypoint sets are variable sized, typical metrics based large libraries of human DNA samples, to cheaply predict on fixed-length vectors such as L-norms [5], [6] do not readily patient-specific characteristics, including immediate family

arXiv:2103.06966v3 [eess.IV] 17 Sep 2021 apply. Distances defined based on set overlap measures as the Jaccard index [8] have proven to be effective in recent studies Manuscript submitted March 15, 2021. This work was supported by NIH grant P41EB015902 (NAC), the Quebec (FRQNT) New Re- investigating genetics and brain MRI [9], [10]. For example, searchers Startup Program and the Canadian National Sciences and by defining set equivalence in terms of nearest neighbor (NN) Research Council (NSERC) Discovery Grant. Data were provided by keypoint correspondences, the Jaccard distance may be used the Human Connectome Project, WU-Minn Consortium (Principal Inves- tigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded to predict pairwise relationships. Nevertheless, the assumption by the 16 NIH Institutes and Centers that support the NIH Blueprint of binary or hard set equivalence (HSE) between keypoints is for Neuroscience Research; and by the McDonnell Center for Systems a crude approximation given probabilistic uncertainty inherent Neuroscience at Washington University. L. Chauvin, C. Desrosiers and M. Toews are with the Ecole to natural image structure, and is ill-defined for variable sized de Technologie Superieure, Montreal, QC H3C 1K3, Canada datasets where the number of NN correspondences may be (email: [email protected], [email protected], variable or unknown a-priori. [email protected]). K. Kumar is with CHU Sainte-Justine Research Center, Uni- Our primary contribution is a new pairwise distance measure versity of Montreal, Montreal, QC H3T 1C5, Canada (email: for 3D keypoint sets, generalizing the Jaccard index to account [email protected]). for soft set equivalence (SSE) between keypoint elements. SSE W. Wells is with the Harvard Medical School, Boston, MA 02115, USA and the Massachusetts Institute of Technology, Boston, MA 02139, USA is achieved via a kernel framework for keypoint similarity, and (email: [email protected]). we introduce a new kernel that normalizes pairwise keypoint 2 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. XX, NO. XX, XXXX 2021

A f f B set intersection or overlap, e.g. the Jaccard distance [8] i j first proposed in [24] or the Tanimoto distance [25], [26]. In medical image analysis, the Jaccard distance is typically used to assess pixel-level segmentation accuracy [27], and has proven highly effective for assessing pairwise neuroimage similarity from 3D SIFT keypoint data [9], [12], where hard set equivalence is determined by NN keypoint descriptor cor- respondences. Set theory is predicated on binary equivalence between set elements, which is difficult to justify in the case of keypoints extracted from image data. Soft set theory was been developed to account for non-binary equivalence [28]– [30], where set operations including intersection and union are defined in terms of non-binary equivalence ranging from [0,1]. Fig. 1. Illustrating two sets of 3D keypoints A = {fi} and B = {fj } Jaccard-like distance was used for retrieving near-duplicate 2D extracted from MRI brain volumes of monozygotic twins. Colored circles represent the 3D locations x¯ ∈ R3 and scales (radii) σ ∈ R1 of photos [31], where the ratio of soft intersection and union keypoints in 2D axial slices, and links are correspondences identified operators was used to account for document frequency. Our via the SIFT-Rank [11] algorithm, note the degree of similarity between work here extends the Jaccard distance proposed in [10] keypoint pairs. to include keypoint geometry in 3D Euclidean coordinates, and compares to the HSE Jaccard measure used in [9], [12], showing a significant improvement in predicting displacement by keypoint scale, accounting for localization family relationships. uncertainty in scale-space. Our work extends and compares The keypoint methodology has a number of advantages, against keypoint-based neuroimaging analysis methods [9], such as robustness to occlusion or missing correspondences, [10], [12], specifically the keypoint signatures approach [10], invariance to translation, scaling, rotation and intensity con- currently the only method reporting perfect accuracy at identi- trast variations between images. Our work adopts the 3D fying scans of the same individual in large public neuroimage SIFT-Rank [11] representation, a robust, general tool used MRI datasets, e.g. OASIS [13], ADNI [14] and HCP [15]. for a variety of imaging tasks, including registration [11], Here, following preliminary work [16], we demonstrate that [32], whole body segmentation [33], classification [23], [34], the soft Jaccard distance is specific enough to predict family regression [35], without the need for context-specific training relationships from medical images, and we present the first procedures or data. As in the original 2D SIFT approach, results for this new task using 1010 MRIs of 434 families keypoint geometry is represented as a location x¯ and scale from the HCP [15] dataset. The entire method is based on σ, these are detected as extrema of a difference-of-Gaussian highly efficient keypoint indexing, and scales to large datasets scale-space [20], as argmax |I(¯x, σ) − I(¯x, κσ)|, ap- of volumetric image data. x,σ¯ proximating the Laplacian-of-Gaussian [21], where I(¯x, σ) = I(¯x) ∗ N (¯x, σ2) represents the image convolved with the II.RELATED WORK Gaussian kernel N (¯x, σ2). The geometry of a keypoint may Our work is motivated both by the study of the link thus be characterised as an isotropic Gaussian density centered between genotype and the human brain phenotype from large on x¯ with variance σ2 representing spatial extent σ in 3D Eu- datasets [17], and by practical applications such as maintaining clidean space. Rotationally symmetric Gaussian filtering and accurate patient records in hospital Picture Archiving and uniformly sampled derivative operators lead to scale and rota- Communication Systems (PACS). We adopted a memory- tion invariance. The 3D SIFT-Rank descriptor is a 64-element based learning model where all data are stored in memory histogram of rank-ordered oriented gradients (HOG) [11], and accessed via highly-specific keypoint queries. Memory- sampled about a scale-normalized reference frame centered on based learning requires no explicit training procedure [18], and x¯ and quantized uniformly into 2x2x2=8 spatial locations x 8 approaches Bayes optimal performance as the number of data orientations. Rank transformation provides invariance to arbi- N becomes large [19]. 3D SIFT keypoints [11] derived from trary monotonic variations of descriptor element values [36]. Gaussian scale-space theory [20], [21] are invariant to global The 3D SIFT-Rank keypoint method we use is a robust similarity transforms and contrast variations, and thus account and well-established baseline, however our distance measure for misalignment and scanner variability between images. may generally be used with any keypoint representation. The Efficient indexing reduces the quadratic O(N 2) complexity of search for new keypoint detectors and descriptors remains an nearest neighbor (NN) keypoint lookup to O(N log N), e.g. active research focus, primarily in the context of 2D computer via O(log N) KD-tree indexing [22], ensuring that the method vision. We refer the reader to an extensive literature review of scales gracefully to large datasets, e.g. 7500 brain MRIs [10] traditional hand-crafted solutions [37] and a recent discussion or 20000 lung CTs [23]. of 2D SIFT matching technology [38]. Deep learning keypoint Our method seeks a pairwise distance measure between methods have been investigated, examples include LIFT [39], 3D keypoint sets. As different images generally contain dif- DISK [40], LF-Net [41], SuperPoint [42]. Nevertheless, vari- ferent numbers of elements, a natural choice for pairwise ants of the original SIFT histogram descriptor, including SIFT- distance between variable sized sets are measures based on Rank [36], DSP-SIFT [43] or RootSIFT [44] remain compet- CHAUVIN et al.: EFFICIENT PAIRWISE NEUROIMAGE ANALYSIS USING THE SOFT JACCARD INDEX AND 3D KEYPOINT SETS 3

itive with descriptors derived from training [45], [46], partic- |A| = |A∩A|, |B| = |B∩B| represent set cardinality operators ularly for non-planar objects [38], while requiring no training that count the numbers of elements present in each set, where and few hyperparameters. Learning-based approaches often |A ∩ B| ≤ min{|A|, |B|}. The Jaccard index is a similarity rely on hand-crafted keypoint detectors to generate training measure, and may be used to define various distance measures data [39], [42]. The challenges of developing deep keypoint including the Jaccard distance metric [8] 1 − JHSE(A, B) or architectures include training of individual components (i.e. the Tanimoto distance − log JHSE(A, B). keypoint detector, orientation estimator, descriptor), achieving In the case of sets of real data samples, for example keypoint invariance to rotation or scale changes, the need to retrain for descriptors, binary equivalence may be difficult to establish different imaging modalities and body parts, and avoiding bias due to noise or uncertainty in the measurement process, and we due to training data procedures [47]. seek to redefine JHSE(A, B) in Equation (1) such that it more In terms of neuroimage analysis, brain fingerprinting meth- accurately accounts for non-binary equivalence between set ods have been used to investigate the variability of individuals elements. This may generally be accomplished by redefining based on neuroimage-specific preprocessing and data, e.g. the standard set intersection cardinality operator |A ∩ B| brain segmentations from structural MRI [48], [49], neu- in Equation (1) by an analogous soft cardinality operator roanatomic parcellations and functional MRI [50], fiber tracts µ(A∩B), leading to the following expression for our proposed from diffusion MRI [51]. None of these methods have been Jaccard index based on soft set equivalence JSSE(A, B) sufficiently specific to achieve perfect accuracy at individual µ(A ∩ B) identification, and none have been used for the more diffi- JSSE(A, B) = . (2) cult task of predicting family members. Heritability studies µ(A) + µ(B) − µ(A ∩ B) have investigated correlations in MRIs of siblings, including In Equation (2), JSSE(A, B) is defined by the soft structural [52], [53] and functional MRI [54], however these cardinality operator µ(A ∩ B) as described in the following are not designed for prediction. In contrast, the keypoint section, including cardinality operators µ(A) = µ(A ∩ A), methodology we adopt may be applied to generic imaging µ(B) = µ(B ∩ B). In order to ensure that JSSE remains data with no preprocessing, representing distinctive patterns bounded to the range [0, 1], it is important that µ(A ∩ B) potentially present only in images of specific individuals be upper bounded by the minimum of the individual soft set or immediate family members. Our work here extends and cardinalities µ(A ∩ B) ≤ min {µ(A), µ(B)}. compares against the keypoint-based neuroimage signatures method [10], the only method reporting perfect accuracy Defining µ(A∩B): The cardinality of the soft set intersection at same-subject image prediction from standard neuroimage µ(A ∩ B) is intended to reflect the uncertainty in equivalence datasets (e.g. OASIS [13], ADNI [14], HCP [15]). between set elements fi ∈ A and fj ∈ B. We define Our results here are the first in the literature for the task soft equivalence between elements fi ∈ A and fj ∈ B of family member prediction from volumetric medical image as a similarity function or kernel K(fi, fj) ranging from data. Family member prediction has been investigated in the [0, 1], where K(fi, fj) = 1 indicates exact equivalence and context of 2D face photographs, based on clustering [55], K(fi, fj) = 0 represents the absence of equivalence. We then conditional random fields [56], deep neural networks [57], define a measure µ(A → B) over a mapping A → B from discriminant analysis [58]. These solutions are not directly set A to set B as: related to our work, as they generally address aspects and |A| challenges specific to the context of 2D photography, e.g. X µ(A → B) = max K(fi, fj). (3) fj ∈B facial expression and lighting conditions [59], multiple images i per person, group photos of relatively few families (e.g. 16) [56]. Accuracy is not generally available for siblings based The similarity function K(fi, fj) in Equation (3) may on genotyping, and near 100% accuracy for monozygotic twin generally be defined according to the specific representa- identification from face photographs has not been reported. tion of set elements fi and fj. The maximum operator maxfj ∈B K(fi, fj) ensures the bound µ(A → B) ≤ µ(A) III.METHOD via a partial injective mapping from A to B, embodying the intuition that each element f ∈ A has at most one We seek a pairwise distance measure between two sets A = i counterpart f ∈ B. Note also that max operator may be {f } and B = {f } that can be used to estimate proximity and j i j computed via computationally efficient NN indexing methods. thus genetic relationships between subjects from image data. The soft intersection cardinality µ(A ∩ B) may be defined as We first describe our method in terms of general set theory, a symmetrized version of µ(A → B): then later include details pertaining to set elements fi defined as 3D image keypoints. Our measure begins with the Jaccard µ(A ∩ B) = min { µ(A → B), µ(B → A) } , index or intersection-over-union JHSE(A, B) defined as  |A| |B|  X X  |A ∩ B| = min max K(fi, fj), max K(fj, fi) . JHSE(A, B) = , (1) fj ∈B fi∈A |A| + |B| − |A ∩ B|  i j  (4) where in Equation (1), JHSE(A, B) is defined by binary or hard set equivalence (HSE) between set elements. A ∩ B is Note that the Jaccard distance measure derived using the the intersection operator between sets A and B, and |A ∩ B|, cardinality operator µ(A ∩ B) as defined by Equations (3) 4 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. XX, NO. XX, XXXX 2021

and (4) is symmetric and follows the identity of indiscernibles, These kernels are defined as follows and thus may be considered to be a semi-metric. The triangle  2  kx¯i − x¯jk2 inequality holds in the case of binary equivalence [8] but K(x ¯i, x¯j; σi, σj) = exp − (9) σiσj not generally for soft equivalence relationships. Equation (4)    2 σi may be used to define the cardinality of a single set K(σi, σj) = exp − log (10) A as µ(A) = µ(A ∩ A) = |A| both for hard and soft σj equivalence since, since for a given element fi ∈ A, Kernel K(x ¯i, x¯j; σi, σj) in Equation (9) penalizes the squared max K(f , f ) = K(f , f ) = 1. The standard Jaccard fj ∈A i j i i distance between keypoint coordinates within a local reference index in Equation (1) is thus a special case of the soft Jaccard frame, normalized by a variance proportional to the product of index in the case of a binary-valued kernel K(fi, fj) ∈ {0, 1}. keypoint scales σiσj. This variance embodies uncertainty in keypoint location due to scale, and has a computational form Defining K(fi, fj ): The soft set intersection cardinality that is reminiscent of mass in Newton’s law of gravitation µ(A ∩ B) may be adapted to a specific task by defining or electric charge magnitude in Coulomb’s law. Figure 2 K(fi, fj) according to the representation of elements fi. In the demonstrates how this kernel variance normalizes higher local- context of this article, a set element fi is a 3D scale-invariant ization error associated with keypoints of larger scale. Kernel keypoint fi = {a¯i, g¯i} as described in [11], where g¯i and a¯i K(σi, σj) in Equation (10) penalizes multiplicative difference are descriptors of local keypoint geometry and appearance, between keypoint scales (σi, σj). respectively. Keypoint geometry g¯i = {x¯i, σi} consists of 3D location x¯i and scale σi, and appearance a¯i is a vector of local image information, i.e. a rank-ordered histogram of oriented gradients (HOG) [36].

The kernel K(fi, fj) operates on keypoint elements fi = {a¯i, g¯i} ∈ A and fj = {a¯j, g¯j} ∈ B. Here we relax the assumption of hard equivalence using squared exponential kernels with non-zero support, factored into kernels operating separately on local keypoint appearance K(a ¯i, a¯j) and geom- etry K(g ¯i, g¯j) variables:

K(fi, fj) = K(a ¯i, a¯j)K(g ¯i, g¯j) (5) The factorization in Equation (5) is due to the use of descrip- tors a¯i that are invariant to 7-parameter similarity transforms of the 3D image coordinate system from which geometry g¯i is derived. The two kernels in Equation (5) are defined as squared exponential functions as follows. The appearance kernel K(a ¯ , a¯ ) is defined by the squared Fig. 2. Graph of spatial localization error kx¯ − x¯ k vs. geometric i j √ i j 2 2 mean scale σ σ . Lines represent the linear regression based on Euclidean distance ka¯i − a¯jk2 between appearance vectors a¯i i j each point set. Note how location error increases with scale (black) due and a¯j: to uncertainty, however this effect is reduced by scale normalization  ka¯ − a¯ k2  (green) as in Equation (9). This visualization is based on keypoint K(a ¯ , a¯ ) = exp − i j 2 (6) correspondences between 10 MRI aligned volumes. i j α2 where in Equation (6), α is a bandwidth parameter that may Note that while the appearance kernel in Equation (6) is be estimated adaptively as invariant to global similarity transforms due to descriptor 2 2 invariance, the geometry kernel in Equation (8) measures zero- α = minka¯i − a¯jk2, s.t. ka¯i − a¯jk2 > 0, (7) fj ∈Ω mean keypoint displacement and thus requires data to be aligned within a common spatial reference frame. Alignment the minimum Euclidean distance between appearance descrip- may be established via standard subject-to-atlas registration tor a¯ ∈ A and the nearest descriptor a¯ ∈ Ω \ A within i j prior to keypoint extraction or from keypoint correspondences the entire available dataset Ω excluding A. Note this choice after extraction using feature-based alignment [11]. The fol- of estimator is not strictly symmetric, however it allows lowing workflow is used to compute pairwise distances for a the kernel to adapt to arbitrary dataset sizes, shrinking the set of N images: resolution of prediction as the number of data grows large, and does not affect the symmetry of Equation 4. 1) Extract keypoints from each image The geometry kernel K(g ¯i, g¯j) is novel to this work, and is 2) Align keypoints to an atlas template (optional for pre- defined as the product of two kernels, one modeling keypoint aligned images) location conditional on keypoint scale K(x ¯i, x¯j; σi, σj) and 3) Identify k-NN correspondences (a ¯i, a¯j) minimizing the the K(σi, σj) modeling scale alone Euclidean distance between descriptors ka¯i − a¯jk 4) Compute pairwise image distances by evaluating kernels K(g ¯i, g¯j) = K(x ¯i, x¯j; σi, σj)K(σi, σj) (8) from k-NN correspondences CHAUVIN et al.: EFFICIENT PAIRWISE NEUROIMAGE ANALYSIS USING THE SOFT JACCARD INDEX AND 3D KEYPOINT SETS 5

Figure 3 shows the effect of our proposed geometry kernel A. Data and Computational Details in regularizing and favoring geometrically plausible correspon- Our data set consists of MRI scans of N = 1010 unique dences for an example twin sibling pair. subjects from the Human Connectome Project (HCP) Q4 release [4], aged 22-36 years (mean 29 years), acquired from Appearance Kernel a diverse population including 468 males and 542 females, and 434 unique families. The MRI data are provided as T1w volumes at isotropic 0.7mm voxel resolution and preprocessed via a standard neuroimaging pipeline, including rigid subject- to-atlas registration and skull-stripping. Keypoint extraction requires approximately 3 sec. / per image and results in an average of 1,400 keypoints per image for a total of 1,488,065 keypoints. We use a GPU implementation [60] of the 3D SIFT- Rank keypoint algorithm [11] that produces identical keypoints at a 7× speedup. Approximate kNN correspondences between Appearance+Geometry Kernel appearance descriptors are identified across the entire database using efficient KD-tree indexing [22], where lookup requires 0.8 sec. / subject for k=200 nearest neighbors on an i7- [email protected] machine with 16 GB RAM (1.64 GB used). Given N=1010 images, there are a total of N(N − 1)/2 = 509545 pairwise relationships to be evaluated. Each sub- ject pair is assigned by one of five possible relationship labels L = {MZ,DZ,FS,HS,UR} for monozygotic twins (MZ), dizygotic twins (DZ), full non-twin siblings (FS), half- siblings (HS) and unrelated non-siblings (UR), for totals Fig. 3. Visualizing the keypoint correspondences driving classification of {134, 71, 607, 44, 508689} relationships per label. Family between a pair of identical twin brains, showing the 100 correspon- relationship labels are based on mother and father identity and dences with the highest SSE kernel values. The appearance kernel alone (upper, Equation (6)) may include geometrically inconsistent zygoticy (for twins), confirmed via genome-wide single nu- matches (diagonal lines), however these are down weighted when com- cleotide polymorphism (SNP) genotyping [61]. Note the sparse bined with the geometry kernel (lower, Equations (9) and (10)), leading structure of family relationship labels, where each sibling label to geometrically consistent matches (horizontal lines) and improved prediction performance. (MZ, DZ, FS, HS) is unique to one of 434 families and the vast majority of pairs are UR (0.998 = 508689/509545). The nominal probability of randomly guessing the correct family is approximately 0.0023 = 1/434, and the challenge is IV. EXPERIMENTS thus to efficiently and accurately identify the small number Experiments investigate the ability of the Jaccard distance to of family relationships. Note that evaluating the similarity predict pairwise relationships between whole-brain MRI scans, of N(N − 1)/2 pairs via brute force image matching or where relationships include 1) close genetic links between registration becomes computationally intractable for large N. siblings sharing 25-100% of their polymorphic genes and Image alignment is fundamental to the functioning of our 2) broad genetic links between non-siblings sharing nominal proposed geometry kernel, which measures zero-mean devia- genetic information due to sex and common racial ancestry. tions in keypoint location within a common spatial reference We hypothesize that closer genetic proximity will be reflected frame. We thus evaluated prediction following two different in higher pairwise similarity and thus lower Jaccard distance. subject-to-atlas registration methods: the original 3D rigid We expect that pairwise distance based on our proposed transform provided by the HCP mapping images to the stan- soft set equivalence (SSE) dJ (A, B) = − log JSSE(A, B) as dard MNI atlas using the FLIRT algorithm [62], [63], and defined in Equations (2) and (4) and our geometry kernel feature-based alignment (FBA) [11] using keypoint correspon- as defined by Equations (9) and (10) will lead to improved dences to estimate a 3D similarity transform mapping images identification of pairwise relationship labels. to a subject atlas, for 20 different randomly selected atlases. Prediction is based on pairwise Jaccard distance and a leave- All alignment solutions were very close to the reference MNI one-out protocol, analogous to querying a subject MRI in a alignment (rotation differed by 4.31◦ ± 2.84, translation by hospital PACS. Note that this is a deterministic procedure with 3mm ± 3.5), reflecting a degree of inter-subject variability no explicit training stage, where the primary hyperparameter given different atlases. Most notably, however, was that AUC is the number of nearest neighbors k per keypoint. Our values were virtually identical for all prediction trials, with computational workflow, as previously described, follows three a maximum standard deviation of  = 0.0017 (see Table I), steps 1) 3D SIFT-Rank keypoints are extracted from individual indicating that prediction was not sensitive to the alignment pre-processed images, 2) kNN correspondences are identified solution used. The results we report are thus consistent with between keypoints from all N images, and 3) Jaccard distances the reference MNI alignment provided with the HCP data. are evaluated between all image pairs from kernels and kNN correspondences. 6 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. XX, NO. XX, XXXX 2021

B. Close genetic proximity: Siblings pairs of unrelated individuals from different populations may Figure 4 shows distributions of pairwise Jaccard distance often exhibit higher genetic similarity than those from the conditioned on pairwise relationships. Our proposed kernel same population [64]. based on appearance and geometry (Figure 4, green) generally Age difference between subjects is a potential confounding increases separation between different sibling relationships in factor in whole brain Jaccard distance, and we plotted the comparison to appearance only (Figure 4, orange) as in [10]. variation of distance vs. age difference in Figure 7. Distance A two-tailed Kolmogorov-Smirnoff (KS) test shows all dis- distributions are virtually identical across the HCP subject tributions to be significantly different (p − value < 1e − 10) age range spanning 22-36 years of age, indicating that age except those of DZ and FS siblings sharing approximately 50% difference is not a major confound in this relatively young of their genes (p − value = 0.0199). Furthermore, pairwise and healthy HCP cohort where brain morphology is relatively KS statistics were all lower for appearance+geometry kernels stable. Note that the Jaccard distance was found to increase compared to appearance only, indicating that the geometry with age difference in older subjects due to natural aging and kernels lead to increased separation of relationship categories. neurodegenerative disease in [10]. The Jaccard distance may be rapidly evaluated, and pairs which are outliers from their expected distributions may be D. Group Prediction: Sex easily flagged and inspected for irregularities. For example, the While the pairwise Jaccard distance is highly informative highest Jaccard distance outliers indicated noticeable spatial regarding sibling relationships, it is insufficient for predicting misalignment between a small number of subject pairs (e.g. group labels such as sex, age or disease. A simple modification Figure 4a. The lowest distance outliers for UR pairs may can be used to predict group labels, by evaluating the distance indicate potential sibling relationships. The blue circles near between a single keypoint set A and supersets formed by the Figure 4b are pairs involving 3 subjects flagged as UR due to union of group members (excluding subject A), e.g. keypoints genotyping errors (including a MZ twin) [61]. Here, nearest for all Males dJ (A, M) and all Females dJ (A, F). Supersets neighbor Jaccard distance was used to correctly predict the M and F are solely composed of unrelated subjects to avoid self-reported families for all three cases, a result confirmed by potential biases due to family relationships. These distances the HCP. are combined in a basic linear classifier with a single threshold As sibling pairs exhibit significantly lower distance than parameter τ to adjust for differences in the numbers of UR pairs, we investigate the degree to which they can be keypoints per group based on the following equation: distinguished from unrelated pairs based on a simple distance ( F emale if d (A, F) − d (A, M) + τ > 0 threshold. Figure 5 shows the Receiver Operating Character- Class(A) = J J istic (ROC) curves for sibling relationships based on distance, Male otherwise. comparing our SSE kernels for appearance and geometry, ap- (11) pearance only and the binary HSE kernel. Table I quantifies the Figure 8 shows ROCs curves for sex prediction obtained improvement of SSE vs HSE, for various numbers of keypoint by varying τ over the range (−∞, ∞), comparing Jaccard nearest neighbors (20, 100, 200), where the highest area- distance computed with binary, appearance only and combined under-curve (AUC) values are obtained for SSE with k=200. appearance and geometry kernels. A possible confound is The performance of SSE kernels generally increases with the brain size, which is on average slightly larger for males number of NNs, and is always superior for combined ap- than females [65]. As the appearance kernel is invariant pearance and geometry kernels. Conversely, HSE classification to image scale, the AUC=0.93 reflects prediction accuracy performance decreases, even falling below the diagonal in the independently of image size. Our proposed combined approach most challenging case of HS pairs in Figure 5d, as additional achieves the highest AUC=0.97, again outperforming other hard correspondences appear to accumulate systematically on options. a subset of typical but unrelated subjects. For completeness, the highest UR prediction result was AUC= 0.951 for SSE, V. DISCUSSION appearance and geometry kernels, k=200. In this paper we propose a novel, highly specific distance measure between volumetric images represented as invariant C. Distant genetic proximity: unrelated subjects keypoints. Our measure generalizes the Jaccard index to ac- Unrelated (UR) subject pairs share nominal amounts of ge- count for soft equivalence between keypoints, and a novel ker- netic information, with subtle similarities due to demographic nel estimator is proposed to model keypoint geometry in terms factors such as common sex and ancestral race. We thus expect of location and scale within normalized image space. The soft whole-brain distance to be lowest for pairs of the same race Jaccard index is used as a distance measure to predict pairwise and sex (R,S), highest for different race and sex (R,S), and relationships between MRI brain scans of 1010 subjects from intermediate for either same race (R,S) or sex (R,S). While 434 families, including siblings and twins, and significantly the mean distances for conditional distributions in Figure 6 improves upon previous work based on binary equivalence generally increase with differences in demographic labels, and appearance descriptors [10], [12]. We report the first there are many exceptions where pairs with different labels results for predicting relationships from medical images, a new exhibit lower distance than those with the same labels. We task, where monozygotic twins can be identified with virtually note this is consistent with pairwise genetic differences, where perfect accuracy. A minor modification allows the Jaccard CHAUVIN et al.: EFFICIENT PAIRWISE NEUROIMAGE ANALYSIS USING THE SOFT JACCARD INDEX AND 3D KEYPOINT SETS 7

(a) Siblings (Relative Shared DNA %) Shared Siblings (Relative

(b)

-log JSSE (A,B)

Fig. 4. Distributions of Jaccard distances − log JSSE (A, B) conditioned on pairwise labels L = {MZ,DZ,FS,HS,UR}, comparing kernels with and without geometry (green vs. orange). Note that the distance generally increases with genetic separation, and that geometry kernels increase separation between sibling relationship labels. Outliers, i.e. pairs outside of their expected relationship distance distributions, can be easily identified and inspected for irregularity, e.g. (a) unusually high distances may indicate spatial misalignment (red circles) and (b) unusually low UR distances may indicate cases of incorrect family labels (blue circles).

Fig. 5. ROC curves for sibling identification based on Jaccard distance for (a) MZ, (b) DZ, (c) FS and (d) HS pairs, comparing three kernels for k=200 NNs. Note that the prediction AUC is always highest for our proposed kernel combining appearance and geometry (green), and decreases as expected with the degree genetic and developmental difference, e.g. in the order MZ, DZ, FS, HS. distance to predict group labels such as sex with high accuracy. trials involving various linear subject-to-atlas alignment so- Our geometry kernel requires spatial normalization, however lutions including robust keypoint-based alignment [11] show 8 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. XX, NO. XX, XXXX 2021

Kernel k = 20 k = 100 k = 200

Label HSE [9] SSEApp.[10] SSEApp. + Geo. HSE [9] SSEApp.[10] SSEApp. + Geo. HSE [9] SSEApp.[10] SSEApp. + Geo. Monozygotic 0.9983 0.9993 0.9996 ±  0.9544 0.9995 0.9998 ±  0.9272 0.9994 0.9999 ±  Dizygotic 0.8922 0.8825 0.9044 ±  0.8018 0.9034 0.9250 ±  0.7541 0.9181 0.9391 ±  Full-Sibling 0.8423 0.8433 0.8753 ±  0.7611 0.8569 0.8888 ±  0.7358 0.8680 0.8989 ± 

TABLE I AREA UNDERTHE CURVE (AUC) VALUES FOR SIBLING CLASSIFICATION, COMPARINGKERNELSANDNUMBERSOFNEARESTNEIGHBORS (K).  = 0.0017 ISTHEMAXIMUM AUC STANDARD DEVIATION OBSERVED FROM 21 DIFFERENT SUBJECT-TO-ATLAS REGISTRATION TRIALS.NOTE SSEApp+Geo KERNELS SIGNIFICANTLY OUTPERFORM SSEApp AND HSE FOR THE MORE DIFFICULT DZ AND FS PAIRS.

(a)

(b)

-log JSSE (A,B)

Fig. 6. Jaccard distance − log JSSE (A, B) distributions between unrelated pairs conditioned on shared demographic information. (a) same race, same sex (R, S); different race, same sex (R,¯ S); same race, different sex (R, S¯), and different race, different sex (R,¯ S¯) and (b) Female- Female (F,F ), Male-Male (M,M), Male-Female (M,F ). Image pairs corresponding to minimum and maximum distances for each distribution are shown for visualization. that the prediction performance to be insensitive to the specific distance may thus be used to rapidly validate relationship alignment solution used. The improved prediction afforded by labels, e.g. unexpectedly low distance may indicate related our geometry kernel thus suggests that family members tend individuals and unexpectedly high distance may indicate un- to align in a consistent manner regardless of the solution used. related individuals or spatially misaligned pairs. Previous work Our method promises to be a useful tool for curating large identified mislabelled scans of individuals in large neuroimage medical image datasets for precision medicine and research datasets [10], here our improved method allowed us predict the purposes. A memory-based model using efficient and robust self-reported families of three subjects labelled as unrelated algorithms for 3D keypoint extraction and indexing [11], [66] due to inconclusive genotyping [61], on the basis of nearest allows for fine-grained comparisons between O(N 2) image neighbor soft Jaccard distance. This exceptional result was pairs in O(N log N) computational complexity. The Jaccard confirmed by the Human Connectome Project, and serves as a CHAUVIN et al.: EFFICIENT PAIRWISE NEUROIMAGE ANALYSIS USING THE SOFT JACCARD INDEX AND 3D KEYPOINT SETS 9

9 9 domain-specific keypoint models [39], [42]. Auto-encoders used for anomaly detection could potentially be adapted to 8.5 8.5 generate subject-specific codes between pairs of images [67]. Analysis beyond pairs of subjects could be generalized via 8 8 graph theory to model the clique structure of families [68]. All code required to reproduce our results may be obtained at 7.5 7.5 https://github.com/3dsift-rank. ( A , B ) 7 7 J SSE

d REFERENCES 6.5 6.5 [1] M. A. Hamburg and F. S. Collins, “The path to personalized medicine,”

-log J 6 6 New England Journal of Medicine, vol. 363, no. 4, pp. 301–304, 2010. [2] G. J. Annas and S. Elias, “23andme and the fda,” New England Journal 5.5 5.5 of Medicine, vol. 370, no. 11, pp. 985–988, 2014. [3] A. Auton et al., “A global reference for human genetic variation,” 5 5 Nature, vol. 526, no. 7571, p. 68, 2015. 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 [4] D. Van Essen et al., “The Human Connectome Project: A data acquisi- tion perspective,” NeuroImage, vol. 62, no. 4, pp. 2222–2231, 2012. Δt (years) [5] S. Gerber, T. Tasdizen, P. Thomas Fletcher, S. Joshi, R. Whitaker, and Alzheimer’s Disease Neuroimaging Initiative (ADNI), “Manifold Fig. 7. Distributions of Jaccard distance − log JSSE (A, B) condi- modeling for brain population analysis.” Medical Image Analysis (MIA), tioned on age difference ∆t between unrelated pairs (A, B) condi- vol. 14, no. 5, pp. 643–53, 2010. tioned on age difference ∆t. The mean (solid red line) and standard [6] T. Brosch, P. Gmbh, I. Technologies, and R. Tam, “Manifold Learning deviation (dashed red lines) are plotted for each ∆t. Age difference has of Brain MRIs by Deep Learning Manifold Learning of Brain MRIs,” no significant impact on the distance for unrelated healthy young adult International Conference on Medical Image Computing and Computer- brains (age 22-36 years), eliminating a possible confounding factor. Assisted Intervention (MICCAI), vol. 16, no. September, 2013. [7] M. Ono, S. Kubik, and C. D. Abernathey, Atlas of the cerebral sulci. Thieme Medical Publishers, 1990. [8] M. Levandowsky and D. Winter, “Distance between sets,” Nature, vol. 234, no. 5323, p. 34, 1971. [9] M. Toews and W. Wells, “How are siblings similar? How similar are siblings? Large-scale imaging genetics using local image features,” in International Symposium on Biomedical Imaging (ISBI). IEEE, 4 2016, pp. 847–850. [10] L. Chauvin et al., “Neuroimage signature from salient keypoints is highly specific to individuals and shared by close relatives,” NeuroImage, vol. 204, no. 20, 2020. [11] M. Toews and W. Wells, “Efficient and robust model-to-image alignment using 3D scale-invariant features,” Medical Image Analysis (MIA), vol. 17, no. 3, pp. 271–282, 2013. [12] K. Kumar, M. Toews, L. Chauvin, O. Colliot, and C. Desrosiers, “Multi- modal brain fingerprinting: a manifold approximation based framework,” NeuroImage, 2018. [13] D. S. Marcus, T. H. Wang, J. Parker, J. G. Csernansky, J. C. Morris, and R. L. Buckner, “Open Access Series of Imaging Studies (OASIS): cross- sectional MRI data in young, middle aged, nondemented, and demented older adults,” Journal of cognitive neuroscience, vol. 19, no. 9, pp. 1498– 1507, 2007. [14] C. R. Jack Jr et al., “The Alzheimer’s disease neuroimaging initiative Fig. 8. ROC curves and AUC for sex prediction based on individual- (ADNI): MRI methods,” Journal of the International Society for Mag- to-group Jaccard distance, comparing binary, appearance only, and netic Resonance in Medicine (ISMRM), vol. 27, no. 4, pp. 685–691, combined appearance and geometry kernels. 2008. [15] D. C. Van Essen et al., “The WU-Minn human connectome project: an overview,” NeuroImage, vol. 80, pp. 62–79, 2013. concrete example as to how our method can be used to validate [16] L. Chauvin, K. Kumar, C. Desrosiers, J. De Guise, W. Wells, and M. Toews, “Analyzing Brain Morphology on the Bag-of-Features Mani- labels associated with medical imaging data. fold,” in International Conference on Information Processing in Medical The task of pairwise relationship prediction may be viewed Imaging (IPMI), vol. 11492 LNCS. Springer Verlag, 2019, pp. 45–56. as the finest grain of categorization, where the number of [17] M. R. Sabuncu et al., “Morphometricity as a measure of the neu- roanatomical signature of a trait,” Proceedings of the National Academy unique pairwise relationship labels is linear O(N) in the of Sciences (PNAS), vol. 113, no. 39, pp. E5749–E5756, 2016. number of data N, e.g. 434 families from N=1010 images, [18] O. Boiman, E. Shechtman, and M. Irani, “In defense of nearest-neighbor as opposed to typical classification where all data are as- based image classification,” in Proceedings of the IEEE Conference on and Pattern Recognition (CVPR). IEEE, 2008, pp. sociated with a small number of labels (e.g. male, female). 1–8. Pairwise prediction thus represents a challenge for ubiquitous [19] T. Cover and P. Hart, “Nearest neighbor pattern classification,” IEEE deep neural network methods, which generally require large Transactions on Information Theory (TIT), vol. 13, no. 1, pp. 21–27, 1967. numbers of training data per category, and training for specific [20] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” modalities and body parts. In contrast, generic 3D SIFT International Journal of Computer Vision (IJCV), vol. 60, no. 2, pp. 91– keypoints may be used as-is with arbitrary imaging modalities 110, 2004. [21] T. Lindeberg, “Feature Detection with Automatic Scale Selection,” and contexts with no training. In future work, generic key- International Journal of Computer Vision (IJCV), vol. 30, no. 2, pp. point correspondences here could potentially be used to train 79 – 116, 1998. 10 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. XX, NO. XX, XXXX 2021

[22] M. Muja and D. Lowe, “Scalable nearest neighbor algorithms for high in Proceedings of the IEEE Conference on Computer Vision and Pattern dimensional data,” IEEE Transactions on Pattern Analysis and Machine Recognition (CVPR), 2017. Intelligence (TPAMI), pp. 2227–2240, 2014. [46] J. L. Schonberger,¨ H. Hardmeier, T. Sattler, and M. Pollefeys, “Com- [23] M. Toews, C. Wachinger, R. S. J. Estepar, and W. Wells, “A Feature- parative Evaluation of Hand-Crafted and Learned Local Features,” in Based Approach to Big Data Analysis of Medical Images.” International Proceedings of the IEEE Conference on Computer Vision and Pattern Conference on Information Processing in Medical Imaging (IPMI), Recognition (CVPR), 2017. vol. 24, pp. 339–350, 2015. [47] R. Geirhos, C. Michaelis, F. A. Wichmann, P. Rubisch, M. Bethge, and [24] E. Marczewski and H. Steinhaus, “On a certain distance of sets and W. Brendel, “Imagenet-trained CNNs are biased towards texture; in- the corresponding distance of functions,” in Colloquium Mathematicum, creasing shape bias improves accuracy and robustness,” in International vol. 6, no. 1. Institute of Mathematics Polish Academy of Sciences, Conference on Learning Representations (ICLR), 2019. 1958, pp. 319–327. [48] C. Wachinger et al., “BrainPrint: A discriminative characterization of [25] D. Bajusz, A. Racz,´ and K. Heberger,´ “Why is Tanimoto index an brain morphology,” NeuroImage, vol. 109, pp. 232–248, 2015. appropriate choice for fingerprint-based similarity calculations?” Journal [49] S. A. Valizadeh, F. Liem, S. Merillat,´ J. Hanggi,¨ and L. Jancke,¨ “Iden- of Cheminformatics, vol. 7, no. 1, 2015. tification of individual subjects on the basis of their brain anatomical [26] D. J. Rogers and T. T. Tanimoto, “A computer program for classifying features,” Scientific reports, vol. 8, no. 1, p. 5611, 2018. plants,” Science, vol. 132, no. 3434, pp. 1115–1118, 1960. [50] E. S. Finn et al., “Functional connectome fingerprinting: identifying [27] Y. Yuan, M. Chao, and Y. C. Lo, “Automatic Skin Lesion Segmentation individuals using patterns of brain connectivity.” Nature neuroscience, Using Deep Fully Convolutional Networks with Jaccard Distance,” IEEE vol. 18, no. 11, pp. 1664–71, 2015. Transactions on Medical Imaging (TMI), vol. 36, no. 9, 2017. [51] K. Kumar, C. Desrosiers, K. Siddiqi, O. Colliot, and M. Toews, [28] D. Molodtsov, “Soft set theory—first results,” Computers & Mathematics “Fiberprint: A subject fingerprint based on sparse code pooling for white with Applications, vol. 37, no. 4-5, pp. 19–31, 1999. matter fiber analysis,” NeuroImage, vol. 158, pp. 242–259, 2017. [29] J. H. Park, O. H. Kim, and Y. C. Kwun, “Some properties of equivalence [52] P. M. Thompson et al., “Genetic influences on brain structure,” Nature soft set relations,” Computers & Mathematics with Applications, vol. 63, neuroscience, vol. 4, no. 12, p. 1253, 2001. no. 6, pp. 1079–1088, 2012. [53] S. J. van der Lee et al., “Gray matter heritability in family-based [30] A. Gardner, J. Kanno, C. A. Duncan, and R. Selmic, “Measuring and population-based studies using voxel-based morphometry,” Human Distance Between Unordered Sets of Different Sizes,” in Proceedings Brain Mapping (HBM), vol. 38, no. 5, 2017. of the IEEE Conference on Computer Vision and Pattern Recognition [54] G. L. Colclough et al., “The heritability of multi-modal connectivity in (CVPR), 2014, pp. 137–143. human brain activity.” eLife, vol. 6, p. e20178, 2017. [31] O. Chum, J. Philbin, and A. Zisserman, “Near Duplicate Image De- [55] M. R. Kiley and M. S. Hossain, “Who are my family members? a tection: min-Hash and tf-idf Weighting,” in The British Machine Vision solution based on image processing and machine learning,” International Conference (BMVC), 2008. Journal of Image and Graphics, vol. 20, no. 04, p. 2050033, 2020. [32] I. Machado et al., “Non-rigid registration of 3D ultrasound for neuro- [56] Q. Dai, P. Carr, L. Sigal, and D. Hoiem, “Family member identification 2015 IEEE Winter Conference on Applica- surgery using automatic feature detection and matching,” International from photo collections,” in tions of Computer Vision Journal of Computer Assisted Radiology and Surgery (IJCARS), vol. 13, . IEEE, 2015, pp. 982–989. no. 10, pp. 1525–1538, 2018. [57] B. Ahmad, M. Usama, J. Lu, W. Xiao, J. Wan, and J. Yang, “Deep con- volutional neural network using triplet loss to distinguish the identical [33] C. Wachinger, M. Toews, G. Langs, W. Wells, and P. Golland, “Key- twins,” in 2019 IEEE Globecom Workshops (GC Wkshps). IEEE, 2019, point transfer for fast whole-body segmentation,” IEEE Transactions on pp. 1–6. Medical Imaging (TMI), 2018. [58] F. Juefei-Xu and M. Savvides, “An augmented linear discriminant [34] M. Toews, W. Wells, D. L. Collins, and T. Arbel, “Feature-based mor- analysis approach for identifying identical twins with the aid of fa- phometry: Discovering group-related anatomical patterns,” NeuroImage, cial asymmetry features,” in Proceedings of the IEEE Conference on vol. 49, no. 3, pp. 2318–2327, 2010. Computer Vision and Pattern Recognition (CVPR), 2013. [35] M. Toews, W. M. Wells, and L. Zollei,¨ “A feature-based developmental [59] J. R. Paone et al., “Double trouble: Differentiating identical twins by face International Conference model of the infant brain in structural mri,” in recognition,” IEEE Transactions on Information Forensics and Security on Medical Image Computing and Computer-Assisted Intervention . (TIFS), vol. 9, no. 2, 2014. Springer, 2012, pp. 204–211. [60] E.´ Pepin, J.-B. Carluer, L. Chauvin, M. Toews, and R. Harmouche, [36] M. Toews and W. Wells, “Sift-rank: Ordinal description for invariant “Large-scale unbiased neuroimage indexing via 3d gpu-sift filtering and feature correspondence,” in Proceedings of the IEEE Conference on keypoint masking,” in Machine Learning in Clinical Neuroimaging and Computer Vision and Pattern Recognition (CVPR), 2009, pp. 172–177. Radiogenomics in Neuro-oncology. Springer, 2020, pp. 108–118. [37] D. Mukherjee, Q. M. Jonathan Wu, and G. Wang, “A comparative [61] H. WU-Minn, “1200 subjects data release reference manual,” 2017. experimental study of image feature detectors and descriptors,” Machine [Online]. Available: https://www.humanconnectome.org Vision and Applications, vol. 26, no. 4, pp. 443–466, 2015. [62] M. F. Glasser et al., “The minimal preprocessing pipelines for the [38] F. Bellavia and C. Colombo, “Is there anything new to say about sift Human Connectome Project,” NeuroImage, vol. 80, pp. 105–124, 2013. matching?” International Journal of Computer Vision, vol. 128, no. 7, [63] M. Jenkinson, P. Bannister, M. Brady, and S. Smith, “Improved Opti- 2020. mization for the Robust and Accurate Linear Registration and Motion [39] K. M. Yi, E. Trulls, V. Lepetit, and P. Fua, “LIFT: Learned invariant Correction of Brain Images,” NeuroImage, pp. 825–841, 2002. feature transform,” in Proceedings of the European Conference on [64] D. J. Witherspoon et al., “Genetic similarities within and between human Computer Vision (ECCV), vol. 9910 LNCS. Springer Verlag, 2016, populations,” Genetics, vol. 176, no. 1, pp. 351–359, 2007. pp. 467–483. [65] L. Eliot, A. Ahmed, H. Khan, and J. Patel, “Dump the “dimorphism”: [40] M. J. Tyszkiewicz, P. Fua, and E. Trulls, “DISK: Learning local features Comprehensive synthesis of human brain studies reveals few male- with policy gradient,” in Advances in Neural Information Processing female differences beyond size,” Neuroscience & Biobehavioral Re- Systems (NeurIPS), 2020. views, 2021. [41] Y. Ono, E. Trulls, P. Fua, and K. Moo Yi, “LF-Net: Learning Local [66] M. Toews, L. Zollei,¨ and W. Wells, “Feature-based alignment of volu- Features from Images,” in Advances in Neural Information Processing metric multi-modal images,” in International Conference on Information Systems (NeurIPS), 2018. Processing in Medical Imaging (IPMI), vol. 23. Springer, 2013, pp. [42] D. Detone, T. Malisiewicz, and A. Rabinovich, “SuperPoint: Self- 25–36. Supervised Interest Point Detection and Description,” in Proceedings [67] C. Baur, S. Denner, B. Wiestler, N. Navab, and S. Albarqouni, “Autoen- of the IEEE Conference on Computer Vision and Pattern Recognition coders for unsupervised anomaly segmentation in brain MR images: A (CVPR) Workshops, 2018. comparative study,” Medical Image Analysis (MIA), vol. 69, 2021. [43] J. Dong and S. Soatto, “Domain-size pooling in local descriptors: DSP- [68] J. R. Blair and B. Peyton, “An introduction to chordal graphs and clique SIFT,” Proceedings of the IEEE Conference on Computer Vision and trees,” in Graph theory and sparse matrix computation. Springer, 1993, Pattern Recognition (CVPR), 2015. pp. 1–29. [44] R. Arandjelovic´ and A. Zisserman, “Three things everyone should know to improve object retrieval,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012. [45] V. Balntas, K. Lenc, A. Vedaldi, and K. Mikolajczyk, “HPatches: A benchmark and evaluation of handcrafted and learned local descriptors,”