<<

Preprint (accepted) / Advances in Cognitive Systems (2018) CoDesign Lab EU. www.codesign-lab.org

SEMANTIC ANALYSIS OF (REFLECTIONAL) VISUAL SYMMETRY

A Human-Centred Computational Model for Declarative Explainability

JAKOB SUCHAN [email protected]
MEHUL BHATT
SRIKRISHNA VARADARAJAN
CoDesign Lab EU / Cognitive Vision
www.codesign-lab.org / www.cognitive-vision.org
Örebro University, Sweden — University of Bremen, Germany

SEYED ALI AMIRSHAHI Norwegian Colour and Visual Computing Laboratory Norwegian University of Science and Technology, Norway

STELLA YU ICSI Vision Group, UC Berkeley, United States

ABSTRACT   We present a computational model for the semantic interpretation of symmetry in naturalistic scenes. Key features include a human-centred representation, and a declarative, explainable interpretation model supporting deep semantic question-answering founded on an integration of methods in knowledge representation and deep learning based computer vision. In the backdrop of the visual arts, we showcase the framework’s capability to generate human-centred, queryable, relational structures, also evaluating the framework with an empirical study on the human perception of visual symmetry. Our framework represents and is driven by the application of foundational, integrated Vision and Knowledge Representation and Reasoning methods for applications in the arts, and the psychological and social sciences.

arXiv:1806.07376v2 [cs.CV] 14 Sep 2018

1. Introduction

Visual symmetry as an aesthetic and stylistic device has been employed by artists across a spectrum of creative endeavours concerned with visual imagery in some form, e.g., painting, photography, architecture, film and media design. Symmetry in (visual) art and beyond is often linked with elegance and beauty, and is associated with attributes such as being well-proportioned and well-balanced (Weyl, 1952). Closer to the “visual imagery” and “aesthetics” centred scope of this paper, symmetry has been employed by visual artists going back to the masters Giorgione, Titian, Raphael, da Vinci, and continuing to modern times with Dali and other contemporary artists (Figure 1).



Figure 1: The perception of symmetry. (a) Symmetry perception influenced by visual features, conceptual categories, semantic layering, and nuances of individual differences in perception, and (b) examples of symmetry in the visual arts: “Delivery of the Keys” (ca. 1481) by Perugino, “The Last Supper” (1495–98) by Leonardo Da Vinci, “View of the grand staircase at La Rinascente in Rome, designed by Franco Albini and Franca Helg” (1962) by Giorgio Casali, and “The Matrix” (1999) by The Wachowski Brothers.

Visual Symmetry: Perception and Semantic Interpretation   There exist at least four closely related points of view pertaining to symmetry, namely, the physical, mathematical, psychological, and aesthetical points of view (Molnar & Molnar, 1986). As Molnar & Molnar (1986) articulate:

“But perceptual symmetry is not always identical to the symmetry defined by the mathematicians. A symmetrical picture is not necessarily symmetrical in the mathematical sense... Since the aesthetical point of view is strictly linked to the perceptive system, in examining the problems of aesthetics we find ourselves dealing with two distinct groups of problems: (1) the problem of the perception of symmetry; (2) the aesthetical effect of the perception of a symmetrical pattern.”

Indeed, the high-level semantic interpretation of symmetry in naturalistic visual stimuli by humans is a multi-layered perceptual phenomenon operating at several interconnected cognitive levels involving, e.g., spatial organisation, visual features, semantic layers, and individual differences (Section 2.1; and Figure 1a). Consider the select examples from movie scenes in Figure 2:

• in the shot from “2001: A Space Odyssey” (Figure 2a) a centre-perspective is applied for staging the scene. The symmetry here results from this perspective, as well as from the layout of the room, the placement of the furniture, and the decoration of the room. In particular, the black obelisk in the centre of the frame emphasises the centre-perspective regularly used by Kubrick, with the bed (and person) being positioned directly on the central axis.



Figure 2: Symmetrical structure in visual arts. Select scenes from films: (a) “2001: A Space Odyssey” (1968) by Stanley Kubrick, (b) “The Royal Tenenbaums” (2001) by Wes Anderson, and (c) “The Big Lebowski” (1998) by Joel and Ethan Coen.

• Wes Anderson stages his shot from “The Royal Tenenbaums” (Figure 2b) around a central point, but unlike Kubrick’s shot, Anderson focuses on the people involved in it. Even though the visual appearance of the characters differs considerably, the spatial arrangement and the semantic similarity of the objects in the shot create symmetry. Furthermore, the gazing direction of the characters, i.e., people on the right facing left and people on the left facing right, adds to the symmetrical appearance of the shot.

• In “The Big Lebowski” (Figure 2c), Joel and Ethan Coen use symmetry to highlight the surreal character of a dream sequence; the shot in Figure 2c uses radial symmetry composed of a group of dancers, shot from above, moving around the centre of the frame in a circular motion. This is characterised by entities moving along a circular path around a centre point, and by the perceptual similarity in the appearance of the dancers.

The development of computational cognitive models focussing on a human-centred –semantic, explainable– interpretation of visuo-spatial symmetry presents a formidable research challenge demanding an interdisciplinary —mixed-methods— approach at the interface of cognitive science, vision & AI, and visual perception focussed human-behavioural research. Broadly, our research is driven by addressing this interdisciplinarity, with an emphasis on developing integrated KR-and-vision foundations for applications in the psychological and social sciences, e.g., archival, automatic annotation and pre-processing for qualitative analysis, and studies in visual perception.

Key Contributions   The core focus of the paper is to present a computational model with the capability to generate semantic, explainable interpretation models for the analysis of visuo-spatial symmetry. The explainability is founded on a domain-independent, mixed qualitative-quantitative representation of visuo-spatial relations based on which the symmetry is declaratively characterised. We also report on a qualitative evaluation with human subjects, whereby subjects rank their subjective perception of visual symmetry for a set of stimuli using (qualitative) distinctions. The broader implications are two-fold: (1) the paper demonstrates the integration of vision and semantics, i.e., knowledge representation and reasoning methods with low-level (deep learning based) visual processing methods; and (2) from an applied viewpoint, the developed methodology can serve as the technical backbone for assistive and analytical technologies for visual media studies, e.g., from the viewpoint of psychology, aesthetics, and cultural heritage.


[Figure 3 overview: image elements obtained via region proposals, parallel object detection, and pose estimation; CNN-based image features; object semantics via ImageNet classification; spatial configuration; and cosine similarity of features.]
Figure 3: A computational model of multi-level semantic symmetry.

2. The Semantics of Symmetry

Symmetry in visual imagery denotes that an image is invariant to certain types of transformation; e.g., reflectional symmetry is the case where the image does not change when it is mirrored along a specific symmetry axis. Besides reflectional symmetry, there are various other types of symmetry, including rotational symmetry and translational symmetry. Perfect symmetry can easily be detected based on image-level features, by comparing pixels in the image; however, in natural images, e.g., those coming from the visual arts, perfect symmetry is very rare, and mostly variations of symmetry are used as a stylistic device, with symmetry being present only in some parts of the image. To address this, we focus on developing a semantic model capable of interpreting symmetrical structures in images.
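For instance, for an image I and a vertical symmetry axis at x = x_s, perfect reflectional symmetry amounts to the pointwise condition (a standard formulation, stated here only for illustration):

\[ I(x_s - \Delta x,\, y) \;=\; I(x_s + \Delta x,\, y) \qquad \text{for all } \Delta x \text{ and } y, \]

which is exactly the pixel-level comparison that rarely holds exactly in naturalistic images, and which motivates the semantic model developed below.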

2.1 A Multi-Level Semantic Characterisation

From the viewpoint of perceptual and aesthetic considerations, key aspects for interpreting visuo-spatial symmetry (in scope of the approach of this paper) include (S1–S4; Figure 1):

(S1) Spatial organisation: High-level conceptual categories identifiable from geometric constructions by way of arbitrary shapes, relative orientation and placement, size of geometric entities, relative distance, and depth;

(S2) Visual features: Low-level visual features and artefacts emanating directly from color, texture, light, and shadow;

(S3) Semantic layers: Semantic-spatial layering and grouping based on natural scene characteristics involving, for instance, establishing foreground-background, clustering based on conceptual similarity, relative distance, and perceived depth, and application of commonsense knowledge possibly not directly available in the stimulus;

(S4) Individual differences: Grounding of the visual features in the socio-cultural semiotic landscape of the perceiver (i.e., contextual and individualised nuances in perception and sensemaking).


We develop a multi-level characterisation of symmetry aimed at analysing (reflectional) symmetry. Visual symmetry —in this paper— encompasses three layers (L1–L3; Figure 3):

L1. Symmetrical (spatial) composition: Spatial arrangement of objects in the scene with respect to a structural representation of the symmetry, wrt. position, size, orientation, etc.;

L2. Perceptual similarity: Perceptual similarity of features in symmetrical image patches, based on the low-level feature based appearance of objects, e.g., colour, shape, patterns, etc.;

L3. Semantic similarity: Similarity of semantic categories of the objects in symmetrical image patches, e.g., people, object types, and properties of these objects, such as people’s gazing direction, foreground / background, etc.

The proposed characterisation serves as the foundation for analysing and interpreting symmetrical structures in images; in particular, it can be used to identify the elements of the image supporting the symmetrical structure, but also those parts of the image that are not in line with the symmetry, e.g., elements breaking the symmetry. This may be used for investigating the use of balance and imbalance in the visual arts, and for analysing how this can be used to guide people’s attention in the context of visual saliency.

2.2 A Model of Reflectional Symmetry

For the computational model presented in this paper (Figure 3), we focus on reflectional symmetry in the composition of the image based on layers L1–L3 (Section 2.1), i.e., we investigate image properties based on spatial configuration, low-level feature similarity, and semantic similarity. Towards this we extract the image elements E = E1 ∪ E2 ∪ E3 = {e0, ..., en} of the image:

(E1) Image patches are extracted using selective search as described in Uijlings et al. (2013), resulting in structural parts of the image, potential objects, and object parts;

(E2) People and objects are detected in the image using YOLO object detection (Redmon et al., 2016);

(E3) Human body pose consisting of body joints and facing direction is extracted using human pose estimation (Cao et al., 2017).

Potential symmetrical structures in the image are defined on the image elements E using a model of symmetry to identify pairs of image elements (symmetry pairs) as well as single elements that constitute a symmetrical configuration. We consider the compositional structure (C1) of images, and the similarity (C2) of constituent elements, in particular perceptual similarity in the low-level features, and semantic similarity of objects and regions. The resulting model of symmetrical structure in the image consists of a set of image elements, and the pair-wise similarity relations between the elements.

(C1) Compositional Structure

Symmetrical composition in the case of reflectional symmetry consists of symmetrically arranged pairs of image elements, where one element is on the left and one is on the right of the symmetry axis, and single centred image elements, which are placed on the symmetry axis. To model this, we represent the extracted image elements as spatial entities, i.e., points, axis-aligned rectangles, and line-segments, and define constraints on the spatial configuration of the image elements, using the following spatial properties of the spatial entities:

• position: the centre-point of a rectangle or the position of a point in x, y coordinates;
• size: the width and height of a rectangle, w, h;
• aspect ratio: the ratio r between width and height of a rectangle;
• distance: the Euclidean distance d between two points p and q;
• rotation: the yaw, pitch, and roll angles between two line-segments in 3D space.
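For illustration only, such spatial entities and properties could be encoded as Prolog facts and arithmetic rules along the following lines; the fact patch/2 mirrors the element representation used later in the paper, while centre/2, aspect_ratio/2, and distance/3 are illustrative helper predicates, not necessarily the framework's own encoding:

% An image patch as an axis-aligned rectangle: top-left point, width, height.
patch(id(0), rectangle(point(233, 53), 107, 466)).

% Centre point of a rectangle.
centre(rectangle(point(X, Y), W, H), point(CX, CY)) :-
    CX is X + W / 2,
    CY is Y + H / 2.

% Aspect ratio r between width and height.
aspect_ratio(rectangle(_, W, H), R) :-
    R is W / H.

% Euclidean distance between two points.
distance(point(X1, Y1), point(X2, Y2), D) :-
    D is sqrt((X1 - X2)**2 + (Y1 - Y2)**2).

For example, ?- patch(id(0), Rect), centre(Rect, C), aspect_ratio(Rect, R). yields C = point(286.5, 286) and R ≈ 0.23 for the patch above.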

Symmetrical Spatial Configuration   We use a set of spatial relations holding between the image elements to express their spatial configuration; spatial relations (e.g., left, right, and on)1 holding between points and lines describe the relative orientation of image elements with respect to the symmetry axis. Towards this, we use the relative position (rel-pos) of an image element with respect to the symmetry axis, which is defined as the distance to the symmetry axis and the y coordinate of the element.

— Image Patches and Objects   Symmetrical configuration of image elements is defined based on their spatial properties using the following two rules. In the case of a single element e, the centre of the rectangle has to be on the symmetry axis.

symmetrical(e) ⊃ orientation(on, position(e), symmetry-axis). (1)

In the case of pairs of elements ei and ej, these have to be on opposite sides of the symmetry axis and have the same size and aspect ratio; further, the position of ei and ej has to be reflected.

symmetrical(pi, pj) ⊃ orientation(left, position(pi), symmetry-axis) ∧
                       orientation(right, position(pj), symmetry-axis) ∧
                       equal(aspect-ratio(pi), aspect-ratio(pj)) ∧
                       equal(size(pi), size(pj)) ∧ equal(rel-pos(pi), rel-pos(pj)).   (2)
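As a sketch of how rules (1) and (2) could be encoded declaratively, the clauses below follow the structure of the formulas; in practice, exact equality of size, aspect ratio, and reflected position is relaxed using the divergence measures introduced further below, and the auxiliary predicates position/2, orientation/3, size/2, aspect_ratio/2, and rel_pos/2 (computed from the extracted elements) are assumed rather than shown:

% Rule (1), sketched: a single element is symmetrically placed if its
% centre lies on the symmetry axis.
symmetrical(E) :-
    position(E, P),
    orientation(on, P, symmetry_axis).

% Rule (2), sketched: a pair of elements is symmetrically placed if they lie
% on opposite sides of the axis and agree on size, aspect ratio, and
% reflected position relative to the axis.
symmetrical(Pi, Pj) :-
    position(Pi, PosI), position(Pj, PosJ),
    orientation(left,  PosI, symmetry_axis),
    orientation(right, PosJ, symmetry_axis),
    size(Pi, S), size(Pj, S),
    aspect_ratio(Pi, R), aspect_ratio(Pj, R),
    rel_pos(Pi, RP), rel_pos(Pj, RP).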

The model of symmetry serves as a basis for analysing symmetrical structures and defines the attributes that constitute a symmetrical configuration. In addition to this basic definition of symmetrical configuration of arbitrary image elements, we define rules for symmetry in the placement and layout of humans in the images.

1. The semantics of spatial relations is based on specialised polynomial encoding as suggested in Bhatt et al. (2011) within constraint logic programming (CLP) (Jaffar & Maher, 1994); CLP is also the framework used to demonstrate Q/A later in this section.



Figure 4: Symmetric composition for pairs of image patches, and centering of single image patches.

— Human Body Pose   is given by a set of joints j, represented as points, i.e., pose = {j0, ..., jn}. The pose can either be symmetrical within itself, or two people can be arranged in a symmetrical way. Symmetrical body pose is analysed by defining joint pairs JP = {(jk, jl), ..., (jm, jn)}, such as (left shoulder, right shoulder), (left elbow, right elbow), etc., and comparing the relative position of these pairs with respect to the centre of the person cp.

symmetrical(pose(p)) ⊃ ∀(jk, jl) ∈ JP. equal(rel-pos(jk, cp), rel-pos(jl, cp)).   (3)

Accordingly, the pose of two persons is analysed by defining joint pairs associating each joint of one person with the corresponding joint of the other person, e.g., the left hand of person 1 is associated with the right hand of person 2. Further, we define symmetrical facing directions of two people based on the rotation of their heads. Towards this we use the yaw, pitch, and roll angles of a person’s head hp, relative to a front-facing head, and define that the facing direction is symmetrical if the pitch rotations are the same, and the yaw and roll rotations are opposite.

symmetrical(facing-dir(p1), facing-dir(p2)) ⊃ equal(pitch(hp1), pitch(hp2)) ∧ equal(yaw(hp1), −yaw(hp2)) ∧ equal(roll(hp1), −roll(hp2)).   (4)
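A minimal sketch of the joint-pair check behind rule (3), assuming illustrative auxiliary predicates joint_pair/2, person_centre/2, and rel_pos/4 over the extracted pose data (the names are not the framework's own):

% Joint pairs used for the within-person symmetry check (illustrative subset).
joint_pair(left_shoulder, right_shoulder).
joint_pair(left_elbow,    right_elbow).
joint_pair(left_hand,     right_hand).

% Rule (3), sketched: a person's pose is symmetrical if every joint pair has
% the same relative position (distance to axis, y coordinate) with respect
% to the person's centre.
symmetrical_pose(Person) :-
    person_centre(Person, Centre),
    forall(joint_pair(Jk, Jl),
           ( rel_pos(Person, Jk, Centre, RelPos),
             rel_pos(Person, Jl, Centre, RelPos) )).

Rule (4) follows the same pattern for head rotation, comparing the pitch angles directly and the yaw and roll angles with opposite sign.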

Divergence from Symmetrical Configuration   To account for configurations that are only symmetrical in some aspects, as typically occurs in naturalistic scenes, we calculate the divergence of the configuration from the symmetry model. For each element of the symmetry structure we calculate the divergence from the defined symmetry model, i.e., we focus on divergence with respect to position, size, aspect ratio, and pose (involving the configuration of body parts and joints). We use thresholds on the average of these values to identify hypotheses on (a)symmetrical structures.
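To make this concrete, a sketch of the divergence computation for a candidate pair is given below; the attributes follow the description above, while the predicate names (size/3, aspect_ratio/2, rel_pos/3) and the threshold value are illustrative assumptions:

% Divergence of a candidate pair from perfect reflectional symmetry, as
% absolute differences in width, height, aspect ratio, and reflected position.
divergence(Pi, Pj, div(DW, DH, DAR, DPos)) :-
    size(Pi, WI, HI), size(Pj, WJ, HJ),
    aspect_ratio(Pi, RI), aspect_ratio(Pj, RJ),
    rel_pos(Pi, XI, YI), rel_pos(Pj, XJ, YJ),
    DW   is abs(WI - WJ),
    DH   is abs(HI - HJ),
    DAR  is abs(RI - RJ),
    DPos is sqrt((XI - XJ)**2 + (YI - YJ)**2).

% A pair is hypothesised as symmetrical by thresholding the average divergence.
symmetry_hypothesis(Pi, Pj) :-
    divergence(Pi, Pj, div(DW, DH, DAR, DPos)),
    Mean is (DW + DH + DAR + DPos) / 4,
    Mean =< 10.0.   % illustrative threshold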


Predicate                                      Description

symmetrical_element(E)                         Symmetrical elements E.
non_symmetrical_element(E)                     Non-symmetrical elements E.
symmetrical_objects(SO)                        Symmetrical objects SO.
non_symmetrical_objects(NSO)                   Non-symmetrical objects NSO.
symmetrical_body_pose(SP, SBP)                 Symmetrical person SP (pair or single object), and symmetrical parts of the body pose SBP.
non_symmetrical_body_pose(SP, NSP)             Symmetrical person SP (pair or single object), and non-symmetrical parts of the body pose NSP.
symmetry_stats(NP, NSP, MD, MS)                Basic statistics on the symmetrical structure: number of patches NP, number of symmetrical patches NSP, mean divergence MD, and mean similarity MS.
symmetrical_object_stats(NO, NSO, MD, MS)      Statistics on the symmetrical structure of objects: number of objects NO, number of symmetrical objects NSO, mean divergence MD, and mean similarity MS.

Table 1: Sample predicates for querying the interpretation model.

(C2) Similarity Measures

Visual symmetry is also based on the similarity of image features; we assess the similarity of image patches using CNN features, e.g., obtained from AlexNet (Krizhevsky et al., 2012) or ResNet (He et al., 2016) models pre-trained on the ImageNet dataset (Deng et al., 2009), i.e., we use the extracted features to evaluate perceptual similarity and use ImageNet classifications to evaluate semantic similarity of image patches.

Perceptual Similarity   Visual symmetry is based on the perceptual similarity of image features, which denotes the similarity in the visual appearance of the image patches. To analyse the perceptual similarity of image patches we use the feature vectors obtained from the network and use cosine similarity to evaluate the similarity of the feature vectors of two image patches. For the case of reflectional symmetry we compare the image patches of all potential symmetry pairs by comparing the features of one patch to the features of the mirrored second patch.

Semantic Similarity   On the semantic level, we classify the image patches and compare their content for semantic similarities, i.e., we compare the conceptual similarity of the predicted categories. Towards this we use the weighted ImageNet classifications for each image patch together with WordNet (Miller, 1995), which is used as an underlying structure in ImageNet, to estimate the conceptual similarity of the object classes predicted for the image patches in each symmetry pair. In particular, we use the top five predictions from the AlexNet classifiers and estimate the similarity of each pair by calculating the weighted sum of the similarity values for each pair of predicted object categories.
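For illustration, the cosine similarity underlying the perceptual-similarity measure can be written directly over feature vectors represented as Prolog lists; the CNN feature extraction itself happens outside the logic program, and this sketch only shows the comparison step:

% Cosine similarity between two feature vectors given as lists of numbers.
dot_product([], [], 0).
dot_product([X|Xs], [Y|Ys], D) :-
    dot_product(Xs, Ys, D0),
    D is D0 + X * Y.

vector_norm(V, N) :-
    dot_product(V, V, Sq),
    N is sqrt(Sq).

cosine_similarity(V1, V2, Sim) :-
    dot_product(V1, V2, Dot),
    vector_norm(V1, N1),
    vector_norm(V2, N2),
    Sim is Dot / (N1 * N2).

For example, ?- cosine_similarity([1.0, 0.5, 0.0], [0.8, 0.4, 0.1], S). yields S ≈ 0.99 for two nearly parallel vectors; values of this kind populate the similarity(perceptual, pair(...), ...) facts shown in Table 2.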

2.3 Declarative Symmetry Semantics

The semantic structure of symmetry is described by the model in terms of a set of symmetry pairs and their respective similarity values with respect to the three layers of our model, i.e., for each symmetry pair it provides the similarity measures based on semantic similarity, spatial arrangement, and low-level perceptual similarity (Table 2).


This results in a declarative model of symmetrical structure, which is used for fine-grained analysis of symmetry features and question-answering about symmetrical configurations in images, i.e., using our framework, it is possible to define high-level rules and execute queries in (constraint) logic programming (Jaffar & Maher, 1994) (e.g., using SWI-Prolog (Wielemaker et al., 2012)) to reason about symmetry and directly query symmetrical features of the image.

Symmetrical Structure of the Image   As an example, consider the image in Table 2. Based on the symmetrical structure extracted from the image, the underlying interpretation model is queryable using utility predicates (see sample predicates in Table 1). The symmetry model as defined in Section 2.2 can be used to query symmetrical (and non-symmetrical) elements of the image using the following rules:

symmetrical_element(E) :- symmetrical(E).
symmetrical_element(E) :- symmetrical(E, _); symmetrical(_, E).

Aggregating results for the symmetrical_element(E) predicate for the example image results in a list of all symmetrical image elements (depicted in the results in Table 2):

SYMETRICAL = [0, 2, 8, 10, 11, 12, 14, 15, 17|...]

Similarly, we can query for the non-symmetrical elements of the image using the following rule:

non_symmetrical_element(P) :- image_element(P), not(symmetrical_element(P)).

NON_SYMETRICAL = [1, 3, 4, 5, 6, 7, 9, 13, 16|...].

Divergence   The divergence of image elements from the optimal symmetrical configuration can be directly queried using the divergence predicate:

?- divergence(symmetrical(id(P1), id(P2)), Div_Size, Div_AR, Div_Pos).
P1 = 2, P2 = 243,
Div_Size = div_size(6.0, 19.0),
Div_AR = div_ar(0.025384207737148917),
Div_Pos = div_pos(7.5) ;
...

Similarity   Perceptual and semantic similarity of image elements are queried as follows:

?- similarity(pair(id(P1), id(P2)), Percept_Sim, Semantic_Sim).
P1 = 2, P2 = 243,
Percept_Sim = 0.5846066958174241,
Semantic_Sim = 0.08333333333333333 ;
...

The above predicates provide the basis for the semantic analysis of symmetry structures in the image, as described in the following.

2. Within the (constraint) logic programming language PROLOG, ‘,’ corresponds to a conjunction, ‘;’ to a disjunction, and ‘a :- b, c.’ denotes a rule where ‘a’ is true if both ‘b’ and ‘c’ are true; capitals are used to denote variables, whereas lower-case refers to constants; ‘_’ (i.e., the underscore) is a “don’t care” variable, i.e., a placeholder for a variable in cases where one does not care about the resulting value.


Step 1) Extracting Image Elements   Extract image elements E consisting of E1, E2, and E3:
(E1) Image Patches are extracted using selective search as described in Uijlings et al. (2013);
(E2) People and Objects are detected in the image using YOLO object detection (Redmon et al., 2016);
(E3) Human Body Pose consisting of body joints and facing direction is extracted using human pose estimation (Cao et al., 2017).

patch(id(0), rectangle(point(233, 53), 107, 466)).
object(id(0), type('person'), rectangle(point(392, 106), 261, 381)).
person(id(0), joint(id(0)), point((582, 159))).
...

Step 2) Semantic and Perceptual Similarity   Compute semantic and perceptual similarity for each pair of image elements ei and ej ∈ E based on features from CNN layers:
– compute semantic similarity based on ImageNet classification of image patches;
– compute perceptual similarity based on cosine similarity between CNN features of image elements.

category(patch(id(0)), [("file, file cabinet, filing cabinet", 0.248048), ..., ("desk", 0.166062)]).
similarity(perceptual, pair(id(636), id(1537)), 0.429507).
similarity(semantic, pair(id(636), id(1537)), 0.076923).
...

Step 3) Symmetry Configuration and Divergence   Identify symmetrical structures in the image elements E based on the formal definition of symmetry in Section 2.2 (using thresholds on divergence and on similarity) and calculate the divergence of elements from this model.

?- divergence(symmetrical(id(170), id(200)), DivSize, DivAR, DivPos).
DivSize = div_size(9.0, 18.0), DivAR = div_ar(0.0595), DivPos = div_pos(3.9051)

Result) A Declarative Model for Semantic Analysis of Symmetry   The process results in the declarative structure consisting of the symmetrical properties of the image, given by the elements E of the image and the divergence from the formal definition of symmetry. This model serves as a basis for declarative query-answering about symmetrical characteristics of the image; the queries in Section 2.3 showcase the analysis of symmetrical and non-symmetrical image elements.

Table 2: Computational steps to generate the semantic symmetry model.


Symmetrical Structure of Objects and People   Symmetrical structures in the configuration of objects and people in the image can be queried using the predicate symmetrical_objects to get symmetrically configured objects, i.e., pairs of symmetrically positioned objects and single objects in the centre of the image.

?- symmetrical_objects(SymObj).
SymObj = pair(id(1), id(2)).
...

For the example image this results in the two people sitting on the bench in the centre of the image. Similarly to symmetrical object configurations, objects placed in a non-symmetrical way can be queried as follows:

?- non_symmetrical_objects(NonSymObj).
NonSymObj = id(0).

Resulting in objects that are not part of a symmetrical structure, i.e., the person on the left of the image has no symmetrical correspondent on the right of the image.

Body Pose   Based on this, the extracted symmetrical objects can be analysed further, e.g., the symmetrical configuration of people and their body pose can be queried using the following rule:

symmetrical_body_pose(pair(P, Q), SymPose) :-
    symmetrical_objects(pair(P, Q)),
    type(P, 'person'), type(Q, 'person'),
    symmetrical(pose(pair(P, Q)), SymPose).

?- symmetrical_body_pose(pair(P, Q), SymPose).
P = id(1), Q = id(2),
SymPose = ['upperbody'].

This results in the symmetrically placed people, and the elements of the poses that are symmetrical, i.e., the upper bodies of person 1 and person 2 are symmetrical. Respectively, non-symmetrical parts of the body pose can be queried as follows:

?- non_symmetrical_body_pose(pair(P, Q), NonSymPose).
P = id(1), Q = id(2),
NonSymPose = ['facing direction', 'legs'].

Resulting in the parts of the body poses that are not symmetrical. As such, the above analysis states that the two people sitting on the bench are placed in a symmetrical way; their pose is symmetrical in the upper body, while the facing direction and the legs are not symmetrical.

Statistics on Image Symmetry   Additionally, the model can be used to query statistics on the symmetrical features of an image, e.g., to train a classifier based on the semantic characterisations of symmetry as shown in Section 3 (see Table 3 for additional examples of symmetry statistics).

?- symmetry_stats(NumImgPatches, NumSymPatches, MeanDiv, MeanSim).
NumImgPatches = 359,
NumSymPatches = 40,
MeanDiv = [div_w(12.394), div_h(7.394), div_ar(0.944), div_pos(8.32)],
MeanSim = 0.8162167312386968.

Similarly, statistics on the symmetry of objects and people can be queried.

?- symmetrical_object_stats(NumObj, NumSymObj, MeanDiv, MeanSim).
NumObj = 3,
NumSymObj = 2,
MeanDiv = [div_w(21.0), div_h(95.0), div_ar(0.181), div_pos(50.562)],
MeanSim = 0.6666666666666666.

Based on these rules, our model provides a declaratively interpretable characterisation of reflectional symmetry in visual stimuli.



Num. Elements   Sym. Elements   Rel. Sym.   Mean Perceptual Sim.
232             26              0.112       0.813
77              2               0.026       0.941
335             5               0.015       0.75
199             8               0.04        0.916

Table 3: Extracted elements and statistics on symmetry structures for exemplary images.
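Here, Rel. Sym. corresponds to the fraction of extracted elements that are part of a symmetrical structure; for the first row, for instance, 26/232 ≈ 0.112.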

3. Human Evaluation: A Qualitative Study

Experimental Dataset Human-generated data from subjective, qualitative assessments of symmetry serves many useful purposes. We built a dataset of 150 images consisting of landscape and architectural photography, and movie scenes; the images range from highly symmetric images showing very controlled symmetric patterns to completely non-symmetric images. Each participant was shown 50 images selected randomly from the dataset and had to rate each image by selecting one of four categories: not symmetric, somewhat symmetric, symmetric, and highly symmetric. Each image was presented to approximately 100 participants; we calculated the symmetry value of an image as the average of all responses.

Empirical Results The results of the human experiment suggest that the perception of symmetry varies considerably between subjects. In the case of no symmetry people tend to agree, i.e., the variance in the answers is very low (see Figure 5), whereas in the case of high symmetry there is a wider variance in the human perception of symmetry. In particular, for images with an average level of symmetry the variance in the answers tends to be high. Qualitatively, there are various aspects of the subjective judgement of symmetry that we can observe in the human evaluation (1–3): (1) the absence of features decreases the subjective rating of symmetry, e.g., the image in Figure 6a has nearly perfect symmetry in the image features, but as there are only very few features that can be symmetrical, people perceived it only as medium symmetrical, with a high variance in the answers; (2) the symmetrical placement of people in an image has a higher impact on the subjective judgement of symmetry than that of other objects, e.g., the image in Figure 6b is judged as symmetrical based on the placement of the characters and the door in the middle, even though the objects on the left and right side are not very symmetrical; and (3) images that are naturally structured in a symmetrical way are judged less symmetrical than those deliberately arranged in a symmetrical way, e.g., images of centred faces, as depicted in Figure 6c, are rated less symmetrical than other images with similar symmetry on the feature level.
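To make the aggregation step concrete, the following minimal Python sketch computes a per-image symmetry value (mean) and response variance of the kind reported in Figure 5. The numeric scores assigned to the four rating categories are an assumption made purely for illustration; the text only states that responses are averaged.

from statistics import mean, pvariance

# Assumed mapping of the four rating categories to scores in [0, 1];
# the exact numeric encoding is not specified in the text.
SCORES = {
    "not symmetric": 0.0,
    "somewhat symmetric": 1 / 3,
    "symmetric": 2 / 3,
    "highly symmetric": 1.0,
}

def aggregate(responses):
    """Return (mean symmetry value, response variance) for one image."""
    values = [SCORES[r] for r in responses]
    return mean(values), pvariance(values)

# Hypothetical ratings from a few participants for a single image.
sym, var = aggregate(["highly symmetric", "symmetric", "highly symmetric", "somewhat symmetric"])
print(f"sym.: {sym:.2f}  var.: {var:.4f}")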


(row 1) sym.: 0.87, 0.78, 0.67, 0.64; var.: 0.0122, 0.1857, 0.1735, 0.2224
(row 2) sym.: 0.01, 0.01, 0.02, 0.02; var.: 0.0122, 0.0199, 0.0190, 0.0221
(row 3) sym.: 0.54, 0.35, 0.26, 0.48; var.: 0.4614, 0.4120, 0.3506, 0.3202

Figure 5: Sample results from the human experiment. (row 1) most symmetric; (row 2) most non-symmetric (these correspond directly to the images with the lowest variance in the answers); (row 3) images with the biggest variance in the answers.


Figure 6: Samples from the experimental data.



Figure 7: Results of empirical evaluation with three different feature set combinations, showing (a) mean accuracy, (b) mean error, and (c) class probability error.

Feature Sets   CA (%)   Avg. Sym. Err.    Class Prob. Err.
fs1            41.33    0.01806876383     0.0572886659
fs1+2          52.00    0.0126452444      0.0400713172
fs1+2+3        54.00    0.009900461023    0.0375853705

Table 4: Results from the classification and prediction pipeline.

Subjective symmetry interpretation. To evaluate how well our symmetry model reflects subjective human criteria for judging symmetry in naturalistic images, we use the results from the human study to train a classifier and a regressor that predict, respectively, the symmetry class and the average symmetry value of an image. For our experiment, we extracted three sets of features (fs1 - fs3) from the symmetry model: fs1 consists of the cosine similarity between the two halves of each image on each of the 5 convolution layers of an AlexNet; fs2 consists of the symmetrical properties between image patches, i.e., divergence from a symmetrical spatial configuration and perceptual similarity; and fs3 consists of the symmetrical properties of object configurations and people in the images. We use two models, a classifier and a regressor. The classifier assigns a given image to one of the four symmetry classes and is evaluated using the mean accuracy, as shown in Figure 7(a). The classifier also predicts the per-class probabilities (denoted as the multiclass proba model); this is evaluated by calculating the Mean Squared Error (MSE) between the predicted probabilities and the percentages from the human data for each class. The per-class errors are shown in Figure 7(c), while the mean error is shown in Figure 7(b). The regressor predicts the average symmetry value of a given image and is evaluated by the MSE between the predicted average symmetry value and the average symmetry value from the human data. We use the pipeline optimization method TPOT (Olson et al., 2016) to automatically build the classification and regression pipelines for the feature sets; this results in a classification pipeline consisting of an ensemble of decision tree, SVM, and random forest classifiers, while the regression pipeline consists of an ensemble of ExtraTrees and XGBoost regressors. The models are trained and tested on the three feature sets using 5-fold cross-validation, splitting the 150 images into 5 folds. We report the mean error and the classification accuracy (CA).
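As an illustration of this step, the following minimal Python sketch shows how a classification pipeline of this kind could be searched for with TPOT (Olson et al., 2016). It is not the authors' exact setup: the feature matrix X (rows of fs1-fs3 features per image), the labels y, and the exported file name are placeholders, and a single train/test split is used for brevity, whereas the paper reports 5-fold cross-validation over all 150 images.

import numpy as np
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

# Placeholder data standing in for the real features and labels:
# X would hold the fs1-fs3 features per image, y the symmetry class (4 classes).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 20))
y = rng.integers(0, 4, size=150)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# TPOT evolves a scikit-learn pipeline; cv=5 controls the internal
# cross-validation used while searching for candidate pipelines.
tpot = TPOTClassifier(generations=5, population_size=20, cv=5, random_state=0, verbosity=2)
tpot.fit(X_train, y_train)
print("mean accuracy:", tpot.score(X_test, y_test))
tpot.export("symmetry_classifier_pipeline.py")  # export the evolved pipeline for inspection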


Results and Discussion The results (Figure 7; Table 4) show that using the features from our symmetry model improves performance in both tasks: the accuracy for the classification task improves by more than 10 percentage points (see Table 4), from 41.33% to 54%, and the error on the per-class probabilities decreases from 0.057 to 0.038. The biggest improvement in the classification and in the prediction of the average symmetry value occurs when the image patch features fs2 are added (Figure 7(a) and (b)). Adding the people-centred features only results in a small improvement, which may be because only a subset of the images in the dataset involves people. The results on the predicted per-class probabilities (Figure 7(c)) show that adding features from our symmetry model enables better predictions of the variance in the human answers.

4. Discussion and Related Work

Symmetry in images has been studied from different perspectives, including visual perception research, neuroscience, cognitive science, and the arts and aesthetics (Treder, 2010). The semantic interpretation of symmetry from the viewpoint of perception and aesthetics requires a mixed methodology combining empirical and analytical methods:

• Empirical / Human Behaviour Studies. This involves qualitative studies based on subjective assessments, as well as an evidence-based approach measuring human performance from the viewpoint of visual perception using eye-tracking, qualitative evaluations, and think-aloud analysis with human subjects; and

• Analytical / Interpretation and Saliency. This involves the development of computational models that serve an interpretation and a predictive function involving, for instance: (i) multi-level computational modelling of interpreting visuo-spatial symmetry; (ii) a saliency model of visual attention serving a predictive purpose vis-à-vis the visuo-spatial structure of visual media.

Symmetry and (computer) vision Symmetry is an important feature in visual perception, and there are numerous studies in vision research investigating how symmetry affects visual perception (Cohen & Zaidi, 2013; Norcia et al., 2002; Machilsen et al., 2009; Bertamini & Makin, 2014) and how it is detected by humans (Wagemans, 1997; Freyd & Tversky, 1984; Csathó et al., 2004). Most relevant to our work is the research on computational symmetry in the area of computer vision (Liu et al., 2013, 2010). Typically, computational studies on symmetry in images characterise symmetry as reflection, translation, and rotation symmetry; here, reflection symmetry (also referred to as bilateral or mirror symmetry) has been investigated most extensively. Another direction of research in this area focuses on detecting symmetric structures in objects. In this context, Teo et al. (2015) present a classifier that detects curved symmetries in 2D images. Similarly, Lee & Liu (2012) presented an approach to detect curved glide-reflection symmetry in 2D and 3D images, and Atadjanov & Lee (2016) use appearance-of-structure features to detect symmetric structures of objects.

Computational analysis of image structure Analysing image structure is a central topic in computer vision research, and there are various approaches addressing the different aspects involved in this task.


Figure 8: Manual manipulation of symmetry. Symmetry decreasing from highly symmetric to not symmetric.

Deep learning with convolutional neural networks (CNNs) provides the basis for analysing images using learned features, e.g., AlexNets (Krizhevsky et al., 2012) or ResNets (He et al., 2016) trained on the ImageNet dataset (Deng et al., 2009). Recent developments in object detection involve R-CNN-based detectors such as those of Ren et al. (2017) and Girshick et al. (2016), where objects are detected based on region proposals extracted from the image, e.g., using selective search (Uijlings et al., 2013) or region proposal networks for predicting object regions. For comparing images, Zagoruyko & Komodakis (2015) and Dosovitskiy & Brox (2016) measure perceptual similarity based on features learned by a neural network.
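To make the idea of feature-level similarity concrete, the following minimal Python sketch compares the left half of an image against its mirrored right half using the convolutional activations of a pretrained AlexNet, in the spirit of the fs1 features described in Section 3. It is an illustrative sketch under these assumptions, not the implementation used in this work, and the image path is a placeholder.

import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import models, transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def half_similarity(image_path):
    """Cosine similarity between conv features of the left half and the
    horizontally mirrored right half of an image (one value per conv layer)."""
    img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    w = img.shape[-1] // 2
    left = img[..., :w]
    right = torch.flip(img[..., w:], dims=[-1])  # mirror the right half

    alexnet = models.alexnet(pretrained=True).eval()  # use weights=... on newer torchvision
    sims = []
    x_l, x_r = left, right
    with torch.no_grad():
        for layer in alexnet.features:
            x_l, x_r = layer(x_l), layer(x_r)
            if isinstance(layer, torch.nn.Conv2d):
                sims.append(F.cosine_similarity(x_l.flatten(1), x_r.flatten(1)).item())
    return sims  # five similarities, one per AlexNet conv layer

# print(half_similarity("example.jpg"))  # placeholder path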

5. Summary and Outlook

Our research addresses visuo-spatial symmetry in the context of naturalistic stimuli in the domain of visual arts, e.g., film, paintings, and landscape and architectural photography. With a principal focus on developing a human-centred computational model of (interpreting) visuo-spatial symmetry, our approach is motivated and driven by three crucial and mutually synergistic aspects, namely: reception, interpretation, and synthesis:

• Reception: A behavioural study of the human perception (and explanation) of symme- try from the viewpoint of visual attention, and spatio-linguistic and qualitative characteri- sation(s);

• Interpretation: A computational model of deep semantic interpretation of visual symmetry with an emphasis on human-centred explainability and visual sensemaking;

• Synthesis: The ability to apply human-centred explainable models as a basis to directly or indirectly engineer visual media vis-à-vis their (predictive) receptive effects, i.e., guiding attention by influencing visual fixation patterns, and minimising or maximising saccadic movements (e.g., in animation, gaming, built environment planning, and design).

In this paper, we have focussed on the reception and interpretation aspects; we presented a declara- tive, computational model of reflectional symmetry integrating (visuospatial) composition, feature- level similarity, and semantic similarity in visual stimuli. Some possible next steps could be:


• spatio-temporal symmetry and visual perception: Going beyond static images to analyse symmetry in space-time (e.g., as in the films of Wes Anderson (Bhatt & Suchan, 2015)): here, a particular focus is on the influence of space-time symmetry on visual fixations and saccadic eye-movements (Suchan et al., 2016b).

• visual processing aspect: More advanced region proposals are possible, and can be naturally driven by newer forms of visual computing primitives and similarity measures. The framework is modular and may be extended with improved or new visual-computing features.

• resynthesising images: producing qualitatively distinct classes of (a)symmetry (e.g., Figure 8) and conducting further empirical studies involving qualitative surveys, eye-tracking, think-aloud studies, etc.

The most immediate outlook of our research on the computational front is geared towards extending the current symmetry model to the analysis of spatio-temporal symmetry, particularly from the viewpoint of moving images as applicable in film, animation, and other kinds of narrative media. Towards this, we extend the symmetry model to include a richer spatio-temporal ontology, e.g., consisting of ‘space-time’ entities (Suchan & Bhatt, 2016b; Suchan et al., 2018; Schultz et al., 2018), for the analysis of spatio-temporal symmetry. Space-time symmetry analysis will also be complemented with specialised methods that provide a holistic view of the cinematographic “geometry of a scene” (Suchan & Bhatt, 2016a,b). Another promising line of work we are exploring involves relational learning of visuo-spatial symmetry patterns (e.g., based on inductive generalisation (Suchan et al., 2016a)). Explainable learning from (big) visual datasets promises to offer a completely new approach towards the study of media and art history, cultural studies, and aesthetics.

Acknowledgements

We thank the anonymous reviewers and ACS 2018 program chairs Pat Langley and Dongkyu Choi for their constructive comments and for helping improve the presentation of the paper. The empirical aspect of this research was made possible by the generous participation of volunteers in the online survey; we are grateful to all participants. We acknowledge the support of Steven Kowalzik towards the digital media work pertaining to the manual re-synthesis of categories of stimuli.

References

Atadjanov, I. R., & Lee, S. (2016). Reflection symmetry detection via appearance of structure descriptor (pp. 3–18). Cham: Springer International Publishing.
Bertamini, M., & Makin, A. D. (2014). Brain activity in response to visual symmetry. Symmetry, 6, 975–996.
Bhatt, M., Lee, J. H., & Schultz, C. P. L. (2011). CLP(QS): A declarative spatial reasoning framework. Proceedings of the 10th International Conference on Spatial Information Theory, COSIT 2011, Belfast, ME, USA (pp. 210–230). Springer.


Bhatt, M., Schultz, C., & Freksa, C. (2013). The ‘Space’ in Spatial Assistance Systems: Conception, Formalisation and Computation. Representing space in cognition: Interrelations of behavior, language, and formal models. Series: Explorations in Language and Space. 978-0-19-967991-1, Oxford University Press.
Bhatt, M., & Suchan, J. (2015). Obsessed by symmetry. –Wes Anderson– and the visual and cinematographic shape of his films. SHAPES 3.0: Proceedings of the Third Interdisciplinary Workshop - SHAPES 3.0: The Shape of Things, CONTEXT 2015, Larnaca, Cyprus.
Cao, Z., Simon, T., Wei, S., & Sheikh, Y. (2017). Realtime multi-person 2d pose estimation using part affinity fields. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 2017 (pp. 1302–1310). IEEE Computer Society.
Cohen, E. H., & Zaidi, Q. (2013). Symmetry in context: Salience of mirror symmetry in natural patterns. Journal of Vision, 13, 22.
Csathó, Á., van der Vloed, G., & van der Helm, P. A. (2004). The force of symmetry revisited: symmetry-to-noise ratios regulate (a)symmetry effects. Acta Psychologica, 117, 233–250.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). Imagenet: A large-scale hierarchical image database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009 (pp. 248–255). IEEE.
Dosovitskiy, A., & Brox, T. (2016). Generating images with perceptual similarity metrics based on deep networks. Proceedings of Advances in Neural Information Processing Systems (NIPS 2016).
Freyd, J., & Tversky, B. (1984). Force of symmetry in form perception. The American Journal of Psychology, 97, 109–126.
Girshick, R. B., Donahue, J., Darrell, T., & Malik, J. (2016). Region-based convolutional networks for accurate object detection and segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38, 142–158.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, 2016 (pp. 770–778). IEEE Computer Society.
Jaffar, J., & Maher, M. J. (1994). Constraint logic programming: A survey. The Journal of Logic Programming, 19, 503–581.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. Proceedings of Advances in Neural Information Processing Systems (NIPS 2012) (pp. 1097–1105).
Lee, S., & Liu, Y. (2012). Curved glide-reflection symmetry detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34, 266–278.
Liu, J., Slota, G., Zheng, G., Wu, Z., Park, M., Lee, S., Rauschert, I., & Liu, Y. (2013). Symmetry detection from realworld images competition 2013: Summary and results. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013) Workshops.


Liu, Y., Hel-Or, H., Kaplan, C. S., & Gool, L. V. (2010). Computational symmetry in computer vision and computer graphics. Foundations and Trends in Computer Graphics and Vision, 5, 1–195.
Machilsen, B., Pauwels, M., & Wagemans, J. (2009). The role of vertical mirror symmetry in visual shape detection. Journal of Vision, 9, 11.
Miller, G. A. (1995). Wordnet: A lexical database for english. Commun. ACM, 38, 39–41.
Molnar, V., & Molnar, F. (1986). Symmetry-making and -breaking in visual art. Computers & Mathematics with Applications, 12, 291–301.
Norcia, A. M., Candy, T. R., Pettet, M. W., Vildavski, V. Y., & Tyler, C. W. (2002). Temporal dynamics of the human response to symmetry. Journal of Vision, 2, 1.
Olson, R. S., Urbanowicz, R. J., Andrews, P. C., Lavender, N. A., Kidd, L. C., & Moore, J. H. (2016). Automating biomedical data science through tree-based pipeline optimization. Applications of Evolutionary Computation (pp. 123–137). Cham: Springer International Publishing.
Redmon, J., Divvala, S. K., Girshick, R. B., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, 2016 (pp. 779–788). IEEE Computer Society.
Ren, S., He, K., Girshick, R. B., & Sun, J. (2017). Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1137–1149.
Schultz, C., Bhatt, M., Suchan, J., & Wałęga, P. (2018). Answer set programming modulo ‘space-time’. Proceedings of the International Joint Conference Rules and Reasoning, RuleML+RR 2018, Luxembourg, 2018. Springer.
Suchan, J., & Bhatt, M. (2016a). The geometry of a scene: On deep semantics for visual perception driven cognitive film studies. Proceedings of the IEEE Winter Conference on Applications of Computer Vision, WACV 2016, Lake Placid, NY, USA, 2016 (pp. 1–9). IEEE Computer Society.
Suchan, J., & Bhatt, M. (2016b). Semantic question-answering with video and eye-tracking data: AI foundations for human visual perception driven cognitive film studies. Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 2016 (pp. 2633–2639). IJCAI/AAAI Press.
Suchan, J., Bhatt, M., & Schultz, C. P. L. (2016a). Deeply semantic inductive spatio-temporal learning. Proceedings of the 26th International Conference on Inductive Logic Programming (Short papers), London, UK, 2016 (pp. 73–80). CEUR-WS.org.
Suchan, J., Bhatt, M., Wałęga, P. A., & Schultz, C. P. L. (2018). Visual explanation by high-level abduction: On answer-set programming driven reasoning about moving objects. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, Louisiana, USA, 2018. AAAI Press.
Suchan, J., Bhatt, M., & Yu, S. (2016b). The perception of symmetry in the moving image: multi-level computational analysis of cinematographic scene structure and its visual reception. Proceedings of the ACM Symposium on Applied Perception, SAP 2016, Anaheim, California, USA, 2016 (p. 142). ACM.


Teo, C. L., Fermüller, C., & Aloimonos, Y. (2015). Detection and segmentation of 2d curved reflection symmetric structures. Proceedings of the IEEE International Conference on Computer Vision (ICCV 2015) (pp. 1644–1652).
Treder, M. S. (2010). Behind the looking-glass: A review on human symmetry perception. Symmetry, 2, 1510–1543.
Uijlings, J. R. R., van de Sande, K. E. A., Gevers, T., & Smeulders, A. W. M. (2013). Selective search for object recognition. International Journal of Computer Vision, 104, 154–171.
Wagemans, J. (1997). Characteristics and models of human symmetry detection. Trends in Cognitive Sciences, 1, 346–352.
Weyl, H. (1952). Symmetry. Princeton paperbacks. Princeton University Press.
Wielemaker, J., Schrijvers, T., Triska, M., & Lager, T. (2012). SWI-Prolog. Theory and Practice of Logic Programming, 12, 67–96.
Zagoruyko, S., & Komodakis, N. (2015). Learning to compare image patches via convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, 2015 (pp. 4353–4361). IEEE Computer Society.
