
24th European Conference on Artificial Intelligence - ECAI 2020 Santiago de Compostela, Spain

A Knowledge-based System for the Dynamic Generation and Classification of Novel Contents in Multimedia Broadcasting

Eleonora Chiodino¹ and Davide Di Luccio² and Antonio Lieto³ and Alberto Messina⁴ and Gian Luca Pozzato⁵ and Davide Rubinetti⁶

¹ Dipartimento di Psicologia, Università di Torino, Italy
² Dipartimento di Informatica, Università di Torino, Italy
³ Dipartimento di Informatica, Università di Torino and Istituto di Calcolo e Reti ad Alte Prestazioni, ICAR-CNR Palermo, Italy
⁴ RAI - Centro Ricerche, Innovazione Tecnologica e Sperimentazione, Torino, Italy
⁵ Dipartimento di Informatica, Università di Torino, Italy
⁶ Dipartimento di Informatica, Università di Torino, Italy

Abstract. In this work we exploit a recently introduced nonmonotonic extension of Description Logics, able to deal with the problem of knowledge invention via commonsense concept combination, to dynamically generate novel editorial contents in the context of a real broadcasting company: RAI - Radiotelevisione Italiana, the Italian public broadcaster. In particular, we introduce the system implementing such logic, i.e. DENOTER: Dynamic gEnerator of NOvel contents in mulTimEdia bRoadcasting (available online at the URL: http://di.unito.it/denoter), which has been applied and tested in the online multimedia platform of RAI (i.e. RaiPlay) as a tool for both the generation/suggestion of novel genres of multimedia on-demand contents and the reclassification of the available items within such new genres. Our system works by extracting the typical properties characterizing the available genres (with a standard information extraction pipeline) and by building novel classes of genres as the result of a creative combination of such extracted representations. We have tested DENOTER (i) by reclassifying the available contents in RaiPlay with respect to the newly generated genres, (ii) with an evaluation, in the form of a controlled user study experiment, of the feasibility of using the obtained reclassifications as recommended contents, and (iii) with a qualitative evaluation done with a small group of experts of RAI. The obtained results are encouraging and pave the way to many possible further improvements and research directions.

1 INTRODUCTION

Knowledge invention via conceptual recombination is an important generative phenomenon highlighting some crucial aspects of the knowledge processing capabilities in human cognition. Such an ability, in fact, concerns high-level capacities associated with creative thinking and problem solving. Still, it represents an open challenge in the field of artificial intelligence [3]. Dealing with this problem, indeed, requires, from an AI perspective, the harmonization of two conflicting requirements that are hardly accommodated in symbolic systems [7]: the need for syntactic and semantic compositionality (typical of logical systems) and the need to exhibit typicality effects. According to a well-known argument [20], in fact, prototypes (i.e. commonsense conceptual representations based on typical properties) are not compositional. The argument runs as follows: consider a concept like pet fish. It results from the composition of the concept pet and of the concept fish. However, the prototype of pet fish cannot result from the composition of the prototypes of a pet and a fish: e.g. a typical pet is furry and warm, a typical fish is grayish, but a typical pet fish is neither furry and warm nor grayish (typically, it is red).

The pet fish phenomenon is a paradigmatic example of the difficulty to address when building formalisms and systems trying to imitate this combinatorial human ability. In this paper, we exploit a framework able to account for this type of human-like concept combination and we show how it can be used as a tool for the generation and the suggestion of novel editorial content. In particular, we adopt the recently introduced nonmonotonic extension of Description Logics (from now on DL, see [2]) able to reason on typicality and called $\mathbf{T}^{CL}$ (typicality-based compositional logic), introduced in [15, 13].

In this logic, "typical" properties can be directly specified by means of a "typicality" operator $\mathbf{T}$ enriching the underlying DL, and a TBox can contain inclusions of the form $\mathbf{T}(C) \sqsubseteq D$ to represent that "typical $C$s are also $D$s". Unlike in standard DLs, in the logic $\mathbf{T}^{CL}$ one can consistently express exceptions and reason about defeasible inheritance as well. Typicality inclusions are also equipped with a real number $p \in (0.5, 1]$ representing the probability/degree of belief in such a typical property: this allows us to define a semantics inspired by the DISPONTE semantics [22] characterizing probabilistic extensions of DLs, which in turn is used in order to describe different scenarios where only some typicality properties are considered. Given a KB containing the description of two concepts $C_H$ and $C_M$ occurring in it, we then consider only some scenarios in order to define a revised knowledge base, enriched by typical properties of the combined concept $C \sqsubseteq C_H \sqcap C_M$, by also implementing some heuristics coming from cognitive semantics.

In this work we exploit the logic $\mathbf{T}^{CL}$ in order to dynamically generate novel knowledge by means of a mechanism for commonsense combination⁷. This generative and creative capacity has been tested in the context of a real multimedia broadcaster (RAI - Radiotelevisione Italiana) as a tool for both the suggestion of novel genres of multimedia on-demand contents of the online platform RaiPlay (https://www.raiplay.it) and for the reclassification of the available items within such new genres. We introduce the system DENOTER (Dynamic gEnerator of NOvel contents in mulTimEdia bRoadcasting) which, first, automatically builds prototypes of existing basic genres in RaiPlay (comedy, thriller, kids, horror, and so on) by extracting information about concepts or properties occurring with the highest frequencies in the textual descriptions of the multimedia contents available in the online platform⁸. Such prototypes are formalized by means of a $\mathbf{T}^{CL}$ knowledge base, whose TBox contains both rigid inclusions of the form

$$\mathit{BasicGenre} \sqsubseteq \mathit{Concept},$$

in order to express essential desiderata but also constraints, for instance $\mathit{Musical} \sqsubseteq \mathit{Song}$ (musical contents must have songs) and $\mathit{Kids} \sqsubseteq \neg \mathit{Blood}$ (due to law restrictions, contents available for kids must not contain blood), as well as prototypical properties of the form

$$p :: \mathbf{T}(\mathit{BasicGenre}) \sqsubseteq \mathit{TypicalConcept},$$

representing typical concepts of a given genre, where $p$ is a real number in the range $(0.5, 1]$, expressing the frequency of such a concept in items belonging to that genre: for instance, $0.72 :: \mathbf{T}(\mathit{Comedy}) \sqsubseteq \mathit{Heaven}$ is used to express that the typical comedy contains/refers to the concept $\mathit{Heaven}$ with a frequency/probability/degree of belief of 72%, and such a degree is automatically extracted by DENOTER from the description of multimedia contents currently available on RaiPlay and marked as belonging to such a genre.

⁷ Other works have already shown how such logic can be used to model complex cognitive phenomena [15], creative problem solving [16, 12] and to build intelligent applications in the field of computational creativity [14]. Alternative approaches to the problem of commonsense conceptual combination have been recently discussed in [6], [11], [4]. The main advantages of $\mathbf{T}^{CL}$ with respect to such approaches are detailed in [15].
⁸ In RaiPlay, each multimedia item (e.g. TV series episodes, movies, etc.) is currently explicitly marked as belonging to one or more basic genres by the company owning the rights to such a product.

Given the knowledge base with the prototypical descriptions of basic genres, DENOTER exploits the reasoning capabilities of the logic $\mathbf{T}^{CL}$ in order to generate new derived genres as the result of the creative combination of two (or even more) basic or derived ones. DENOTER also reclassifies multimedia contents of RaiPlay taking the new, derived genres into account. Intuitively, a multimedia item belongs to the newly generated genre if its metadata (name, description, title) contain all the rigid properties as well as at least 30% of the typical properties of such a derived genre. In this respect, DENOTER can be seen as a "white box" recommender system, able to suggest to its users multimedia contents/episodes belonging to new genres by providing an explanation of such a recommendation: indeed, a content is suggested if it is classified in the new genre, obtained by combining typical properties of basic genres preferred by the users themselves, and such a combination is driven by the theoretical foundations of the logic $\mathbf{T}^{CL}$.

We have also tested DENOTER by performing three different kinds of evaluation that are reported and discussed in Section 5, namely an automatic evaluation, an evaluation of the satisfaction of users, and a qualitative evaluation by a small group of experts of RAI, showing promising results.

2 THE DESCRIPTION LOGIC $\mathbf{T}^{CL}$ FOR CONCEPT COMBINATION

In this section we briefly recall the basic concepts underlying the logic $\mathbf{T}^{CL}$ [13, 15], used in the system DENOTER as the basis for the generation of new genres as the combination of two existing ones. This logic combines three main ingredients. The first one relies on the DL of typicality $\mathcal{ALC}+\mathbf{T}_R$ introduced in [8], which allows one to describe the prototype of a concept. In this logic, "typical" properties can be directly specified by means of a "typicality" operator $\mathbf{T}$ enriching the underlying DL, and a TBox can contain inclusions of the form $\mathbf{T}(C) \sqsubseteq D$ to represent that "typical $C$s are also $D$s". Unlike in standard DLs, in the logic $\mathcal{ALC}+\mathbf{T}_R$ one can consistently express exceptions and reason about defeasible inheritance as well. For instance, a knowledge base can consistently express that "normally, athletes are fit", whereas "sumo wrestlers usually are not fit", by $\mathbf{T}(\mathit{Athlete}) \sqsubseteq \mathit{Fit}$ and $\mathbf{T}(\mathit{SumoWrestler}) \sqsubseteq \neg \mathit{Fit}$, given that $\mathit{SumoWrestler} \sqsubseteq \mathit{Athlete}$. The semantics of the $\mathbf{T}$ operator is characterized by the properties of rational logic [10], recognized as the core properties of nonmonotonic reasoning. $\mathcal{ALC}+\mathbf{T}_R$ is characterized by a minimal model semantics corresponding to an extension to DLs of a notion of rational closure as defined in [10] for propositional logic: the idea is to adopt a preference relation among $\mathcal{ALC}+\mathbf{T}_R$ models, where intuitively a model is preferred to another one if it contains fewer exceptional elements, as well as a notion of minimal entailment restricted to models that are minimal with respect to such a preference relation. As a consequence, $\mathbf{T}$ inherits well-established properties like specificity and irrelevance: in the example, the logic $\mathcal{ALC}+\mathbf{T}_R$ allows us to infer $\mathbf{T}(\mathit{Athlete} \sqcap \mathit{Bald}) \sqsubseteq \mathit{Fit}$ (being bald is irrelevant with respect to being fit) and, if one knows that Hiroyuki is a typical sumo wrestler, to infer that he is not fit, giving preference to the most specific information.

As a second ingredient, we consider a distributed semantics similar to the one of probabilistic DLs known as DISPONTE [23], which allows one to label inclusions $\mathbf{T}(C) \sqsubseteq D$ with a real number between 0.5 and 1, representing their degree of belief/probability, assuming that each axiom is independent of the others. Degrees of belief in typicality inclusions allow one to define a probability distribution over scenarios: roughly speaking, a scenario is obtained by choosing, for each typicality inclusion, whether it is considered as true or false. In a slight extension of the above example, we may need to represent that both the typicality inclusions about athletes and sumo wrestlers have a degree of belief of 80%, whereas we also believe that athletes are usually young with a higher degree of 95%, with the following KB:

(1) $\mathit{SumoWrestler} \sqsubseteq \mathit{Athlete}$
(2) $0.8 :: \mathbf{T}(\mathit{Athlete}) \sqsubseteq \mathit{Fit}$
(3) $0.8 :: \mathbf{T}(\mathit{SumoWrestler}) \sqsubseteq \neg \mathit{Fit}$
(4) $0.95 :: \mathbf{T}(\mathit{Athlete}) \sqsubseteq \mathit{YoungPerson}$

We consider eight different scenarios, representing all possible combinations of typicality inclusions: as an example, $\{((2), 1), ((3), 0), ((4), 1)\}$ represents the scenario in which (2) and (4) hold, whereas (3) does not. Obviously, (1) holds in every scenario, since it represents a rigid property, not admitting exceptions. We equip each scenario with a probability depending on those of the involved inclusions: the scenario of the example has probability $0.8 \times 0.95$ (since (2) and (4) are involved) $\times\, (1 - 0.8)$ (since (3) is not involved) $= 0.152 = 15.2\%$. Such probabilities are then taken into account in order to choose the most adequate scenario describing the prototype of the combined concept.
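The enumeration of scenarios and of their probabilities can be illustrated with a minimal Python sketch. It only covers the distributed-semantics step on the athlete example: the function name `scenarios` and the dictionary encoding of the typicality inclusions are assumptions made here for illustration, and consistency checking of each scenario (which requires a DL reasoner) is deliberately left out.

```python
from itertools import product


def scenarios(inclusions):
    """Enumerate all scenarios over a set of typicality inclusions.

    `inclusions` maps a label (hypothetical encoding) to its degree of belief p.
    Each scenario assigns 1 (inclusion holds) or 0 (it does not) to every label;
    assuming independent axioms, its probability is the product of p for the
    chosen inclusions and (1 - p) for the discarded ones.
    """
    labels = list(inclusions)
    result = []
    for choice in product((1, 0), repeat=len(labels)):
        probability = 1.0
        for label, chosen in zip(labels, choice):
            p = inclusions[label]
            probability *= p if chosen else (1 - p)
        result.append((dict(zip(labels, choice)), probability))
    return result


# Typicality inclusions (2), (3), (4) of the athlete KB; the rigid
# inclusion (1) holds in every scenario and is therefore not enumerated.
athlete_kb = {"(2)": 0.8, "(3)": 0.8, "(4)": 0.95}

for scenario, probability in scenarios(athlete_kb):
    print(scenario, f"{probability:.3f}")
```

With three typicality inclusions the sketch yields the eight scenarios mentioned in the text, and their probabilities sum to one since each inclusion is treated as an independent choice.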

As a third element of the proposed formalization we employ a method inspired by cognitive semantics [9] for the identification of a dominance effect between the concepts to be combined: for every combination, we distinguish a HEAD, representing the stronger element of the combination, and a MODIFIER. The basic idea is: given a KB and two concepts $C_H$ (HEAD) and $C_M$ (MODIFIER) occurring in it, we consider only some scenarios in order to define a revised knowledge base, enriched by typical properties of the combined concept $C \sqsubseteq C_H \sqcap C_M$.

Let us now present the logic $\mathbf{T}^{CL}$ in more detail. The language of $\mathbf{T}^{CL}$ extends the basic DL $\mathcal{ALC}$ by typicality inclusions of the form $\mathbf{T}(C) \sqsubseteq D$ equipped with a real number $p \in (0.5, 1]$ (observe that the extreme 0.5 is not included) representing its degree of belief, whose meaning is that "we believe with degree/probability $p$ that, normally, $C$s are also $D$s"⁹.

⁹ The reason why we only allow typicality inclusions equipped with probabilities $p > 0.5$ is due to our effort of integrating two different semantics: typicality-based logic and DISPONTE. In particular, as detailed in [15], this choice seems to be the only one compliant with both formalisms. On the contrary, it would be misleading to also allow low degrees of belief for typicality inclusions, since typical knowledge is known to come with a low degree of uncertainty.

Definition 1 (Language of $\mathbf{T}^{CL}$) We consider an alphabet of concept names $\mathtt{C}$, of role names $\mathtt{R}$, and of individual constants $\mathtt{O}$. Given $A \in \mathtt{C}$ and $R \in \mathtt{R}$, we define:

$$C, D := A \mid \top \mid \bot \mid \neg C \mid C \sqcap C \mid C \sqcup C \mid \forall R.C \mid \exists R.C$$

We define a knowledge base $\mathcal{K} = \langle \mathcal{R}, \mathcal{T}, \mathcal{A} \rangle$ where:

• $\mathcal{R}$ is a finite set of rigid properties of the form $C \sqsubseteq D$;
• $\mathcal{T}$ is a finite set of typicality properties of the form $p :: \mathbf{T}(C) \sqsubseteq D$, where $p \in (0.5, 1] \subseteq \mathbb{R}$ is the degree of belief of the typicality inclusion;
• $\mathcal{A}$ is the ABox, i.e. a finite set of formulas of the form either $C(a)$ or $R(a, b)$, where $a, b \in \mathtt{O}$ and $R \in \mathtt{R}$.

A model $\mathcal{M}$ in the logic $\mathbf{T}^{CL}$ extends standard $\mathcal{ALC}$ models by a preference relation among domain elements as in the logic of typicality [8]. In this respect, $x < y$ means that $x$ is "more normal" than $y$, and the typical members of a concept $C$ are the minimal elements of $C$ with respect to this relation¹⁰. An element $x \in \Delta^{\mathcal{I}}$ is a typical instance of some concept $C$ if $x \in C^{\mathcal{I}}$ and there is no $C$-element in $\Delta^{\mathcal{I}}$ more normal than $x$. Formally:

¹⁰ It could be possible to consider an alternative semantics whose models are equipped with multiple preference relations. However, the approach based on a single preference relation in [8] ensures good computational properties (reasoning in the resulting nonmonotonic logic $\mathcal{ALC}+\mathbf{T}_R$ has the same complexity as in standard $\mathcal{ALC}$), whereas adopting multiple preference relations could lead to higher complexities.

Definition 2 (Model of $\mathbf{T}^{CL}$) A model $\mathcal{M}$ is any structure

$$\langle \Delta^{\mathcal{I}}, <, .^{\mathcal{I}} \rangle$$

where:

• $\Delta^{\mathcal{I}}$ is a non-empty set of items called the domain;
• $<$ is an irreflexive, transitive, well-founded and modular (for all $x, y, z$ in $\Delta^{\mathcal{I}}$, if $x < y$ then either $x < z$ or $z < y$) relation over $\Delta^{\mathcal{I}}$;
• $.^{\mathcal{I}}$ is the extension function that maps each atomic concept $C$ to $C^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$, and each role $R$ to $R^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$, and is extended to complex concepts as follows:
  - $(\neg C)^{\mathcal{I}} = \Delta^{\mathcal{I}} \setminus C^{\mathcal{I}}$
  - $(C \sqcap D)^{\mathcal{I}} = C^{\mathcal{I}} \cap D^{\mathcal{I}}$
  - $(C \sqcup D)^{\mathcal{I}} = C^{\mathcal{I}} \cup D^{\mathcal{I}}$
  - $(\exists R.C)^{\mathcal{I}} = \{x \in \Delta^{\mathcal{I}} \mid \exists (x, y) \in R^{\mathcal{I}} \text{ such that } y \in C^{\mathcal{I}}\}$
  - $(\forall R.C)^{\mathcal{I}} = \{x \in \Delta^{\mathcal{I}} \mid \forall (x, y) \in R^{\mathcal{I}} \text{ we have } y \in C^{\mathcal{I}}\}$
  - $(\mathbf{T}(C))^{\mathcal{I}} = \mathit{Min}_<(C^{\mathcal{I}})$, where $\mathit{Min}_<(C^{\mathcal{I}}) = \{x \in C^{\mathcal{I}} \mid \nexists y \in C^{\mathcal{I}} \text{ s.t. } y < x\}$.

A model $\mathcal{M}$ can be equivalently defined by postulating the existence of a function $k_{\mathcal{M}} : \Delta^{\mathcal{I}} \longmapsto \mathbb{N}$, where $k_{\mathcal{M}}$ assigns a finite rank to each domain element [8]: the rank of $x$ is the length of the longest chain $x_0 < \ldots < x$ from $x$ to a minimal $x_0$, i.e. such that there is no $x'$ such that $x' < x_0$. The rank function $k_{\mathcal{M}}$ and $<$ can be defined from each other by letting $x < y$ if and only if $k_{\mathcal{M}}(x) < k_{\mathcal{M}}(y)$.

Definition 3 (Model satisfying a knowledge base in $\mathbf{T}^{CL}$) Let $\mathcal{K} = \langle \mathcal{R}, \mathcal{T}, \mathcal{A} \rangle$ be a KB. Given a model $\mathcal{M} = \langle \Delta^{\mathcal{I}}, <, .^{\mathcal{I}} \rangle$, we assume that $.^{\mathcal{I}}$ is extended to assign a domain element $a^{\mathcal{I}}$ of $\Delta^{\mathcal{I}}$ to each individual constant $a$ of $\mathtt{O}$. We say that:

• $\mathcal{M}$ satisfies $\mathcal{R}$ if, for all $C \sqsubseteq D \in \mathcal{R}$, we have $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$;
• $\mathcal{M}$ satisfies $\mathcal{T}$ if, for all $q :: \mathbf{T}(C) \sqsubseteq D \in \mathcal{T}$, we have that¹¹ $\mathbf{T}(C)^{\mathcal{I}} \subseteq D^{\mathcal{I}}$, i.e. $\mathit{Min}_<(C^{\mathcal{I}}) \subseteq D^{\mathcal{I}}$;
• $\mathcal{M}$ satisfies $\mathcal{A}$ if, for each assertion $F \in \mathcal{A}$, if $F = C(a)$ then $a^{\mathcal{I}} \in C^{\mathcal{I}}$, otherwise if $F = R(a, b)$ then $(a^{\mathcal{I}}, b^{\mathcal{I}}) \in R^{\mathcal{I}}$.

¹¹ It is worth noticing that here the degree $q$ does not play any role. Indeed, a typicality inclusion $\mathbf{T}(C) \sqsubseteq D$ holds in a model only if it satisfies the semantic condition of the underlying DL of typicality, i.e. minimal (typical) elements of $C$ are elements of $D$. The degree of belief $q$ will have a crucial role in the application of the distributed semantics, allowing the definition of scenarios as well as the computation of their probabilities.

Even if the typicality operator $\mathbf{T}$ itself is nonmonotonic (i.e. $\mathbf{T}(C) \sqsubseteq E$ does not imply $\mathbf{T}(C \sqcap D) \sqsubseteq E$), what is inferred from a KB can still be inferred from any KB' with KB $\subseteq$ KB', i.e. the resulting logic is monotonic. As already mentioned, in order to perform useful nonmonotonic inferences, in [8] the authors have strengthened the above semantics by restricting entailment to a class of minimal models. Intuitively, the idea is to restrict entailment to models that minimize the atypical instances of a concept. The resulting logic corresponds to a notion of rational closure on top of $\mathcal{ALC}+\mathbf{T}_R$. Such a notion is a natural extension of the rational closure construction provided in [10] for propositional logic. This nonmonotonic semantics relies on minimal rational models that minimize the rank of domain elements. Informally, given two models of KB, one in which a given domain element $x$ has rank 2 (because for instance $z < y < x$), and another in which it has rank 1 (because only $y < x$), we prefer the latter, as in this model the element $x$ is assumed to be "more typical" than in the former. Query entailment is then restricted to minimal canonical models. The intuition is that a canonical model contains all the individuals that enjoy properties that are consistent with KB. This is needed when reasoning about the rank of the concepts: it is important to have them all represented.

Given a KB $\mathcal{K} = \langle \mathcal{R}, \mathcal{T}, \mathcal{A} \rangle$ and given two concepts $C_H$ and $C_M$ occurring in $\mathcal{K}$, the logic $\mathbf{T}^{CL}$ allows defining a prototype of the combined concept $C$ as the combination of the HEAD $C_H$ and the MODIFIER $C_M$, where the typical properties of the form $\mathbf{T}(C) \sqsubseteq D$ (or, equivalently, $\mathbf{T}(C_H \sqcap C_M) \sqsubseteq D$) to ascribe to the concept $C$ are obtained by considering blocks of scenarios with the same probability, in decreasing order starting from the highest one. We first discard all the inconsistent scenarios, then:

• we discard the scenarios considered trivial, i.e. those consistently inheriting all the properties of the HEAD among the starting concepts to be combined. This choice is motivated by the challenges posed by the task of commonsense conceptual combination itself: in order to generate plausible and creative compounds it is necessary to maintain a level of surprise in the combination. Thus both the scenarios inheriting all the properties of the two concepts and those inheriting all the properties of the HEAD are discarded, since they prevent this surprise;
• among the remaining ones, we discard those inheriting properties from the MODIFIER in conflict with properties that could be consistently inherited from the HEAD;
• if the set of scenarios of the current block is empty, i.e. all the scenarios have been discarded either because trivial or because preferring the MODIFIER, we repeat the procedure by considering the block of scenarios having the immediately lower probability.

The remaining scenarios are those selected by the logic $\mathbf{T}^{CL}$. The ultimate output of our mechanism is a knowledge base in the logic $\mathbf{T}^{CL}$ whose set of typicality properties is enriched by those of the compound concept $C$. Given a scenario $w$ satisfying the above properties, we define the properties of $C$ as the set of inclusions $p :: \mathbf{T}(C) \sqsubseteq D$, for all $\mathbf{T}(C) \sqsubseteq D$ that are entailed from $w$ in the logic $\mathbf{T}^{CL}$. The probability $p$ is such that:

• if $\mathbf{T}(C_H) \sqsubseteq D$ is entailed from $w$, that is to say $D$ is a property inherited from the HEAD (or from both the HEAD and the MODIFIER), then $p$ corresponds to the degree of belief of such inclusion of the HEAD in the initial knowledge base, i.e. $p :: \mathbf{T}(C_H) \sqsubseteq D \in \mathcal{T}$;
• otherwise, i.e. $\mathbf{T}(C_M) \sqsubseteq D$ is entailed from $w$, then $p$ corresponds to the degree of belief of such inclusion of the MODIFIER in the initial knowledge base, i.e. $p :: \mathbf{T}(C_M) \sqsubseteq D \in \mathcal{T}$.

The knowledge base obtained as the result of combining concepts $C_H$ and $C_M$ into the compound concept $C$ is called the $C$-revised knowledge base, and it is defined as follows:

$$\mathcal{K}_C = \langle \mathcal{R}, \mathcal{T} \cup \{p :: \mathbf{T}(C) \sqsubseteq D\}, \mathcal{A} \rangle,$$

for all $D$ such that either $\mathbf{T}(C_H) \sqsubseteq D$ is entailed in $w$ or $\mathbf{T}(C_M) \sqsubseteq D$ is entailed in $w$, and $p$ is defined as above.

As an example, consider the following version of the Pet-Fish problem. Let KB contain the following inclusions:

$\mathit{Fish} \sqsubseteq \mathit{LivesInWater}$ (1)
$0.6 :: \mathbf{T}(\mathit{Fish}) \sqsubseteq \mathit{Greyish}$ (2)
$0.8 :: \mathbf{T}(\mathit{Fish}) \sqsubseteq \mathit{Scaly}$ (3)
$0.8 :: \mathbf{T}(\mathit{Fish}) \sqsubseteq \neg \mathit{Affectionate}$ (4)
$0.9 :: \mathbf{T}(\mathit{Pet}) \sqsubseteq \neg \mathit{LivesInWater}$ (5)
$0.9 :: \mathbf{T}(\mathit{Pet}) \sqsubseteq \mathit{LovedByKids}$ (6)
$0.9 :: \mathbf{T}(\mathit{Pet}) \sqsubseteq \mathit{Affectionate}$ (7)

representing that a typical fish is greyish (2), scaly (3) and not affectionate (4), whereas a typical pet does not live in water (5), is loved by kids (6) and is affectionate (7). Concerning rigid properties, we have that all fishes live in water (1). The logic $\mathbf{T}^{CL}$ combines the concepts $\mathit{Pet}$ and $\mathit{Fish}$ by using the latter as the HEAD and the former as the MODIFIER. The prototypical Pet-Fish inherits from the prototypical fish the fact that it is scaly and not affectionate, the latter by giving preference to the HEAD, since such a property conflicts with the opposite one in the MODIFIER (a typical pet is affectionate). The scenarios in which all three typical properties of a typical fish are inherited by the combined concept are considered trivial and, therefore, discarded; as a consequence, the property having the lowest degree ($\mathit{Greyish}$ with degree 0.6) is not inherited. The prototypical Pet-Fish inherits from the prototypical pet only property (6), since (5) conflicts with the rigid property (1), stating that all fishes (hence also pet fishes) live in water, whereas (7) is blocked, as already mentioned, by the HEAD/MODIFIER heuristics. Formally, the $\mathit{Pet} \sqcap \mathit{Fish}$-revised knowledge base contains, in addition to the above inclusions, the following ones:

$0.8 :: \mathbf{T}(\mathit{Pet} \sqcap \mathit{Fish}) \sqsubseteq \mathit{Scaly}$ (3')
$0.8 :: \mathbf{T}(\mathit{Pet} \sqcap \mathit{Fish}) \sqsubseteq \neg \mathit{Affectionate}$ (4')
$0.9 :: \mathbf{T}(\mathit{Pet} \sqcap \mathit{Fish}) \sqsubseteq \mathit{LovedByKids}$ (6')

In [15] we have shown that reasoning in $\mathbf{T}^{CL}$ remains in the same complexity class as standard $\mathcal{ALC}$ Description Logics.

Theorem 1 Reasoning in $\mathbf{T}^{CL}$ is EXPTIME-complete.

3 GENERATING NOVEL GENRES FOR THE PLATFORM RAIPLAY

In this section we describe DENOTER, the system exploiting the logic $\mathbf{T}^{CL}$ in order to generate and suggest novel editorial genres for RaiPlay (https://www.raiplay.it), the online platform of on-demand contents of the Italian multimedia broadcaster RAI (Radiotelevisione Italiana, http://www.rai.it). DENOTER is implemented in Python and makes use of the library owlready2 (https://pythonhosted.org/Owlready2/) to rely on the services of efficient DL reasoners (like HermiT). DENOTER first builds a prototypical description of the basic genres available in RaiPlay, namely: action/adventure, kids, comedy, drama, science fiction, horror, musical, religious, sentimental, and thriller. A screenshot of the platform is reported in Figure 1.

To this aim, a web crawler extracts metadata from the multimedia contents available on the platform. In more detail, for each item (program, episode, etc.) the crawler extracts (i) the genre to which it belongs and (ii) the set of "significant" words (i.e., excluding prepositions, proper names, articles, etc.) occurring in the description of each item, as well as their frequency. This information is used in order to provide a description of each basic genre in terms of its typical properties in the logic $\mathbf{T}^{CL}$, where the frequency of a concept/word for a genre is obtained from the number of occurrences of such a concept/word in the items belonging to that genre. The five properties with the highest frequency over 0.5 are included in the prototypical description of each basic genre. Formally, we have:

Definition 4 Given a multimedia item $m$, let $S_m$ be the set of significant concepts extracted for $m$ by the web crawler, and let $\mathit{Concept} \in S_m$. Let $n_{m,\mathit{Concept}}$ be the number of occurrences of $\mathit{Concept}$ in the description of $m$. We define the frequency $f_{m,\mathit{Concept}}$ of concept $\mathit{Concept}$ for the item $m$ as

$$f_{m,\mathit{Concept}} = \frac{n_{m,\mathit{Concept}}}{\sum_{D \in S_m} n_{m,D}}.$$

Definition 5 Given a basic genre $\mathit{Genre}$, let $\mathit{MI}$ be the set of multimedia items assigned to/labelled as belonging to $\mathit{Genre}$, and let $S_{\mathit{Genre}}$ be the set of the concepts occurring in such items, i.e. $S_{\mathit{Genre}} = \bigcup_{m \in \mathit{MI}} S_m$, where $S_m$ is as in Definition 4. Given a concept $\mathit{Concept} \in S_{\mathit{Genre}}$ and an item $m \in \mathit{MI}$, let $n_{m,\mathit{Concept}}$ be the number of occurrences of $\mathit{Concept}$ in the description of $m$.
We define $n_{\mathit{Genre},\mathit{Concept}}$ as the number of occurrences of $\mathit{Concept}$ in the description of items of $\mathit{Genre}$, i.e.

$$n_{\mathit{Genre},\mathit{Concept}} = \sum_{m \in \mathit{MI}} n_{m,\mathit{Concept}}.$$

We also define the frequency of a concept $\mathit{Concept}$ for a genre $\mathit{Genre}$, written $f_{\mathit{Genre},\mathit{Concept}}$, as follows:

$$f_{\mathit{Genre},\mathit{Concept}} = \frac{n_{\mathit{Genre},\mathit{Concept}}}{\sum_{C \in S_{\mathit{Genre}}} n_{\mathit{Genre},C}}.$$

Figure 1. A screenshot of the RaiPlay platform.

The prototypical description of a basic $\mathit{Genre}$ in the logic $\mathbf{T}^{CL}$ is defined as the set of inclusions $p_1 :: \mathbf{T}(\mathit{Genre}) \sqsubseteq \mathit{TypicalConcept}_1$, $p_2 :: \mathbf{T}(\mathit{Genre}) \sqsubseteq \mathit{TypicalConcept}_2$, ..., $p_5 :: \mathbf{T}(\mathit{Genre}) \sqsubseteq \mathit{TypicalConcept}_5$, where $\mathit{TypicalConcept}_1$, $\mathit{TypicalConcept}_2$, ..., $\mathit{TypicalConcept}_5$ are the five concepts in $S_{\mathit{Genre}}$ with the highest frequencies above 50%; frequencies are then also used as degrees of belief of the respective inclusions. Formally:

Definition 6 Given a genre $\mathit{Genre}$, let the set of concepts $S_{\mathit{Genre}}$ of Definition 5 be listed in descending order of the frequencies $f_{\mathit{Genre},\mathit{Concept}}$ of Definition 5:

$$S_{\mathit{Genre}} = \langle C_1, C_2, \ldots, C_k \rangle$$

where $f_{\mathit{Genre},C_1} \geq f_{\mathit{Genre},C_2} \geq \ldots \geq f_{\mathit{Genre},C_k} > 0.5$. The prototypical description of $\mathit{Genre}$ in the logic $\mathbf{T}^{CL}$ is defined as the set of inclusions:

$f_{\mathit{Genre},C_1} :: \mathbf{T}(\mathit{Genre}) \sqsubseteq C_1$
$f_{\mathit{Genre},C_2} :: \mathbf{T}(\mathit{Genre}) \sqsubseteq C_2$
$\vdots$
$f_{\mathit{Genre},C_5} :: \mathbf{T}(\mathit{Genre}) \sqsubseteq C_5$

As an example, consider the basic genre $\mathit{Religious}$. The episodes/multimedia items labelled as belonging to such a genre are "Giacobbe", "Gesù di Nazareth" and "Francesco". All contain the word/concept $\mathit{God}$, whereas $\mathit{Life}$ appears in the latter two, and both concepts are among the five most frequent ones.

In some cases, i.e. when possible, RAI experts have also manually added some rigid properties, thus integrating the bottom-up, data-driven process of prototype formation with top-down expert knowledge. In the example above, properties like $\mathit{History}$ and $\mathit{Faith}$, commonly associated with such a genre, have been added. Therefore, the knowledge base generated by the crawler will contain, among others, the following inclusions:

$0.9 :: \mathbf{T}(\mathit{Religious}) \sqsubseteq \mathit{God}$
$0.7 :: \mathbf{T}(\mathit{Religious}) \sqsubseteq \mathit{Life}$
$\mathit{Religious} \sqsubseteq \mathit{History}$
$\mathit{Religious} \sqsubseteq \mathit{Faith}$

DENOTER generates novel hybrid genres by combining existing ones (by using the same logical procedure as in the Pet-Fish problem). As an example, consider the following prototypes of the basic genres $\mathit{Kids}$ and $\mathit{Drama}$:

$\mathit{Kids} \sqsubseteq \neg \mathit{Sex}$
$\mathit{Kids} \sqsubseteq \neg \mathit{Homicide}$
$0.72 :: \mathbf{T}(\mathit{Kids}) \sqsubseteq \mathit{Queen}$
$0.64 :: \mathbf{T}(\mathit{Kids}) \sqsubseteq \mathit{World}$
$0.62 :: \mathbf{T}(\mathit{Kids}) \sqsubseteq \mathit{Adventure}$
$0.6 :: \mathbf{T}(\mathit{Kids}) \sqsubseteq \neg \mathit{DeadPerson}$

$\mathit{Drama} \sqsubseteq \neg \mathit{Happiness}$
$0.85 :: \mathbf{T}(\mathit{Drama}) \sqsubseteq \mathit{Homicide}$
$0.83 :: \mathbf{T}(\mathit{Drama}) \sqsubseteq \mathit{Life}$
$0.7 :: \mathbf{T}(\mathit{Drama}) \sqsubseteq \mathit{DeadPerson}$

DENOTER combines the two basic genres by implementing a variant of CoCoS [17], a Python implementation of reasoning services for the logic $\mathbf{T}^{CL}$, in order to exploit efficient DL reasoners for checking both the consistency of each generated scenario and the existence of conflicts among properties. In more detail, DENOTER considers both the available choices for the HEAD and the MODIFIER, and it allows one to restrict the combination to a given, fixed number of inherited properties. As an example, the new, derived genre combining kids and drama with the limit fixed to four properties has the following $\mathbf{T}^{CL}$ description (concept $\mathit{Kids} \sqcap \mathit{Drama}$):

$0.83 :: \mathbf{T}(\mathit{Kids} \sqcap \mathit{Drama}) \sqsubseteq \mathit{Life}$
$0.72 :: \mathbf{T}(\mathit{Kids} \sqcap \mathit{Drama}) \sqsubseteq \mathit{Queen}$
$0.7 :: \mathbf{T}(\mathit{Kids} \sqcap \mathit{Drama}) \sqsubseteq \mathit{DeadPerson}$
$0.64 :: \mathbf{T}(\mathit{Kids} \sqcap \mathit{Drama}) \sqsubseteq \mathit{World}$
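As a rough illustration of how rigid constraints, the HEAD/MODIFIER preference and the limit on inherited properties shape the description above, the following Python sketch applies a greedy simplification to the Kids and Drama prototypes. It is a toy under stated assumptions and not the scenario-based procedure of the logic as implemented by DENOTER: there is no DL reasoner, no enumeration of scenarios by probability blocks and no check for trivial scenarios. The function names `negate` and `combine`, the encoding of negation with a "not " prefix and the choice of Drama as HEAD (consistent with DeadPerson prevailing over its negation in the description above) are all hypothetical.

```python
def negate(concept):
    """Return the complement of a concept written with an optional 'not ' prefix."""
    return concept[4:] if concept.startswith("not ") else "not " + concept


def combine(head, modifier, rigid, limit):
    """Greedy toy combination (NOT the scenario-based selection of the logic).

    head, modifier: lists of (degree, concept) typical properties.
    rigid: concepts that every instance of the combined genre must satisfy.
    limit: maximum number of inherited typical properties.
    """
    forbidden = {negate(c) for c in rigid}  # ruled out by rigid inclusions
    chosen = {}
    # HEAD properties are processed first, so they win any conflict
    # with the opposite property of the MODIFIER.
    for source in (head, modifier):
        for degree, concept in source:
            if concept in forbidden:
                continue
            if concept in chosen or negate(concept) in chosen:
                continue
            chosen[concept] = degree
    ranked = sorted(chosen.items(), key=lambda item: -item[1])
    return [(degree, concept) for concept, degree in ranked[:limit]]


# Prototypes copied from the Kids and Drama example in the text.
rigid = ["not Sex", "not Homicide", "not Happiness"]
kids = [(0.72, "Queen"), (0.64, "World"), (0.62, "Adventure"), (0.6, "not DeadPerson")]
drama = [(0.85, "Homicide"), (0.83, "Life"), (0.7, "DeadPerson")]

derived = combine(head=drama, modifier=kids, rigid=rigid, limit=4)
for degree, concept in derived:
    print(f"{degree} :: T(Kids and Drama) is-a {concept}")
```

On this particular input the greedy pass happens to return the same four properties as the description above (Life, Queen, DeadPerson, World). On other inputs it can differ from the logic: in the Pet-Fish example, for instance, it would also keep Greyish, which the scenario-based procedure discards as part of a trivial scenario.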

Obviously, the rigid properties of both basic concepts $\mathit{Kids}$ and $\mathit{Drama}$ are inherited by the derived concept, and this prevents the system from considering the property $\mathit{Homicide}$, even if it has the highest probability/degree of belief in the prototypical description of $\mathit{Drama}$. DENOTER is also able to involve derived genres in the concept combination: for instance, we can combine the derived genre $\mathit{Action} \sqcap \mathit{Sentimental}$ and the above $\mathit{Kids} \sqcap \mathit{Drama}$.

4 RE-CLASSIFICATION AND SUGGESTIONS OF MULTIMEDIA CONTENTS IN RAIPLAY

Apart from the process of automatic knowledge generation, DENOTER is also able to reclassify the multimedia items/episodes of RaiPlay within the novel derived genres (generated as described in the previous section). As mentioned, indeed, each multimedia item/episode is equipped with some information available in RaiPlay, namely: title, name of the program/episode, description of the program/series, description of the episode. DENOTER extracts such information and then computes the frequencies of the concepts in it as in Definition 4, in order to compare them with the properties of a derived genre. If the item contains all the rigid properties and at least 30% of the typical properties of the genre under consideration, then the multimedia content is classified as belonging to it. Last, DENOTER suggests the set of classified contents in descending order of compatibility, where the rank of compatibility of a single item with respect to a genre is intuitively obtained as the sum of the frequencies of "compatible" concepts, i.e. concepts belonging to both the item and the prototypical description of the genre. Formally:

Definition 7 Given a multimedia item $m$, let $\mathit{DerivedGenre}$ be a derived genre as defined in Section 3 and let $S_m$ be the set of concepts/words occurring in $m$ as in Definition 4. Given a knowledge base KB of genres built by DENOTER, we say that $m$ is compatible with $\mathit{DerivedGenre}$ if the following conditions hold:

1. $m$ contains all rigid properties of $\mathit{DerivedGenre}$, i.e. $\{C \mid \mathit{DerivedGenre} \sqsubseteq C \in \text{KB}\} \subseteq S_m$
2. $m$ contains at least 30% of the typical properties of $\mathit{DerivedGenre}$, i.e.

$$\frac{|S_m \cap S_{\mathit{DerivedGenre}}|}{|S_{\mathit{DerivedGenre}}|} \geq 0.3,$$

where $S_{\mathit{DerivedGenre}}$ is the set of typical properties of $\mathit{DerivedGenre}$ as in Definition 6.

As an example, consider the above derived genre $\mathit{Kids} \sqcap \mathit{Drama}$, and the multimedia items "Eneide" (https://www.raiplay.it/programmi/eneide) and "Raccontami" (https://www.raiplay.it/programmi/raccontami). They are both reclassified in the novel, generated genre $\mathit{Kids} \sqcap \mathit{Drama}$, since:

• all rigid properties of both basic genres are satisfied, that is to say none of $\mathit{Sex}$, $\mathit{Homicide}$ and $\mathit{Happiness}$ belongs to the properties extracted by the crawler for either item;
• more than 30% of the typical properties of the derived genre are satisfied by the items: in particular, "Eneide" has $\mathit{Life}$ with frequency 0.6, $\mathit{Queen}$ (0.65) and $\mathit{DeadPerson}$ (0.65), whereas "Raccontami" has $\mathit{Life}$ (0.62) and $\mathit{Queen}$ (0.6).

These two items will then be recommended by DENOTER in this order: thanks to its three compatible properties, "Eneide" has a higher score ($0.6 + 0.65 + 0.65 = 1.9$) than "Raccontami" ($0.62 + 0.6 = 1.22$).

It is worth noticing that DENOTER is applied to a single multimedia item; therefore it is normally applied to an episode of a given series, rather than to a whole series. This allows a finer selection of multimedia contents: since single episodes of a series often propose significantly different contents, it could be plausible to suggest only some episodes to match the users' objectives.

5 EVALUATION AND DISCUSSION

DENOTER has been tested in a threefold way. The first evaluation is completely automatic and concerns the capability of the system of generating novel hybrid genres that can be populated by the original content of the RaiPlay platform via a re-classification mechanism involving the 4612 multimedia items of the platform. In this case, the success criterion is the avoidance of empty boxes corresponding to the newly generated combined genres.

A second evaluation, aimed at measuring the satisfaction of the potential users of the platform when exposed to the contents of the novel categories suggested by DENOTER, consisted of a user study¹² involving 20 persons (5 females, 15 males, aged 20-35) who evaluated a total of 122 recommendations generated by the system. All the participants were selected from the same population, i.e., voluntary students at the Departments of Psychology and Computer Science of the University of Turin, using an availability sampling strategy. Participants were all naive to the experimental procedure and to the aims of the study. This evaluation was carried out as a classical "one to one" lab-controlled experiment (i.e. one person at a time with one expert interviewer) and we adopted a thinking aloud protocol¹³. At this stage, this solution was methodologically preferred to the adoption of large-scale online surveys, since it allowed us to have more control over the type of thoughts and considerations emerging during the evaluation of the results. In this setting, the users had to start the interview by indicating a couple of preferred genres among those available in RaiPlay. This selection triggered both the activation of a novel hybrid prototypical genre by DENOTER and the corresponding reclassification of the RaiPlay multimedia contents based on such a selection. The output of the system, pruned to show the top 5 results, was then evaluated with a 1-10 voting scale expressing the satisfaction with the received recommendations.

¹² This is one of the most commonly used methodologies for the evaluation of recommender systems based on controlled small groups analysis, see [24].
¹³ This technique consists in recording the verbal explanations provided by people while executing a given laboratory task. It has been used in the AI literature since the pioneering work by Newell and Simon, as a source to identify the heuristics used by humans to solve a given task [18, 19].

The results and the insights of the first two evaluations are reported in Table 1. In particular, the first part of the table (total) reports respectively:

i the average reclassification results calculated for all the novel concepts generated by DENOTER (on average 180 items out of 4612 were reclassified for each novel genre obtained via concept combination);
ii the average vote assigned by the users to the recommendations of the reclassified elements (with an average score of 6.5 out of 10). This score was calculated by considering, for each new category, the score assigned to the top 5 reclassified items, since they were provided to the users as recommendations for the novel genres.

The first column of the table also reports the details of the percentage of reclassified elements (calculated on the total of all the 4612 items

available in RaiPlay) for the combined genres obtained by considering the original selections made by the users based on their preferences. For the same set of novel combined genres, the second column reports the average score assigned by the users to the recommendations of each combined genre. Overall, the obtained results are encouraging, since the average rating assigned is above 6 (on a 1-10 scale).

Combined Genres     Automatic Reclassification              Average User Vote
(total)             180 items per combination, 3.9%         6.5 out of 10
Thriller-Musical    1.3%                                    8.5
Thriller-Fantasy    1.3%                                    6.4
Thriller-Comedy     0.71%                                   4.7
Thriller-Action     2.9%                                    7.2
Romance-Thriller    1.7%                                    7
Romance-Comedy      1.4%                                    9
Romance-Drama       4.2%                                    5.4
Musical-Fantasy     2.4%                                    7.3
Musical-Comedy      1.89%                                   7.2
Musical-Action      4.29%                                   6.5
Fantasy-Thriller    1.25%                                   8
Fantasy-Comedy      0.84%                                   4.6
Fantasy-Action      2.5%                                    5.6
Drama-Thriller      3.2%                                    5
Drama-Fantasy       3.14%                                   6.5
Comedy-Romance      1.28%                                   7.5
Comedy-Musical      1.89%                                   4.6
Comedy-Drama        2.69%                                   7.6

Table 1. Combined results of the first two evaluations.

A final, qualitative evaluation was done with a small group of three experts of RAI in the form of an expert interview (the interview was carried out after a demo of the system and of the recommendations it produced in RaiPlay). From this interview three main problematic elements emerged, which affected the overall quality (and therefore the assigned ratings) of the recommended items and which, once solved, can contribute to improving the accuracy of the recommendations. First, the experts pointed out that the quality of some recommendations (the low ranked ones) was affected by a misalignment between the textual description of the recommended item and its actual content: the descriptions associated with the items did not always report information about the content of the programme, but sometimes only very generic information (e.g. describing the plot of a whole TV series and not of the current episode), and in some cases this led to counter-intuitive results. Second, some of the (low ranked) items corresponded to very old TV episodes or movies made before the '60s. The recommendation of such items was somewhat unexpected for the interviewed users, who expected to be exposed to more recent content (this particular problem also emerged during the thinking aloud protocol). Finally, the experts pointed out the importance of considering (in the initial selection phase) the possibility of also expressing negative preferences (e.g. ¬Kids) in order to additionally filter out some unwanted content.

Notably, the first two of the above mentioned issues are not directly related to DENOTER, since: i) the system cannot know whether the association description/item is coherent, but it just provides (for the recommended output) the correspondence already in place in RaiPlay; ii) the recommendation of old editorial contents is based on the actual dataset of RaiPlay (collecting hundreds of TV shows, movies etc. from 1954 to the present day). This element can be overcome by simply adding a filter on the period preferences of the users. Finally, iii) negative preferences can be expressed in DENOTER by including the negation of the undesired prototypical descriptions. All these issues can therefore be addressed to improve the performance of the system and its adoption in the production phase. Overall, the experts agreed in considering DENOTER a good approach to addressing the very well known filter bubble effect [21], by introducing seeds of serendipity in content discovery by users.

One fundamental discussion about the applicability of DENOTER in practice is whether or not it represents a truly innovative technical solution for a multimedia recommender system. In recent years this field has been characterised by fervent research, which finds in the RecSys conference series the reference venue for publication¹⁴. According to [25], recommender systems "try to identify the need and preferences of users, filter the huge collection of data accordingly and present the best suited option before the users by using some well-defined mechanism". Despite the huge amount of proposals, the main families of recommender systems can be identified as based on: i) collaborative filtering; ii) content-based filtering; iii) hybrid filtering. At their core, collaborative filtering exploits similarities of usage patterns among mutually affine users, while content-based filtering exploits content similarity. DENOTER by definition falls into the latter category, since in its current form it uses content descriptions as input. From the technical point of view, however, it differs from the current mainstream approaches, which are mostly based on the comparison and matching of visual and aural features of the content [26, 5], by adding a logic framework capable of mapping and representing genuinely new intuitive principles influencing user preferences and usage attitudes, which cannot be derived from the pure analysis of content and/or the comparison of similar users. Furthermore, it adapts natively to industrial contexts in which the editorial input has to be merged with automatic recommendation, since both kinds of input can be effectively processed by the same framework.

¹⁴ https://recsys.acm.org/

6 CONCLUSIONS AND FUTURE WORKS

In this work we have presented DENOTER, a knowledge-based system for the dynamic generation of novel media genres, exploiting the reasoning mechanism of the logic TCL in order to generate, reclassify and suggest novel content genres in the context of RaiPlay, the online platform of RAI. The system has been tested in a threefold evaluation showing promising results for both the automatic evaluation and the user acceptability of the recommended items. In addition, a last evaluation conducted with experts has provided some valuable feedback that can be addressed in order to improve the results provided by the system (in particular as far as the recommendation phase is concerned). The core component of DENOTER relies on CoCoS, a tool for combining concepts in the logic TCL. In future research, we aim at studying the application of the optimization techniques in [1] in order to improve the efficiency of CoCoS and, as a consequence, of the proposed knowledge generation system. Secondly, we aim at considering more accurate descriptions of media items than the online descriptions used in this work, namely Automatic Speech Recognition data and semantic visual categories extracted from the video and audio channels of the content. Finally, as a mid-term goal, we plan to conduct a large scale experiment to further validate the effectiveness of the proposed approach.

ACKNOWLEDGEMENTS

This work is partially supported by the INdAM project GNCS 2019 "METALLIC #2".
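The compatibility test and ranking of Definition 7 (Section 4) can be illustrated with a short Python sketch. This is not the DENOTER implementation, only a minimal reading of the definition under stated assumptions. The function names `compatibility_score` and `rank_items` are hypothetical. Rigid properties are split into a `required` set (concepts that must occur, as in condition 1) and a `forbidden` set (negated rigid properties, which the "Eneide"/"Raccontami" example treats as concepts that must not occur). The typical properties `Family` and `Friendship` and the third item are invented for illustration, while the frequencies of "Eneide" and "Raccontami" are those reported in Section 4.

```python
from typing import Dict, List, Optional, Set, Tuple


def compatibility_score(
    item_freqs: Dict[str, float],
    required: Set[str],
    forbidden: Set[str],
    typical: Set[str],
    threshold: float = 0.3,
) -> Optional[float]:
    """Return the compatibility rank of an item, or None if it is not compatible."""
    concepts = set(item_freqs)  # S_m: concepts occurring in the item
    # Condition 1: every (positive) rigid property must occur in the item.
    if not required <= concepts:
        return None
    # Assumption: a negated rigid property means the concept must be absent.
    if forbidden & concepts:
        return None
    # Condition 2: at least `threshold` of the typical properties must occur.
    shared = concepts & typical
    if not typical or len(shared) / len(typical) < threshold:
        return None
    # Rank: sum of the frequencies of the "compatible" concepts.
    return sum(item_freqs[c] for c in shared)


def rank_items(
    items: Dict[str, Dict[str, float]],
    required: Set[str],
    forbidden: Set[str],
    typical: Set[str],
    threshold: float = 0.3,
) -> List[Tuple[str, float]]:
    """Keep the compatible items and sort them by descending compatibility."""
    scored = []
    for title, freqs in items.items():
        score = compatibility_score(freqs, required, forbidden, typical, threshold)
        if score is not None:
            scored.append((title, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical description of the derived genre Kids ⊓ Drama.
typical_props = {"Life", "Queen", "DeadPerson", "Family", "Friendship"}
forbidden_props = {"Sex", "Homicide", "Happiness"}

items = {
    "Raccontami": {"Life": 0.62, "Queen": 0.6},
    "Eneide": {"Life": 0.6, "Queen": 0.65, "DeadPerson": 0.65},
    # Invented item that violates a negated rigid property.
    "Hypothetical crime episode": {"Life": 0.7, "Queen": 0.5, "Homicide": 0.9},
}

ranking = rank_items(items, set(), forbidden_props, typical_props)
for title, score in ranking:
    print(f"{title}: {score:.2f}")
```

Under these assumptions "Eneide" is ranked before "Raccontami" (a score of about 1.90 against 1.22), in line with the order described in Section 4, and the invented third item is discarded because it contains Homicide. The same `forbidden` set could also carry user-supplied negative preferences such as ¬Kids, which is one possible way to realise the filter suggested by the experts.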

REFERENCES

[1] Marco Alberti, Elena Bellodi, Giuseppe Cota, Fabrizio Riguzzi, and Riccardo Zese, ‘cplint on SWISH: probabilistic logical inference with a web browser’, Intelligenza Artificiale, 11(1), 47–64, (2017).
[2] Franz Baader, Diego Calvanese, Deborah McGuinness, Peter Patel-Schneider, and Daniele Nardi, The description logic handbook: Theory, implementation and applications, Cambridge University Press, 2003.
[3] Margaret A Boden, ‘Creativity and artificial intelligence’, Artificial Intelligence, 103(1-2), 347–356, (1998).
[4] Roberto Confalonieri, Marco Schorlemmer, Oliver Kutz, Rafael Peñaloza, Enric Plaza, and Manfred Eppe, ‘Conceptual blending in EL++’, in Proc. of the 29th Int. Work. on Description Logics, DL 2016, (2016).
[5] Yashar Deldjoo, Mihai Gabriel Constantin, Hamid Eghbal-Zadeh, Bogdan Ionescu, Markus Schedl, and Paolo Cremonesi, ‘Audio-visual encoding of multimedia content for enhancing movie recommendations’, in Proceedings of the 12th ACM Conference on Recommender Systems, pp. 455–459. ACM, (2018).
[6] Manfred Eppe, Ewen Maclean, Roberto Confalonieri, Oliver Kutz, Marco Schorlemmer, Enric Plaza, and Kai-Uwe Kühnberger, ‘A computational framework for conceptual blending’, Artificial Intelligence, 256, 105–129, (2018).
[7] Marcello Frixione and Antonio Lieto, ‘Representing and reasoning on typicality in formal ontologies’, in Proceedings of the 7th International Conference on Semantic Systems, pp. 119–125. ACM, (2011).
[8] Laura Giordano, Valentina Gliozzi, Nicola Olivetti, and Gian Luca Pozzato, ‘Semantic characterization of Rational Closure: from Propositional Logic to Description Logics’, Artificial Intelligence, 226, 1–33, (2015).
[9] James A Hampton, ‘Inheritance of attributes in natural concept conjunctions’, Memory & Cognition, 15(1), 55–71, (1987).
[10] Daniel Lehmann and Menachem Magidor, ‘What does a conditional knowledge base entail?’, Artificial Intelligence, 55(1), 1–60, (1992).
[11] Martha Lewis and Jonathan Lawry, ‘Hierarchical conceptual spaces for concept combination’, Artificial Intelligence, 237, 204–227, (2016).
[12] Antonio Lieto, Federico Perrone, Gian Luca Pozzato, and Eleonora Chiodino, ‘Beyond subgoaling: A dynamic knowledge generation framework for creative problem solving in cognitive architectures’, Cognitive Systems Research, 58, 305–316, (2019).
[13] Antonio Lieto and Gian Luca Pozzato, ‘A description logic of typicality for conceptual combination’, in Foundations of Intelligent Systems - 24th International Symposium, ISMIS 2018, Limassol, Cyprus, October 29-31, 2018, Proceedings, eds., Michelangelo Ceci, Nathalie Japkowicz, Jiming Liu, George A. Papadopoulos, and Zbigniew W. Ras, volume 11177 of Lecture Notes in Computer Science, pp. 189–199. Springer, (2018).
[14] Antonio Lieto and Gian Luca Pozzato, ‘Applying a description logic of typicality as a generative tool for concept combination in computational creativity’, Intelligenza Artificiale, 13(1), 93–106, (2019).
[15] Antonio Lieto and Gian Luca Pozzato, ‘A description logic framework for commonsense conceptual combination integrating typicality, probabilities and cognitive heuristics’, Journal of Experimental & Theoretical Artificial Intelligence, arXiv:1811.02366, (2020).
[16] Antonio Lieto, Gian Luca Pozzato, Federico Perrone, and Eleonora Chiodino, ‘Knowledge capturing via conceptual reframing: A goal-oriented framework for knowledge invention’, in Proceedings of the 10th ACM Conference on Knowledge Capture, K-CAP 2019, Marina del Rey, pp. 109–114. ACM, (2019).
[17] Antonio Lieto, Gian Luca Pozzato, and Alberto Valese, ‘COCOS: a typicality based COncept COmbination System’, in Proceedings CILC 2018, eds., M. Montali and P. Felli, pp. 55–59, (2018).
[18] Allen Newell, John C Shaw, and Herbert A Simon, ‘Report on a general problem solving program’, in IFIP congress, volume 256, p. 64. Pittsburgh, PA, (1959).
[19] Allen Newell and Herbert A. Simon, Human problem solving, volume 104, n. 9, Prentice-Hall Englewood Cliffs, NJ, 1972.
[20] Daniel N Osherson and Edward E Smith, ‘On the adequacy of prototype theory as a theory of concepts’, Cognition, 9(1), 35–58, (1981).
[21] E. Pariser, The Filter Bubble: What the Internet Is Hiding from You, 2012.
[22] Fabrizio Riguzzi, Elena Bellodi, Evelina Lamma, and Riccardo Zese, ‘Probabilistic description logics under the distribution semantics’, Semantic Web, 6(5), 477–501, (2015).
[23] Fabrizio Riguzzi, Elena Bellodi, Evelina Lamma, and Riccardo Zese, ‘Reasoning with probabilistic ontologies’, in Proceedings of IJCAI 2015, eds., Qiang Yang and Michael Wooldridge, pp. 4310–4316. AAAI Press, (2015).
[24] Guy Shani and Asela Gunawardana, ‘Evaluating recommendation systems’, in Recommender systems handbook, 257–297, Springer, (2011).
[25] Shahab Saquib Sohail, Jamshed Siddiqui, and Rashid Ali, ‘Classifications of recommender systems: A review’, Engineering Science and Technology Review, 10(4), 132–153, (2017).
[26] Mingxuan Sun, Fei Li, and Jian Zhang, ‘A multi-modality deep network for cold-start recommendation’, Big Data and Cognitive Computing, 2(7), (2018).