A Logic Based Approach for the Multimedia Data Representation and Retrieval
Total Page:16
File Type:pdf, Size:1020Kb
A Logic Based Approach for the Multimedia Data Representation and Retrieval Samira Hammiche, Salima Benbernou Athena Vakali LIRIS – University Claude Bernard Lyon 1 Department of Informatics 43, bld du 11 Novembre 1918 Aristotle University of Thessaloniki 69622 Villeurbanne - France 54006 Thessaloniki – Greece {shammich, sbenbern}@liris.univ-lyon1.fr [email protected] Abstract The Moving Picture Experts Group (MPEG) published the standard MPEG-7 1, in February 2002 which aims to Nowadays, the amount of multimedia data is increasing describe the multimedia content from different viewpoints. rapidly, and hence, there is an increasing need for efficient This standard is based on XML Schema [2] and focuses on methods to manage the multimedia content. This paper pro- the structural and low-level features, then it is suitable for poses a framework for the description and retrieval of mul- the representation of the syntactic dimension of the multi- timedia data. The data are represented at both the syntactic media content. However, XML Schema displays some lim- (structure, metadata and low level features) and semantic itations when there is a need to extract implicit information. (the meaning of the data) levels. We use the MPEG-7 stan- This is due to the fact that there is neither formal semantics dard, which provides a set of tools to describe multimedia nor inference capabilities associated with the elements of an content from different viewpoints, to represent the syntactic XML document. Consequently, MPEG-7 cannot represent level. However, due to its XML Schema based represen- the meaning (semantics) of the multimedia content. Then, tation, MPEG-7 is not suitable to represent the semantic another formalism has to be considered for the semantic di- aspect of the data in a formal and concise way. Moreover, mension, where the following issues are required: inferential mechanisms are not provided. To alleviate these • limitations, we propose to extend MPEG-7 with a domain formally define the semantics of MPEG-7 terms and ontology, formalized using a logical formalism. Then, the multimedia content; semantic aspect of the data is described using the ontol- • express these definitions in a machine understandable ogy’s vocabulary, as a set of logical expressions. We en- language; hance the ontology by a rules layer, to describe more com- plex constraints between domain concepts and relations. • and provide inferential mechanisms to allow reasoning User’s queries may concern the syntactic and/or seman- over the multimedia content. tic features. The syntactic constraints are expressed using XQuery language and evaluated using an XML query en- Logical formalisms, which are rather appropriate for the gine; whereas the semantic query constraints are expressed above requirements, are considered in our framework. using a rules language and evaluated using a specific reso- To take into account the two dimensions (syntactic and lution mechanism. semantic) in the same framework and to alleviate the limita- tion of MPEG-7 description, we propose to extend MPEG-7 with domain ontology to provide a shared and formal vo- 1. Introduction cabulary for the specification of the semantics. As descrip- tion logic languages are usual to represent ontologies, we use a specific description logic (DL) language to describe Usually, the multimedia content is studied in two paral- the semantic of the multimedia content. However, if most lel dimensions : the syntax (structure and low level features) of the DL languages are very expressive to describe the con- and the semantic (the meaning of the data). In multimedia cepts (i.e. conceptual entities of the domain), they don’t pay data representation and retrieval models, these two dimen- so much attention to the role expressions (relations between sions have not only to be taken into account, but also require concepts). Moreover, DLs formalisms have rather weak that each of them be dealt by the the most appropriate tools query languages. The simple and existing ones are mostly to it, and these sets of tools have to be integrated in a com- plimentary way in order to respond user’s queries. 1http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm Proceedings of the Seventh IEEE International Symposium on Multimedia (ISM’05) 0-7695-2489-3/05 $20.00 © 2005 IEEE based on concepts satisfiability. This technique cannot be used directly for queries involving roles and variables. To Table 1. Extract of Soccer game ontology cope with these limitations, we enhance the ontology by a Ontology Concepts rules layer (represented using Datalog rules) which provide Actor roles hierarchy ≤ ∃ a powerful query langauge. Therefore, a unified and ade- P erson , P layer P erson play.T eam quate formalism integrating rules and DLs is used to repre- AreaP layer P layer, Goalkeeper AreaP layer ⊥ sent the semantic dimension of the multimedia data. Substitute P layer, AreaP layer Substitute The semantic query processing is done in two steps, first Game event hierarchy ≤ the query is rewritten using the Datalog rules and the DL Event ∃ reasoning mechanism, then an evaluation strategy is applied RefereeAction Event r action.Referee ∃ to the rewritten queries, which is based on first order logic GameAction Event g action.P layer ∃ ¬( ) resolution algorithm. Miscellneous Event m action Referee P layer To summarize, we present a formal framework where the The concept Video (≥ 1 ) (≥ 1 ) standard MPEG-7 and an hybrid language (DL and rules) V ideo EventOccur.Event Appear.P erson are used to capture the syntactic and semantic dimensions Ontology roles × respectively, of the multimedia data. These two dimensions AgentOf P erson Event × are related using XPath expressions. For the retrieval, we P atientOf P erson Event × use XQuery [4] and a resolution algorithm to answer user’s MemberOf P erson T eam × queries in a complimentary way. ResultOf Event Event × This paper is organized as follows: in Section 2, we give play P layer Match × a case study scenario and some preliminaries concerning the EventOccur V ideo Event × MPEG-7 standard and the hybrid language (DL and Data- Appear V ideo P erson log rules languages). Section 3 presents the logical based Datalog rules : ( ) ←− ( ) framework architecture used for the representation and re- R1 Eventactors P1,P2,E,X P erson P1 , ( ) ( ) ( ) ( ) trieval of the multimedia data. Section 4 introduces an ap- P erson P2 , Event E , V ideo X , AgentOf P1,E , ( ) ( ) plication example. In Section 5, we relate our work to the P atientOf P2,E , EventOccur X, E : ( ) ←− ( ) ( ) existing ones and we conclude in Section 6. R2 SameT eam P1,P2,T P layer P1 ,Player P2 , T eam(T ), MemeberOf(P1,T), MemberOf(P2,T) 2. Preliminaries segment where the player “Ronaldo‘” scores a goal against 2.1 A case study scenario the goalkeeper “Khan”. This query can be expressed in the Datalog query language as follows: In our proposal, we consider soccer game videos. Then, Q(x) ←− Eventactors(Ronaldo, Khan, goal, x). the data dealt with are video data and the ontology is a To answer this query, we should first rewrite it using the soccer game ontology. The soccer game ontology involves ontology vocabulary and the intentional Datalog rules, then many concepts and relations. An extract of the soccer game evaluate the resulting queries. The rewriting is done in two ontology and the rules is given in Table 1. Our DL knowl- steps: the first is to expand the Datalog intentional rule. edge base is a set of axioms using two forms. The first is Q will be rewritten into the query Q as follows: C D, which states that every instance of the concept C Q (x) ←− P erson(Ronaldo), P erson(Khan), must be an instance of the concept D; and the second form Event(goal), V ideo(x), AgentOf(Ronaldo, goal), R C × D states that the role R has the domain C and the P atientOf(Khan, goal), EventOccur(x, goal). range D. The second step is to reason over the domain ontology We introduce a Datalog rules layer on the top of the DL to infer concepts that are subsumed by the concepts in the to cope with the DL limitations about roles and query lan- query. Then, the query Q will be rewritten into a disjunc- guage. Then, it is possible to define n-ary predicates such tion of queries Q1, ..., Qn. Each of these queries is seman- as the predicate introduced in the rule R1, which captures tically equivalent to the initial query Q . an event in a video segment and its actors (for example, the goal event has a specific player as agent and a goalkeeper as 2.2 The MPEG-7 standard patient); and the predicate of the rule R2, which gives the conditions to have two players in the same team. MPEG-7, formally called “Multimedia Content Descrip- Semantic queries are posed in term of the domain ontol- tion Interface”, specifies a standard set of description tools, ogy vocabulary and the ordinary predicates defined using which are applicable for all kinds of multimedia content in- Datalog rules. For example, if a query is to obtain the video dependent of its format and coding. These tools are [15]: Proceedings of the Seventh IEEE International Symposium on Multimedia (ISM’05) 0-7695-2489-3/05 $20.00 © 2005 IEEE • A set of Descriptors (Ds), to represent the features of distributor), address information related to the management audio-visual material, (e.g. color histogram). of the content (content management information), whereas the last two are mainly devoted to the description of perceiv- • A set of Description Schemes (DSs), which define able information in term of structure and semantics (content the structure and the semantics of the relationships be- description information). tween elements, which include Ds and DSs.Anex- The semantic descriptions have advantages over syntac- ample is the hierarchical structure of a video repre- tic descriptions, because of their proximity to human under- sented by the VideoSegment DS.