A Quantum Interference Inspired Neural Matching Model for Ad-hoc Retrieval

Yongyu Jiang, Peng Zhang*, Hui Gao
College of Intelligence and Computing
Tianjin University
Tianjin, China
[email protected]

Dawei Song
School of Computer Science and Technology
Beijing Institute of Technology
Beijing, China
[email protected]

*The corresponding author.

ABSTRACT
An essential task of information retrieval (IR) is to compute the probability of relevance of a document given a query. If we regard a query term or n-gram fragment as a relevance matching unit, most retrieval models first calculate the relevance evidence between the given query and the candidate document separately, and then accumulate these evidences as the final document relevance prediction. This kind of approach obeys classical probability, which is not fully consistent with human cognitive rules in the actual retrieval process, due to the possible existence of interference effects between relevance matching units. In our work, we propose a Quantum Interference inspired Neural Matching model (QINM), which can apply the interference effects to guide the construction of additional evidence generated by the interaction between matching units in the retrieval process. Experimental results on two benchmark collections demonstrate that our approach outperforms the quantum-inspired retrieval models and some well-known neural retrieval models in the ad-hoc retrieval task.

Figure 1: Illustrative examples for different retrieval models for document relevance judgments. Panels: (a) probability of relevance, (b) classical probabilistic model, (c) dependency-based model, (d) neural matching model. Q represents the user query, which is composed of two terms q_1 and q_2; u_i denotes a query-document matching unit; and R_D represents the relevance probability of document D.

CCS CONCEPTS
• Information systems → Retrieval models and ranking.

KEYWORDS
Information Retrieval, Neural Matching Models, Quantum Interference, Learning-to-Rank

ACM Reference Format:
Yongyu Jiang, Peng Zhang*, Hui Gao and Dawei Song. 2020. A Quantum Interference Inspired Neural Matching Model for Ad-hoc Retrieval. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’20), July 25–30, 2020, Virtual Event, China. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3397271.3401070

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
SIGIR ’20, July 25–30, 2020, Virtual Event, China
© 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-8016-4/20/07...$15.00
https://doi.org/10.1145/3397271.3401070

1 INTRODUCTION
The aim of an IR system is to find the optimum retrieval mechanism, which is achieved when candidate documents are ranked according to decreasing values of the probability of relevance (with respect to the current query) [25]. One of the essential steps is how to calculate the probability of relevance of a candidate document based on a user query (see Figure 1(a)). Some classical probabilistic models (e.g., BM25 [26] and the binary independence model (BIM) [28]) assume that each query term is independent. They first calculate the co-occurrence information between each query term and the candidate document separately as the relevance evidence, and then accumulate these evidences as the final relevance probability prediction. From the description in Figure 1(b), we can find that the above process actually obeys the classical law of total probability (LTP). However, this independence assumption ignores the dependencies between terms, which play a crucial role in the document relevance judgment [11, 28].

Many dependency-based models [8, 10, 18, 27] have focused on modeling the term dependencies in the retrieval process. Typically, Metzler and Croft [18] adopt a Markov random field (MRF) to represent three variants (i.e., occurrences of single terms, ordered phrases, and unordered phrases) for capturing different dependencies between query terms. The final document relevance prediction of this method is obtained by a weighted summation of these three variables (see the example in Figure 1(c), where a combination of q_1 and q_2 is treated as a kind of dependency). Compared with classical probabilistic models, this idea calculates the co-occurrence information of phrases and applies it as additional evidence for the prediction of document relevance. However, different kinds of relevance evidence are still calculated separately. When this process is described in terms of probability theory, dependency-based models still conform to classical probability.

With the development of deep learning technology, the IR task can be formalized as a text matching task [6], and many neural matching models have been proposed, such as MP [22], Conv-KNRM [9], DRMM [13] and MIX [6]. These models usually regard a term or n-gram fragment embedding as a query-document matching unit. They calculate the relevance matching evidences of a document for each matching unit separately, and then accumulate these relevance evidences for a final relevance prediction, as shown in Figure 1(d). The idea of calculating the relevance evidence separately is similar to the aforementioned probabilistic and dependency-based models, so we investigate whether or not neural matching models are consistent with classical probability.

In our work, we first re-formalize a representative neural matching model (i.e., DRMM) in a probabilistic form, and find that its matching idea does obey the classical law of total probability (LTP). The main reason is that for different matching units, the relevance evidences are calculated separately. However, if we revisit Figure 1(a) and consider the human relevance judgement, the judgement process regards each query Q as a whole, rather than treating each query term q_i (or matching unit) separately. Some research [1, 2, 7, 32] has shown that the human cognition laws in real decision-making do not conform to classical probability, due to quantum-like interference effects. In the next section, by using the projection measurement in quantum mechanics, we will show that the LTP is violated by an interference term in the process of calculating the probability of relevance.

In the IR literature, there is much work inspired by quantum mechanics. Sordoni et al. [27] propose a quantum language model (QLM) for mapping the dependencies between words or phrases in a single text into a density matrix, but do not take into account the interference effects. Zuccon and Azzopardi [38] propose a quantum probability ranking principle (QPRP), which encodes quantum interference effects, but just explore how the user’s document relevance [...]

[...] Convolution Network and Query Attention mechanism, we select the effective matching features in the operator. Finally, the ranking score is calculated by the Multi-Layer Perceptron (MLP).

Evaluation results on a series of systematic experiments show that the proposed QINM performs well on two TREC collections, Robust-04 and ClueWeb-09-Cat-B. Our major contributions are summarized as follows:
1. We analyze the neural matching model from a probabilistic relevance point of view, and show that a typical neural model is consistent with the classical law of total probability (LTP).
2. Using the projection measurement, we show that there is an interference term in the process of measuring the probability of relevance, which can violate the LTP.
3. We further propose a Quantum Interference inspired Neural Matching model (QINM), which can model the interference effect in the neural matching method. Systematic evaluation also shows the effectiveness of this proposed method.

2 RELATED WORK
The proposed QINM formulates the quantum interference in a neural matching model. In this part, we summarize the related work on neural matching models and quantum-inspired retrieval models in Section 2.1 and Section 2.2, respectively.

2.1 Neural Matching Models
Neural matching models can be categorized as representation-based models and interaction-based models, according to their architectures [14, 23]. The representation-based models map a single document to a low-dimensional semantic space by a neural network (e.g., CNN, RNN or self-attention mechanism [17]), and calculate its distance from the query representation [15]. This kind of model is mainly concerned with semantic matching, and is highly dependent on the contextual representations of individual tokens. However, our work does not focus on exploring the contextual representation, but on the essential relevance judgment process of IR.

The interaction-based models construct the matching information (e.g., similarity) through the local interaction between a query and a document, and then calculate the matching degree through the neural network. This kind of model is mainly about relevance matching, which is more suitable for ad-hoc retrieval. Examples include