Query2Prod2Vec: Grounded Word Embeddings for eCommerce

Federico Bianchi, Bocconi University, Milano, Italy ([email protected])
Jacopo Tagliabue∗, Coveo Labs, New York, USA ([email protected])
Bingqing Yu, Coveo, Montreal, Canada ([email protected])

∗ Corresponding author. All authors contributed equally and are listed alphabetically.

Abstract

We present Query2Prod2Vec, a model that grounds lexical representations for product search in product embeddings: in our model, meaning is a mapping between words and a latent space of products in a digital shop. We leverage shopping sessions to learn the underlying space and use merchandising annotations to build lexical analogies for evaluation: our experiments show that our model is more accurate than known techniques from the NLP and IR literature. Finally, we stress the importance of data efficiency for product search outside of retail giants, and highlight how Query2Prod2Vec fits with practical constraints faced by most practitioners.

1 Introduction

The eCommerce market reached in recent years an unprecedented scale: in 2020, 3.9 trillion dollars were spent globally in online retail (Cramer-Flood, 2020). While shoppers make significant use of search functionalities, improving their experience is a never-ending quest (Econsultancy, 2020), as outside of a few retail giants users complain about sub-optimal performance (Baymard Institute, 2020). As the technology behind the industry increases in sophistication, neural architectures are gradually becoming more common (Tsagkias et al., 2020) and, with them, the need for accurate word embeddings for Information Retrieval (IR) and downstream Natural Language Processing (NLP) tasks (Yu and Tagliabue, 2020; Tagliabue et al., 2020a).

Unfortunately, the success of standard and contextual embeddings from the NLP literature (Mikolov et al., 2013a; Devlin et al., 2019) could not be immediately translated to the product search scenario, due to some peculiar challenges (Bianchi et al., 2020b), such as short text, industry-specific jargon (Bai et al., 2018) and low-resource languages; moreover, specific embedding strategies have often been developed in the context of high-traffic websites (Grbovic et al., 2016), which limits their applicability in many practical scenarios. In this work, we propose a sample-efficient word embedding method for IR in eCommerce, and benchmark it against SOTA models over industry data provided by partnering shops. We summarize our contributions as follows:

1. we propose a method to learn dense representations of words for eCommerce: we name our method Query2Prod2Vec, as the mapping between words and the latent space is mediated by the product domain;

2. we evaluate the lexical representations learned by Query2Prod2Vec on an analogy task against SOTA models in NLP and IR; benchmarks are run on two independent shops, differing in traffic, industry and catalog size;

3. we detail a procedure to generate synthetic embeddings, which allows us to tackle the "cold start" challenge;

4. we release our implementations, to help the community with the replication of our findings on other shops (public repository available at: https://github.com/coveooss/ecommerce-query-embeddings).

While perhaps not fundamental to its industry significance, it is important to remark that grounded lexical learning is well aligned with theoretical considerations on meaning in recent (and less recent) literature (Bender and Koller, 2020; Bisk et al., 2020; Montague, 1974).

2 Embeddings for Product Search: an Industry Perspective

In product search, when the shopper issues a query (e.g. "sneakers") on a shop, the shop returns a list of K products matching the query intent and possibly some contextual factor – the shopper at that point may either leave the website, or click on n products to further explore the offering and eventually make a purchase.

Unlike web search, which is exclusively performed at massive scale, product search is a problem that both big and small retailers have to solve: while word embeddings have revolutionized many areas of NLP (Mikolov et al., 2013a), word embeddings for product queries are especially challenging to obtain at scale, when considering the huge variety of use cases in the overall eCommerce industry. In particular, based on industry data and first-hand experience with dozens of shops in our network, we identify four constraints for effective word embeddings in eCommerce:

1. Short text. Most product queries are very short – 60% of all queries in our dataset are one-word queries, > 80% are two words or less; the advantage of contextualized embeddings may therefore be limited, while lexical vectors are fundamental for downstream NLP tasks (Yu and Tagliabue, 2020; Bianchi et al., 2020a). For this reason, the current work specifically addresses the quality of word embeddings: irrespective of how the lexical vectors are computed, query embeddings can be easily recovered with the usual techniques, e.g. sum or average of word embeddings (Yu et al., 2020); as we mention in the concluding remarks, investigating compositionality is an important part of our overall research agenda.

2. Low-resource languages. Even shops that have the majority of their traffic on an English domain typically have smaller shops in low-resource languages.

3. Data sparsity. In Shop X below, only 9% of all shopping sessions have a search interaction; this is a common trait verified across industries and sizes: among dozens of shops in our network, 30% is the highest search vs no-search session ratio, and Shop Y below is around 29%. Search sparsity, coupled with vertical-specific jargon and the usual long tail of search queries, makes data-hungry models unlikely to succeed for most shops.

4. Computational capacity. The majority of the market has the necessity to strike a good trade-off between the quality of lexical representations and the cost of training and deploying models, both as hardware expenses and as additional maintenance/training costs.

The embedding strategy we propose – Query2Prod2Vec – has been designed to allow efficient learning of word embeddings for product queries. Our findings are useful to a wide range of practitioners: large shops launching in new languages/countries, mid-and-small shops transitioning to dense IR architectures, and the rising wave of multi-tenant players: as A.I. providers grow by deploying their solutions on multiple shops, "cold start" scenarios are an important challenge to the viability of their business model. As an indication of the market opportunity, only in 2019 and only in the space of AI-powered search and recommendations, we witnessed Coveo (Techcrunch), Algolia (Techcrunch, 2019a) and Lucidworks (Techcrunch, 2019b) raising more than 100M USD each from venture funds.

3 Related Work

The literature on learning representations for lexical items in NLP is vast and growing fast; as an overview of classical methods, Baroni et al. (2014) benchmark several count-based and neural techniques (Landauer and Dumais, 1997; Mikolov et al., 2013b); recently, context-aware embeddings (Peters et al., 2018; Devlin et al., 2019) have demonstrated state-of-the-art performance in several semantic tasks (Rogers et al., 2020; Nozza et al., 2020), including document-based search (Nogueira et al., 2020), in which target entities are long documents instead of products (Craswell et al., 2020). To address IR-specific challenges, other embedding strategies have been proposed: Search2Vec (Grbovic et al., 2016) uses interactions with ads and pages as context in the typical context-target setting of skip-gram models (Mikolov et al., 2013b); QueryNGram2Vec (Bai et al., 2018) additionally learns embeddings for n-grams of words appearing in queries to better cover the long tail. The idea of using vectors (from images) as an aid to query representation has also been suggested as a heuristic device by Yu et al. (2020), in the context of personalized language models; this work is the first to our knowledge to benchmark embeddings on lexical semantics (not tuned for domain-specific tasks), and to investigate sample efficiency for small-data contexts.

4 Query2Prod2Vec

In Query2Prod2Vec, the representation for a query q is built through the representation of the objects that q refers to. Consider a typical shopper-engine interaction in the context of product search: the shopper issues a query, e.g. "shoes", the engine replies with a noisy set of potential referents, e.g. pairs of shoes from the shop inventory, among which the shopper may select relevant items. Hence, this dynamics is reminiscent of a cooperative language game (Lewis, 1969), in which shoppers give noisy feedback to the search engine on the meaning of the queries. A full specification of Query2Prod2Vec therefore involves a representation of the target domain of reference (i.e. products in a digital shop) and a denotation function.

4.1 Building a Target Domain

We represent products in a target shop through a prod2vec model built with anonymized shopping sessions containing user-product interactions. Embeddings are trained by solving the same optimization problem as in classical word2vec (Mikolov et al., 2013a): word2vec becomes prod2vec by substituting words in a sentence with products viewed in a shopping session (Mu et al., 2018). The utility of prod2vec is independently justified (Grbovic et al., 2015; Tagliabue and Yu, 2020) and, more importantly, the referential approach leverages the abundance of browsing-based interactions, as compared to search-based interactions: by learning product embeddings from abundant behavioral data first, we sidestep a major obstacle to reliable word representation in eCommerce. Hyperparameter optimization follows the guidelines in Bianchi et al. (2020a), with a total of 26,057 (Shop X) and 84,575 (Shop Y) product embeddings available for downstream processing; final parameters for prod2vec are: dimension = 50, win_size = 10, iterations = 30, ns_exponent = 0.75.
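To make the prod2vec step concrete, here is a minimal sketch of how such a space could be trained with gensim (4.x API) on sessions of product IDs, using the hyperparameters reported above; the session data and product IDs are toy placeholders, and this is not the authors' released implementation.

```python
# Minimal sketch (not the released code): training a prod2vec space with gensim,
# treating each anonymized shopping session as a "sentence" of product IDs.
from gensim.models import Word2Vec

# Hypothetical toy sessions: each is a chronologically ordered list of product IDs.
sessions = [
    ["sku_123", "sku_456", "sku_789"],
    ["sku_456", "sku_999"],
    # ... millions of real sessions in practice
]

prod2vec = Word2Vec(
    sentences=sessions,
    vector_size=50,    # "dimension = 50"
    window=10,         # "win_size = 10"
    epochs=30,         # "iterations = 30"
    ns_exponent=0.75,  # negative-sampling exponent
    min_count=1,       # kept low here only because the toy data is tiny
    sg=0,              # CBOW-style objective; the paper does not pin this choice down
    workers=4,
)

product_vectors = prod2vec.wv  # KeyedVectors: product ID -> 50-d embedding
print(product_vectors["sku_456"].shape)
```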
4.2 Learning Embeddings

The fundamental intuition of Query2Prod2Vec is treating clicks after q as a noisy feedback mapping q to a portion of the latent product space. In particular, we compute the embedding for q by averaging the product embeddings of all products clicked after it, using frequency as a weighting factor (i.e. products clicked often contribute more). The model has one free parameter, rank, which controls how many embeddings are used to build the representation for q: if rank=k, only the k most clicked products after q are used. The results in Table 1 are obtained with rank=5, as we leave investigating the role of this parameter to future work.
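The denotation step itself is a few lines of code. The sketch below is our own illustration (the function name and helpers are hypothetical, not the paper's code) of a frequency-weighted average over the rank most clicked products, reusing the prod2vec vectors from the previous sketch.

```python
import numpy as np
from collections import Counter

def query_embedding(query_clicks, product_vectors, rank=5):
    """Frequency-weighted average of the vectors of the `rank` most clicked
    products for a query, following the intuition of Section 4.2.

    query_clicks: list of product IDs clicked after the query was issued.
    product_vectors: mapping (e.g. gensim KeyedVectors) from product ID to vector.
    """
    counts = Counter(p for p in query_clicks if p in product_vectors)
    if not counts:
        return None  # no grounding signal available for this query
    top = counts.most_common(rank)
    vectors = np.stack([product_vectors[p] for p, _ in top])
    weights = np.array([c for _, c in top], dtype=float)
    weights /= weights.sum()
    return (vectors * weights[:, None]).sum(axis=0)

# Hypothetical usage with the toy prod2vec space trained above:
# emb = query_embedding(["sku_456", "sku_456", "sku_123"], product_vectors, rank=5)
```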
The lack of large-scale search logs in the case of new deployments is a severe issue for successful training. The referential nature of Query2Prod2Vec provides a fundamental competitive advantage over models building embeddings from past linguistic behavior only, as synthetic embeddings can be generated as long as cheap session data is available to obtain an initial prod2vec model. As detailed in the ensuing section, the process happens in two stages, event generation and embedding creation.

4.3 Creating Synthetic Embeddings

The procedure to create synthetic embeddings is detailed in Algorithm 1: it takes as input a list of words, a pre-defined number of sampling iterations and a popularity distribution over products (data on product popularity can be easily obtained through marketing tools, such as Google Analytics), and it returns a list of synthetic search events, that is, a mapping between words and lists of products "clicked". Simulating the search event can be achieved through the existing search engine, as, from a practical standpoint, some IR system must already be in place given the use case under consideration. To avoid over-relying on the quality of IR and to prove the robustness of the method, all the simulations below are not performed with the actual production API, but with a custom-built inverted index over product meta-data, with simple TF-IDF weighting and Boolean search.

Algorithm 1: Generation of synthetic click events.
  Data: a list of words W, a pre-defined number N of simulations per word, a distribution D over products.
  Result: a dataset of synthetic click events E.
  E ← empty mapping
  foreach word w in W do
      product_list ← Search(w)
      for i = 1 to N do
          p ← Sample(product_list, D)
          append the entry (w, p) to E
      end
  end
  return E

For the second stage, we can treat the synthetic click events produced by Algorithm 1 as a drop-in replacement for user-generated events – that is, for any query q, we calculate an embedding by averaging the product embeddings of the relevant products, weighted by frequency (while this work focuses on lexical quality, the same strategy can be applied to complex queries in a "cold start" scenario). Putting the two stages together, Query2Prod2Vec can not only produce reliable query embeddings based on historical data, but also learn approximate embeddings for a large vocabulary before being exposed to any search interaction: in Section 7 we report the performance of Query2Prod2Vec when using only synthetic embeddings (all the experiments are performed with N = 500 simulated search events per query).
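For readers who prefer code to pseudocode, below is a minimal Python rendering of Algorithm 1 under stated assumptions: `search` stands in for the TF-IDF/Boolean inverted index mentioned above, and the popularity distribution is a plain dictionary of weights; neither is the paper's actual implementation.

```python
import random
from collections import defaultdict

def generate_synthetic_clicks(words, search, popularity, n_simulations=500):
    """Python rendering of Algorithm 1 (an assumed reading, not the released code).

    words: vocabulary to ground (e.g. frequent catalog terms).
    search: callable word -> list of candidate product IDs
            (here: any off-the-shelf TF-IDF / Boolean retriever over meta-data).
    popularity: dict product ID -> popularity weight (e.g. from analytics tools).
    """
    events = defaultdict(list)  # word -> list of "clicked" product IDs
    for w in words:
        product_list = search(w)
        if not product_list:
            continue
        weights = [popularity.get(p, 1.0) for p in product_list]
        for _ in range(n_simulations):
            # Sample one product, biased towards popular items, and log the event.
            p = random.choices(product_list, weights=weights, k=1)[0]
            events[w].append(p)
    return events

# The resulting events can be fed to the same aggregation used for real clicks,
# e.g. query_embedding(events["sneakers"], product_vectors, rank=5).
```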

5 Dataset and Baselines

5.1 Dataset

Following best practices in the multi-tenant literature (Tagliabue et al., 2020b), we benchmark all models on different shops to test their robustness. In particular, we obtained catalog data, search logs and anonymized shopping sessions from two partnering shops, Shop X and Shop Y: Shop X is a sport apparel shop with an Alexa ranking of approximately 200k, representing a prototypical shop in the middle of the long tail; Shop Y is a home improvement shop with an Alexa ranking of approximately 10k, representing an intermediate size between Shop X and public companies in the space. Linguistic data is in Italian for both shops, and training is done on random sessions from the period June-October 2019: after sampling, removal of bot-like sessions and pre-processing, we are left with 722,479 sessions for Shop X, and 1,986,452 sessions for Shop Y.

5.2 Baselines

We leverage the unique opportunity to join catalog data, search logs and shopping sessions to extensively benchmark Query2Prod2Vec against a variety of methods from NLP and IR.

• Word2Vec and FastText. We train a CBOW (Mikolov et al., 2013a) and a FastText model (Bojanowski et al., 2017) over product descriptions in the catalog;

• UmBERTo. We use RoBERTa trained on Italian data – UmBERTo (https://huggingface.co/Musixmatch/umberto-commoncrawl-cased-v1). The ⟨s⟩ embedding of the last layer of the architecture is the query embedding (a minimal extraction sketch is shown at the end of this section);

• Search2Vec. We implement the skip-gram model from Grbovic et al. (2016), by feeding the model with sessions composed of search queries and user clicks. Following the original model, we also train a time-sensitive variant, in which time between actions is used to weight query-click pairs differently;

• Query2Vec. We implement a different context-target model, inspired by Egg (2019): embeddings are learned by the model when it tries to predict a (purchased or clicked) item starting from a query;

• QueryNGram2Vec. We implement the model from Bai et al. (2018). Besides learning representations through a skip-gram model as in Grbovic et al. (2016), the model learns the embeddings of unigrams to help cover the long tail for which no direct embedding is available.

To guarantee a fair comparison, all models are trained on the same sessions. For all baselines, we follow the same hyperparameters found in the cited works: the dimension of query embedding vectors is set to 50, except that 768-dimensional vectors are used for UmBERTo, as provided by the pre-trained model.

As discussed in Section 1, a distinguishing feature of Query2Prod2Vec is grounding, that is, the relation between words and an external domain – in this case, products. It is therefore interesting not only to assess a possible quantitative gap in the quality of the representations produced by the baseline models, but also to remark the qualitative difference at the core of the proposed method: if words are about something, pure co-occurrence patterns may be capturing only fragments of lexical meaning (Bianchi et al., 2021).
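As referenced in the UmBERTo bullet above, the following is a minimal sketch of how a contextual query vector can be extracted with the standard Hugging Face transformers API; it is an assumed reading of that baseline, not the authors' exact pipeline.

```python
# Minimal sketch: 768-d query vector from UmBERTo, taking the <s> token
# embedding of the last layer as described in Section 5.2.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "Musixmatch/umberto-commoncrawl-cased-v1"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def umberto_query_embedding(query: str) -> torch.Tensor:
    inputs = tokenizer(query, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Position 0 is the <s> token for RoBERTa-style models.
    return outputs.last_hidden_state[0, 0]

print(umberto_query_embedding("scarpe da running").shape)  # torch.Size([768])
```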

6 Solving Analogies in eCommerce

As discussed in Section 2, we consider evaluation tasks focused on word meaning, without using product-based similarity (as that would implicitly and unfairly favor referential embeddings). Analogy-based tasks (Mikolov et al., 2013a) are a popular choice to measure the semantic accuracy of embeddings, where a model is asked to fill templates like man : king = woman : ?; however, preparing analogies for digital shops presents non-trivial challenges for human annotators: these would in fact need to know both the language and the underlying space ("air max" is closer to "nike" than to "adidas"), with the additional complication that many candidates may not have "determinate" answers (e.g. if Adidas is to Gazelle, then Nike is to what exactly?). In building our testing framework, we keep the intuition that analogies are an effective way to test for lexical meaning and the assumption that human-level concepts should be our ground truth: in particular, we programmatically produce analogies by leveraging existing human labelling, as indirectly provided by the merchandisers who built the product catalogs (it is important to note that this categorization is done by product experts for navigation and inventory purposes: all product labels are produced independently from any NLP consideration).

6.1 Test Set Preparation

We extract words from the merchandising taxonomy of the target shops, focusing on the three most frequent fields in query logs: product type, brand and sport activity for Shop X; product type, brand and part of the house for Shop Y. Our goal is to go from taxonomy to analogies, that is, showing how for each pair of taxonomy types (e.g. brand : sport), we can produce two pairs of tokens (Wilson : tennis, Cressi : scubadiving), and create two analogies: b1 : s1 = b2 : ? (target: s2) and b2 : s2 = b1 : ? (target: s1) for testing purposes. For each type in a pair (e.g. brand : sport), we repeat the following for all possible values of brand (e.g. "Wilson", "Nike") – given a brand B:

1. we loop over the catalog and record all values of sport, along with their frequency, for the products made by B. For example, for B = Nike, the distribution may be: {"soccer": 10, "basketball": 8, "scubadiving": 0}; for B = Wilson, it may be: {"tennis": 8};

2. we calculate the Gini coefficient (Catalano et al., 2009) over the distribution on the values of sport and choose a conservative Gini threshold, i.e. the 75th percentile: the goal of this threshold is to avoid "undetermined" analogies, such as Adidas : Gazelle = Nike : ?. The intuition behind the use of a dispersion measure is that product analogies are harder if the relevant label is found across a variety of products: in other words, Wilson : tennis = Atomic : ? (skiing) is a better analogy than Adidas : Gazelle = Nike : ? (a small illustrative sketch of this computation follows the list).
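As referenced in step 2, the following is a small, self-contained sketch (our own illustration, with toy catalog counts) of the Gini computation and the 75th-percentile threshold used to keep only brands with a concentrated label distribution.

```python
import numpy as np

def gini(values):
    """Gini coefficient of a non-negative distribution (0 = uniform, -> 1 = concentrated)."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    if n == 0 or x.sum() == 0:
        return 0.0
    cumx = np.cumsum(x)
    return (n + 1 - 2 * (cumx / cumx[-1]).sum()) / n

# Hypothetical brand -> sport-label counts, mirroring the examples in the text.
all_sports = ["tennis", "skiing", "soccer", "basketball", "scubadiving"]
brand_sports = {
    "Wilson": {"tennis": 8},
    "Atomic": {"skiing": 12},
    "Nike":   {"soccer": 10, "basketball": 8, "scubadiving": 0},
}

# Gini is computed over the full label vocabulary, so a brand concentrated on a
# single sport gets a high coefficient, while a "generalist" brand gets a lower one.
ginis = {b: gini([d.get(s, 0) for s in all_sports]) for b, d in brand_sports.items()}
threshold = np.percentile(list(ginis.values()), 75)  # conservative 75th percentile

# Brands at or above the threshold contribute (brand, most-frequent-sport) pairs,
# which are then combined into analogies such as Wilson : tennis = Atomic : ?.
eligible = {b: max(d, key=d.get) for b, d in brand_sports.items() if ginis[b] >= threshold}
print(eligible)  # {'Wilson': 'tennis', 'Atomic': 'skiing'}
```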
With all the Gini coefficients and a chosen threshold, we are now ready to generate the analogies, by repeating the following for all values of brand – given a brand B we can repeat the following sampling process K times (K = 10 for our experiments):

1. if B's Gini value for its distribution of sport labels is below our chosen threshold, we skip B; if B's value is above, we associate to B its most frequent sport value, e.g. Wilson : tennis. This is the source pair of the analogy; to generate a target pair, we sample randomly a brand C with high Gini together with its most frequent value, e.g. Atomic : skiing;

2. we add to the final test set two analogies: Wilson : tennis = Atomic : ?, and Atomic : skiing = Wilson : ?.

The procedure is designed to generate test examples conservatively, but of fairly high quality, as for example Garmin : watches = Arena : bathing cap (the analogy relates two brands which sell only one type of item), or racket : tennis = bathing cap : indoor swimming (the analogy relates "tools" that are needed in two activities). A total of 1208 and 606 test analogies are used for the analogy task (AT) for, respectively, Shop X and Shop Y: we benchmark all models by reporting Hit Rate at different cutoffs (Vasile et al., 2016), and we also report how many analogies are covered by the lexicon learned by the models (coverage is the ratio of analogies for which all embeddings are available in the relevant space).
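To make the reported metrics concrete, here is one plausible way to score the analogy task, assuming query embeddings are stored in a gensim KeyedVectors object (our own illustration, not the released benchmark code): an analogy a : b = c : ? is answered with the usual vector offset and counted as a hit if the target appears among the top k neighbours, while coverage counts analogies whose four terms are all in the lexicon.

```python
from gensim.models import KeyedVectors  # assumes query vectors were saved in this format

def hit_rate_at_k(kv: KeyedVectors, analogies, k=5):
    """analogies: list of (a, b, c, target) tuples, e.g. ("wilson", "tennis", "atomic", "skiing")."""
    hits, covered = 0, 0
    for a, b, c, target in analogies:
        if not all(w in kv for w in (a, b, c, target)):
            continue  # skipped analogies count against coverage, not accuracy
        covered += 1
        # Vector-offset analogy: b - a + c, then nearest neighbours by cosine similarity.
        candidates = [w for w, _ in kv.most_similar(positive=[b, c], negative=[a], topn=k)]
        hits += int(target in candidates)
    coverage = covered / len(analogies) if analogies else 0.0
    return (hits / covered if covered else 0.0), coverage
```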

7 Results

Model                      | HR@5 / HR@10 (X) | HR@5 / HR@10 (Y) | CV (X / Y)    | Acc on ST
Query2Prod2Vec (real data) | 0.332 / 0.468    | 0.277 / 0.376    | 0.965 / 0.924 | 0.88
Word2Vec                   | 0.206 / 0.242    | 0.005 / 0.009    | 0.47 / 0.03   | 0.68
Query2Vec                  | 0.077 / 0.113    | 0.065 / 0.120    | 0.97 / 0.93   | 0.54
QueryNGram2Vec             | 0.071 / 0.122    | 0.148 / 0.216    | 0.99 / 0.92   | 0.82
FastText                   | 0.068 / 0.116    | 0.010 / 0.012    | 0.52 / 0.03   | 0.57
UmBERTo                    | 0.019 / 0.042    | 0.030 / 0.103    | 0.99 / 1.00   | 0.57
Search2Vec (time)          | 0.018 / 0.025    | 0.232 / 0.329    | 0.23 / 0.90   | 0.17
Search2Vec                 | 0.016 / 0.024    | 0.095 / 0.150    | 0.23 / 0.90   | 0.17

Table 1: Hit Rate (HR) and coverage (CV) for all models and two shops on AT; in the rightmost column, Accuracy (Acc) for all models on ST.

Table 1 reports model performance for the chosen cutoffs. Query2Prod2Vec (as trained on real data) has the best performance (HR@20 was also computed, but is omitted for brevity as it confirmed the general trend), while maintaining a very competitive coverage. More importantly, following our considerations in Section 2, results confirm that producing competitive embeddings on shops with different constraints is a challenging task for existing techniques, as models tend to either rely on a specific query distribution (e.g. Search2Vec (time)), or on the availability of linguistic and catalog resources with good coverage (e.g. Word2Vec). Query2Prod2Vec is the only model performing with comparable quality in the two scenarios, further strengthening the methodological importance of running benchmarks on more than one shop if findings are to be trusted by a large group of practitioners.

7.1 Sample Efficiency and User Studies

To investigate sample efficiency, we run two further experiments on Shop X: first, we run AT giving only 1/3 of the original data to Query2Prod2Vec (both for the prod2vec space, and for the denotation). The small-dataset version of Query2Prod2Vec still outperforms all other full-dataset models in Table 1 (HR@5,10 = 0.276 / 0.380). Second, we train a Query2Prod2Vec model only with simulated data produced as explained in Section 4 – that is, with zero data from real search logs. The entirely simulated Query2Prod2Vec shows performance competitive with the small-dataset version (HR@5,10 = 0.259 / 0.363), outperforming all baselines (a similar result was obtained on Shop Y, and it is omitted for brevity).

As a further independent check, we supplement AT with a small semantic similarity task (ST) on Shop X (Shop X is chosen since it is easier to find speakers familiar with sport apparel than with DIY items): two native speakers are asked to solve a small set (46) of manually curated questions in the form: "Given the word Nike, which is the most similar, Adidas or Wilson?". ST is meant to (partially) capture how much the embedding spaces align with the lexical intuitions of generic speakers, independently of the product search dynamics. Table 1 reports results treating human ratings as ground truth and using cosine similarity on the learned embeddings for all models; inter-rater agreement was substantial, with Cohen Kappa Score = 0.67 (McHugh, 2012). Query2Prod2Vec outperforms all other methods, further suggesting that the representations learned through referential information capture some aspects of lexical knowledge.

7.2 Computational Requirements

As stressed in Section 2, accuracy and resources form a natural trade-off for industry practitioners. Therefore, it is important to highlight that our model is not just more accurate, but significantly more efficient to train: the best performing Query2Prod2Vec takes 30 minutes (CPU only) to be completed for the larger Shop Y, while other competitive models such as Search2Vec (time) and QueryNGram2Vec require 2 to 4 hours (training is performed on a Tesla V100 16GB GPU; as a back-of-the-envelope calculation, training QueryNGram2Vec on an AWS p3 large instance costs around 12 USD, while a standard CPU container for Query2Prod2Vec costs less than 1 USD). Being able to quickly generate many models allows for cost-effective analysis and optimization; moreover, infrastructure cost is heavily related to ethical and social issues on energy consumption in NLP (Strubell et al., 2019).
8 Conclusion and Future Work

In this work, we learned reference-based word embeddings for product search: Query2Prod2Vec significantly outperforms other embedding strategies on lexical tasks, and consistently provides good performance in small-data and zero-data scenarios, with the help of synthetic embeddings. In future work, we will extend our analysis to i) specific IR tasks, within the recent paradigm of the dual encoder model (Karpukhin et al., 2020), and ii) compositional tasks, trying a systematic replication of the practical success obtained by Yu et al. (2020) through image-based heuristics.

When looking at models like Query2Prod2Vec in the larger industry landscape, we hope our methodology can help the field broaden its horizons: while retail giants indubitably played a major role in moving eCommerce use cases to the center of NLP research, finding solutions that address a larger portion of the market is not just practically important, but also an exciting agenda of its own. (A previous draft of this article appeared on arxiv – https://arxiv.org/abs/2104.02061 – after the review process, but before the camera-ready submission.)

9 Ethical Considerations

Coveo collects anonymized user data when providing its business services in full compliance with existing legislation (e.g. GDPR). The training dataset used for all models employs anonymous UUIDs to label events and sessions and, as such, it does not contain any information that can be linked to shoppers or physical entities; in particular, data is ingested through a standardized client-side integration, as specified in our public protocol.
Acknowledgements

We wish to thank Nadia Labai, Patrick John Chia, Andrea Polonioli, Ciro Greco and three anonymous reviewers for helpful comments on previous versions of this article. The authors wish to thank Coveo for the support and the computational resources used for the project. Federico Bianchi is a member of the Bocconi Institute for Data Science and Analytics (BIDSA) and the Data and Marketing Insights (DMI) unit.

References

Xiao Bai, Erik Ordentlich, Yuanyuan Zhang, Andy Feng, Adwait Ratnaparkhi, Reena Somvanshi, and Aldi Tjahjadi. 2018. Scalable query n-gram embedding for improving matching and relevance in sponsored search. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD '18, pages 52–61, New York, NY, USA. ACM.

Marco Baroni, Georgiana Dinu, and Germán Kruszewski. 2014. Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 238–247, Baltimore, Maryland. Association for Computational Linguistics.

Baymard Institute. 2020. Site Search for Ecommerce.

Emily M. Bender and Alexander Koller. 2020. Climbing towards NLU: On meaning, form, and understanding in the age of data. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5185–5198, Online. Association for Computational Linguistics.

Federico Bianchi, Ciro Greco, and Jacopo Tagliabue. 2021. Language in a (Search) Box: Grounding Language Learning in Real-World Human-Machine Interaction. In NAACL-HLT. Association for Computational Linguistics.

Federico Bianchi, Jacopo Tagliabue, Bingqing Yu, Luca Bigon, and Ciro Greco. 2020a. Fantastic embeddings and how to align them: Zero-shot inference in a multi-shop scenario. In Proceedings of the SIGIR 2020 eCom workshop.

Federico Bianchi, Bingqing Yu, and Jacopo Tagliabue. 2020b. BERT goes shopping: Comparing distributional models for product representations. arXiv preprint arXiv:2012.09807.

Yonatan Bisk, Ari Holtzman, Jesse Thomason, Jacob Andreas, Yoshua Bengio, Joyce Chai, Mirella Lapata, Angeliki Lazaridou, Jonathan May, Aleksandr Nisnevich, Nicolas Pinto, and Joseph Turian. 2020. Experience grounds language. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8718–8735, Online. Association for Computational Linguistics.

Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5:135–146.

Michael Catalano, Tanya Leise, and Thomas Pfaff. 2009. Measuring resource inequality: The Gini coefficient. Numeracy, 2.

Ethan Cramer-Flood. 2020. Global Ecommerce 2020. Ecommerce Decelerates amid Global Retail Contraction but Remains a Bright Spot.

Nick Craswell, Bhaskar Mitra, E. Yilmaz, Daniel Fernando Campos, and E. Voorhees. 2020. Overview of the TREC 2019 deep learning track. ArXiv, abs/2003.07820.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.

Econsultancy. 2020. Site search: retailers still have a lot to learn.

Alex Egg. 2019. Query2vec: Search query expansion with query embeddings.

Mihajlo Grbovic, Nemanja Djuric, Vladan Radosavljevic, Fabrizio Silvestri, Ricardo Baeza-Yates, Andrew Feng, Erik Ordentlich, Lee Yang, and Gavin Owens. 2016. Scalable semantic matching of queries to ads in sponsored search advertising. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '16, pages 375–384, New York, NY, USA. Association for Computing Machinery.

Mihajlo Grbovic, Vladan Radosavljevic, Nemanja Djuric, Narayan Bhamidipati, Jaikit Savla, Varun Bhagwan, and Doug Sharp. 2015. E-commerce in your inbox: Product recommendations at scale. In Proceedings of KDD '15.

Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020. Dense passage retrieval for open-domain question answering. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6769–6781, Online. Association for Computational Linguistics.

Thomas K. Landauer and Susan T. Dumais. 1997. A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge.

David Lewis. 1969. Convention. Mass.: Harvard UP.

Mary McHugh. 2012. Interrater reliability: The kappa statistic. Biochemia medica: časopis Hrvatskoga društva medicinskih biokemičara / HDMB, 22:276–82.

Tomas Mikolov, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. 2013a. Efficient estimation of word representations in vector space. CoRR, abs/1301.3781.

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013b. Distributed representations of words and phrases and their compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, NIPS'13, pages 3111–3119, Red Hook, NY, USA. Curran Associates Inc.

Richard Montague. 1974. English as a formal language. In Richmond H. Thomason, editor, Formal Philosophy: Selected Papers of Richard Montague, pages 188–222. Yale University Press, New Haven, London.

Cun Mu, Guang Yang, and Zheng Yan. 2018. Revisiting skip-gram negative sampling model with regularization. CoRR, abs/1804.00306.

Rodrigo Nogueira, Zhiying Jiang, and Jimmy Lin. 2020. Document ranking with a pretrained sequence-to-sequence model. In EMNLP.

Debora Nozza, Federico Bianchi, and Dirk Hovy. 2020. What the [MASK]? Making sense of language-specific BERT models. arXiv preprint arXiv:2003.02912.

Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 2227–2237, New Orleans, Louisiana. Association for Computational Linguistics.

Anna Rogers, Olga Kovaleva, and Anna Rumshisky. 2020. A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics, 8:842–866.

Emma Strubell, Ananya Ganesh, and Andrew McCallum. 2019. Energy and policy considerations for deep learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3645–3650.

Jacopo Tagliabue and Bingqing Yu. 2020. Shopping in the multiverse: A counterfactual approach to in-session attribution. In Proceedings of the SIGIR 2020 Workshop on eCommerce (ECOM 20).

Jacopo Tagliabue, Bingqing Yu, and Marie Beaulieu. 2020a. How to grow a (product) tree: Personalized category suggestions for eCommerce type-ahead. In Proceedings of The 3rd Workshop on e-Commerce and NLP, pages 7–18, Seattle, WA, USA. Association for Computational Linguistics.

Jacopo Tagliabue, Bingqing Yu, and Federico Bianchi. 2020b. The embeddings that came in from the cold: Improving vectors for new and rare products with content-based inference. In Fourteenth ACM Conference on Recommender Systems, RecSys '20, pages 577–578, New York, NY, USA. Association for Computing Machinery.

Techcrunch. Coveo raises $227M at $1B valuation.

Techcrunch. 2019a. Algolia finds $110M from Accel and Salesforce.

Techcrunch. 2019b. Lucidworks raises $100M to expand in AI-powered search.

Manos Tsagkias, Tracy Holloway King, Surya Kallumadi, Vanessa Murdock, and Maarten de Rijke. 2020. Challenges and research opportunities in ecommerce search and recommendations. In SIGIR Forum, volume 54.

Flavian Vasile, Elena Smirnova, and Alexis Conneau. 2016. Meta-Prod2Vec: Product embeddings using side-information for recommendation. In Proceedings of the 10th ACM Conference on Recommender Systems.

Bingqing Yu and Jacopo Tagliabue. 2020. Blending search and discovery: Tag-based query refinement with contextual reinforcement learning. In EComNLP.

Bingqing Yu, Jacopo Tagliabue, Ciro Greco, and Federico Bianchi. 2020. An image is worth a thousand features: Scalable product representations for in-session type-ahead personalization. In Companion Proceedings of the Web Conference 2020.
