
Schema Aware Semantic Reasoning for Interpreting Natural Language Queries in Enterprise Settings

Jaydeep Sen¹, Tanaya Babtiwale², Kanishk Saxena³, Yash Butala⁴, Sumit Bhatia¹, Karthik Sankaranarayanan¹
¹ IBM Research AI, ² NMIMS, ³ SMVIT, ⁴ IIT KGP

Introduction
• NLIDB: Natural Language Interface to Database
  • Enables business users to access and query databases.
  • Non-technical users need not know a complex query language such as SQL or the domain schema.
• Text-to-SQL challenge (DL-based systems):
  • Started on single-table databases and simple SQL queries.
  • Extended to Spider with multi-table databases.
• Ontology-based QA systems:
  • Use an ontology to capture and use domain semantics for QA.

Motivation
• Natural Language Understanding (NLU) remains an AI-hard problem.
• Existing systems cannot understand the complex operations and/or implicit intents seen in Business Intelligence (BI) queries.
• Ontology reasoning can be useful for inferring implicit intents/operations for analytic queries.

Challenges in Enterprise Settings
• Similarly phrased queries have different implicit intents and need different operations.
• Need to reason over domain schema and semantics to infer implicit intent for generating semantically consistent interpretations.

Motivating Examples: [figure not recoverable from the extracted text; surviving label: "same vocabulary"]

Outline
• Reasoning ontology: We build a generic reasoning ontology (R_KB) with axioms that are universally true for any valid interpretation.
• Correction Unit: We also build a generic set of correction axioms (CU) that capture inferences to produce semantically valid interpretations.
• Inferences: Given an ontology O_D and a question, we reason over R_KB and make inferences using CU to infer implicit intents.

Proposal
[Architecture figure. Surviving labels: Offline, Online, NL Query, Domain Ontology (O_D), Language Processor, Fact Population Generator, Reasoning KB (R_KB), Facts, KB Consistency Checker, Consistent, Inconsistent, Corrective Unit (CU), Interpreter, Explainability Module, QA Engine / ATHENA.]

NL Query: Show me all customers holding more IBM stock than Arvind Krishna

Language Processor annotations:
  executives → Concept:ContractParty
  traded → Concept:SecuritiesTransaction
  stock → Concept:ListedSecurity
  Arvind Krishna → Property:hasName

Initial interpretation (Inconsistent):
  SELECT (ContractParty)
  WHERE {ContractParty.ListedSecurity MORE_THAN () ArvindKrishna.ListedSecurity}

Reasoning KB (R_KB): Reasoning Axioms (RA), Reasoning Facts (RF)
  MORE_THAN ⊆ NUMERIC_COMPARISON
  NUMERIC_COMPARISON ⊆ MEASURE
  MEASURE := NUM_AGGREGATION ∪ NUMERIC_PROPERTY
  NUMERIC_PROPERTY ⊆ PROPERTY
  CONCEPT ∩ PROPERTY = ∅
  CONCEPT ∩ NUM_AGGREGATION = ∅
  CONCEPT(ListedSecurity)

Corrective Unit (CU): Detection Axioms, Action Axioms
  DETECTION: NUM_COMPARISON(ListedSecurity) ∩ CONCEPT(ListedSecurity) ∩ hasMeasureProp(ListedSecurity, stockCount)
  ACTION: S* → SUM(Transaction.stockCount)
          + NUM_COMPARISON(S*)
          - NUM_COMPARISON(ListedSecurity)

Corrected interpretation (Consistent):
  SELECT (ContractParty.hasPersonName)
  WHERE {SUM(Transaction#stockCount) MORE_THAN () SUM(Transaction#stockCount)}

Results
• Benchmarks
  • Existing: QALD, Spider
  • Custom: FIBEN, GOSALES (BI queries)

Accuracy = (# questions producing correct answers) / (# questions asked to the system)

[Bar chart: accuracy (%) on FIBEN, GOSALES, QALD, and Spider; legend label "COGIT". Bar values as extracted, with their mapping to systems and datasets not recoverable: 91.96, 88.89, 84, 72, 67.02, 54.98, 46.42, 46.7, 45.13, 41.2, 36.2, 30.]

Key Point: Reasoning produces better accuracy than neural and rule-based SOTA baselines.

Precision = (# of NL queries producing correct intent) / (# of NL queries which produced inferences)
Recall = (# of NL queries producing correct intent) / (# of NL queries which required inferences)

Key Point: Reasoning infers semantically valid and correct intents, with sensitive detection, without introducing noise in terms of erroneous inferences.

Conclusion
We propose a novel schema-aware, domain-agnostic semantic reasoning framework that uses ontology reasoning for natural language query interpretation. It can interpret complex operations in an explainable and generalizable manner.

References
• Saha, Diptikalyan, et al. "ATHENA: an ontology-driven system for natural language querying over relational data stores." Proceedings of the VLDB Endowment 9.12 (2016): 1209-1220.
• FIBEN: An IBM benchmark for Business Intelligence queries on Finance. https://github.com/IBM/fiben-benchmark
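
Illustrative sketch (not the authors' implementation)
The reasoning axioms and the DETECTION/ACTION pair listed on the poster can be exercised with a few lines of Python. This is a minimal sketch under stated assumptions: the dictionary encoding of axioms, the function names (closure, has_clash, is_consistent, correct_operand), and the reading of "+" and "-" in the ACTION axiom as "add fact" and "retract fact" are all hypothetical and chosen here for illustration. NUM_COMPARISON and NUMERIC_COMPARISON are treated as the same class. A real system would presumably use an ontology reasoner rather than this hand-rolled check.

```python
# Reasoning axioms from the poster, in a hypothetical dictionary encoding.
SUBCLASS_OF = {
    "MORE_THAN": {"NUMERIC_COMPARISON"},
    "NUMERIC_COMPARISON": {"MEASURE"},
    "NUMERIC_PROPERTY": {"PROPERTY"},
}
# MEASURE := NUM_AGGREGATION ∪ NUMERIC_PROPERTY
UNION_OF = {"MEASURE": ("NUM_AGGREGATION", "NUMERIC_PROPERTY")}
# CONCEPT ∩ PROPERTY = ∅ and CONCEPT ∩ NUM_AGGREGATION = ∅
DISJOINT = [("CONCEPT", "PROPERTY"), ("CONCEPT", "NUM_AGGREGATION")]


def closure(classes):
    """Return all classes implied by `classes` under SUBCLASS_OF."""
    seen, stack = set(), list(classes)
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.add(cls)
            stack.extend(SUBCLASS_OF.get(cls, ()))
    return seen


def has_clash(classes):
    """True if a term would belong to two disjoint classes."""
    return any(a in classes and b in classes for a, b in DISJOINT)


def is_consistent(asserted):
    """Check one term's asserted classes against the axioms above."""
    inferred = closure(asserted)
    if has_clash(inferred):
        return False
    # A member of a union class must fit at least one disjunct without a clash.
    for union, disjuncts in UNION_OF.items():
        if union in inferred and all(
            has_clash(closure(inferred | {d})) for d in disjuncts
        ):
            return False
    return True


def correct_operand(term, facts, measure_prop):
    """Apply the DETECTION/ACTION pair to one comparison operand.

    facts: term -> set of asserted classes (hypothetical encoding).
    measure_prop: concept -> qualified measure property, standing in for
    hasMeasureProp(ListedSecurity, stockCount).
    """
    asserted = facts[term]
    detected = (
        "NUMERIC_COMPARISON" in asserted
        and "CONCEPT" in asserted
        and term in measure_prop
    )
    if not detected:
        return term, facts
    new_term = f"SUM({measure_prop[term]})"                       # S*
    new_facts = dict(facts)
    new_facts[term] = asserted - {"NUMERIC_COMPARISON"}           # retract
    new_facts[new_term] = {"NUMERIC_COMPARISON", "NUM_AGGREGATION"}  # add
    return new_term, new_facts


# The poster's example: ListedSecurity is a concept used as a comparison operand.
facts = {"ListedSecurity": {"CONCEPT", "NUMERIC_COMPARISON"}}
measure_prop = {"ListedSecurity": "Transaction.stockCount"}

before = is_consistent(facts["ListedSecurity"])
fixed_term, fixed_facts = correct_operand("ListedSecurity", facts, measure_prop)
after = all(is_consistent(classes) for classes in fixed_facts.values())
print(before, fixed_term, after)
```

In this sketch the concept operand fails the check because neither branch of MEASURE is compatible with CONCEPT, and the action swaps it for an aggregation over the measure property, which passes. Explanations of the kind produced by an explainability module could be derived by recording which disjointness axiom triggered the clash; that is not shown here.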