OPEN INFORMATION EXTRACTOR FOR SEVA
Jitin Krishnan
Mentor: Patrick Coronado
Other Project Members: Trevor Reed (NASA JPL), Ben Brumback (George Mason University), Youngyo Na (Rowan University)

SEVA: A Systems Engineer's Virtual Assistant
[Figure: the Systems Engineer at the center, surrounded by these labels]
• Question-Answering
• Explainable System
• Dynamic Information
• Personal Assistant
• Time & Scheduling
• Hypothetical Scenarios
• Information Assimilation
• Capture Experience
• Documents (e.g., SE Handbook)

SEVA: ARCHITECTURAL COMPONENTS
[Figure: SEVA architecture. Labeled blocks: INPUT (question, facts/statements & commands); NATURAL LANGUAGE PROCESSING UNIT (targeted open information extraction system, dictionary, verb normalization, rule extraction); DYNAMIC KNOWLEDGE BASE with temporal relations (base KB + rules, custom KB + rules); INFERENCE ENGINE (SE query language, ontology language reasoner, case-based experience reasoner, rule reasoner, temporal reasoner, hypothetical mode); OUTPUT (answer, question*, system responses). *In conversational mode.]

Motivation
§ Information Assimilation
§ Explainability
§ Personal System
§ Domain-specific (rule/experience building) Common Sense

SEVA: SYSTEM CONCEPT
[Figure: SEVA system concept. Labeled blocks: Natural Language Processing, Interactive Q&A, Logicker, Endowed Models, Inference Engine, Monitor, Inter-Module Communication Protocol, Dynamic Ontology.]

SEVA: Knowledge Representation Example
Sentence: Instruments have thermal zones
NLP: Subject = Instruments; Verb/Predicate = have; Object = thermal zones
First Order Logic (FOL) Formalism: ∀x (Instrument(x) ⇒ ∃y (ThermalZone(y) ∧ have(x,y)))
Concepts: Instrument, Thermal Zone (like classes in OOP)
Instances: x, y
Relationship: have
Notation: ∀ = for all, ∃ = there exists
[Diagram label: KB]
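As a rough illustration of the example sentence, the following sketch checks "Instruments have thermal zones" against a tiny finite model. It assumes the sentence is read as "every instrument has at least one thermal zone". The function name `every_instrument_has_zone` and the individuals `ThermalZoneA` and `Radiometer` are hypothetical and chosen only for illustration; only STI comes from the slides.

```python
# Toy finite model: concepts are sets of instances, the relation is a set of pairs.
# "ThermalZoneA" and "Radiometer" are hypothetical individuals (assumption).

def every_instrument_has_zone(instruments, thermal_zones, have):
    """True if for every instrument x there is a thermal zone y with have(x, y)."""
    return all(
        any((x, y) in have for y in thermal_zones)
        for x in instruments
    )

instruments = {"STI"}                 # Instrument(STI)
thermal_zones = {"ThermalZoneA"}      # ThermalZone(ThermalZoneA)
have = {("STI", "ThermalZoneA")}      # have(STI, ThermalZoneA)

# The statement holds in this model.
holds = every_instrument_has_zone(instruments, thermal_zones, have)

# Adding an instrument with no thermal zone falsifies it.
violated = every_instrument_has_zone(
    instruments | {"Radiometer"}, thermal_zones, have
)

print(holds, violated)
```

Concepts map to sets and the relationship to a set of pairs, which mirrors the "like classes in OOP" analogy without committing to any particular reasoner.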
ABox: instantiations and relations between instances
  Instrument(STI): STI is an instrument
  partOf(MassSpectrometerAP8717, STI): MassSpectrometerAP8717 is a part of STI
TBox: concepts and relations
  Conduit ≡ Pipe: Conduit is same as Pipe
  MassSpectrometer ⊑ Spectrometer: subclass relationship
RBox: relations between relations
  partOf ∘ partOf ⊑ partOf: transitive property of the role
  partOf ≡ hasComponent⁻: inverse property of two roles

SEVA: Types of Knowledge
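To make the RBox axioms from the knowledge representation example concrete (partOf is transitive, and partOf is the inverse of hasComponent), here is a minimal sketch that applies both to a small set of ABox facts. It is not SEVA's reasoner. The fact partOf(STI, Observatory) and the helper `transitive_closure` are hypothetical, added only so that transitivity has something to derive.

```python
def transitive_closure(pairs):
    """Return the smallest transitive relation containing `pairs`."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# ABox facts for the role partOf.
part_of = {
    ("MassSpectrometerAP8717", "STI"),   # from the slide
    ("STI", "Observatory"),              # hypothetical fact (assumption)
}

# RBox axiom 1: partOf is transitive.
part_of_closed = transitive_closure(part_of)

# RBox axiom 2: hasComponent is the inverse of partOf.
has_component = {(whole, part) for (part, whole) in part_of_closed}

print(sorted(part_of_closed))
print(sorted(has_component))
```

The naive fixed-point loop is quadratic per pass, which is fine for a handful of facts; a real knowledge base would delegate this to an ontology reasoner.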
ABox
  Source: Project-Specific Documents . . .
  Sentences: STI is an instrument. STI has a length of 200 cm. Mass Spectrometer MS81Z is a part of STI.
  Extraction: OIE
  Triples: [STI; is-a; instrument], [STI; has; length], [length; has-value; 200 cm], [Mass Spectrometer MS81Z; part of; STI]
TBox
  Source: SE Handbook, Common-Sense Knowledge . . .
  Sentences: Conduit is same as Pipe. Mass Spectrometer is a type of Spectrometer. Instruments have thermal zones.
  Extraction: OIE
  Triples: [Conduit; same as; Pipe], [Mass Spectrometer; type of; Spectrometer], [Instruments; have; thermal zones]
RBox
  Source: Verb Vocabulary, WordNet, DBPedia, Word2Vec contextual synonyms . . .
  Knowledge: Verb-oriented Common-Sense knowledge
  Extraction: OIE + supervised
  Triples: [partOf: transitive], [partOf; inverse-of; hasComponent]
Processing stages: Entity Linking & Verb Normalization; KB Construction & Population

Open Information Extraction
• Identifies a wide range of domain-independent relations
• Traditional Information Extraction: uses predetermined templates
Examples: Stanford Open IE, Open IE by AI2, ClausIE
• For the experiment, the domain language complexity is reduced: we work with only simple English sentences

NLP Basics
• Dependency parse tree
• POS tags: NN = noun, VBZ = verb form (3rd person singular present)
• Chunking example: "NP:{(

SEVA – Targeted Open IE (TOIE)
Sentence: STI, an instrument, weighs 56 kg
• Pattern matching on the dependency tree
• Extracting relations of type: is-a, transitive-verb, has-property, has-value
• A set of universal dependencies: nsubj, dobj, case, nmod, compound, amod
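The idea of pattern matching on a dependency tree can be sketched for the sentence "STI, an instrument, weighs 56 kg". This is a minimal illustration, not the TOIE implementation. The parse is written by hand rather than produced by a parser, and the `Token` class, the `PARSE` table, the two patterns, and the labels appos, nummod, det, and punct (which are not in the slide's dependency list) are all assumptions. Only the is-a and transitive-verb relation types are covered.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    idx: int    # 1-based position in the sentence
    text: str
    pos: str    # Penn Treebank POS tag
    head: int   # idx of the head token, 0 for the root
    dep: str    # dependency label

# Hand-written parse of "STI, an instrument, weighs 56 kg" (assumption).
PARSE = [
    Token(1, "STI", "NNP", 6, "nsubj"),
    Token(2, ",", ",", 1, "punct"),
    Token(3, "an", "DT", 4, "det"),
    Token(4, "instrument", "NN", 1, "appos"),
    Token(5, ",", ",", 1, "punct"),
    Token(6, "weighs", "VBZ", 0, "root"),
    Token(7, "56", "CD", 8, "nummod"),
    Token(8, "kg", "NN", 6, "dobj"),
]

MODIFIERS = {"nummod", "compound", "amod"}

def children(tokens, head, dep):
    """Tokens attached to `head` with dependency label `dep`."""
    return [t for t in tokens if t.head == head.idx and t.dep == dep]

def phrase(tokens, head):
    """Head word plus its numeric/compound/adjectival modifiers, in sentence order."""
    mods = [t for t in tokens if t.head == head.idx and t.dep in MODIFIERS]
    return " ".join(t.text for t in sorted(mods + [head], key=lambda t: t.idx))

def extract(tokens):
    """Apply two hypothetical patterns: appositive -> is-a, nsubj + dobj -> verb triple."""
    by_idx = {t.idx: t for t in tokens}
    triples = []
    for tok in tokens:
        if tok.dep == "appos":
            triples.append((phrase(tokens, by_idx[tok.head]), "is-a", phrase(tokens, tok)))
        if tok.pos.startswith("VB"):
            for subj in children(tokens, tok, "nsubj"):
                for obj in children(tokens, tok, "dobj"):
                    triples.append((phrase(tokens, subj), tok.text, phrase(tokens, obj)))
    return triples

triples = extract(PARSE)
print(triples)
```

In practice the `PARSE` table would come from a dependency parser, and has-property and has-value relations would need further patterns that this sketch does not attempt.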
NLP tools: NLTK, Stanford CoreNLP, POS Tagger

SEVA-TOIE: Comparison

THANK YOU!