Aprendizado De Máquina E Inferência Em Grafos De Conhecimento Machine Learning and Inference in Knowledge Graphs

Total Page:16

File Type:pdf, Size:1020Kb

Aprendizado De Máquina E Inferência Em Grafos De Conhecimento Machine Learning and Inference in Knowledge Graphs Aprendizado de Máquina e Inferência em Grafos de Conhecimento Machine Learning and Inference in Knowledge Graphs Daniel N. R. da Silva Artur Ziviani Fabio Porto dramos,ziviani,[email protected] Laboratório Nacional de Computação Científica (LNCC) Simpósio Brasileiro de Bancos de Dados (SBBD), Outubro de 2019 Big Data Management Computational Reproducibility Knowledge Bases Machine Learning Network Science Scientific Workflows http://dexl.lncc.br/ Outline • Introduction and Motivation • Applications • Data Models and Systems • Relational Machine Learning • Knowledge Graph Embeddings • Events, Softwares, and Datasets • Open Research 3 Introduction and Motivation The third (current) rise of AI Custom Drug Image Board Games Relationship Discovery and Captioning Management Toxicology Natural Medical Image Medical Online Language Analysis Informatics Advertisement Processing Recommendation Self-Driving Robotics Web Search Systems Vehicles Deng, Li. Artificial Intelligence in the Rising Wave of Deep Learning: The Historical Path and Future 5 Outlook [Perspectives]. IEEE Signal Processing Magazine. 2018. Knowledge Representation and Reasoning (KRR) How to symbolically encode human knowledge and reasoning in such a way that this encoded knowledge can be processed by a computer via encoded reasoning to obtain intelligent behavior. Fragmentary Set of Medium for Medium of Theory of Surrogate Ontological Efficient Human Intelligent Commitments Computation Expression Reasoning Chein, M. and Mugnier, M-L. Graph-based Knowledge Representation: Computational Foundations of 6 Conceptual Graphs. Springer. 2008. Porphyry's Linked Open Commentaries 1960s Ontology 1998 Data 2012 300s AD Semantic 1980s The Semantic 2006 Knowledge Networks Web Graph 7 https://www.gartner.com/smarterwithgartner/5-trends-appear-on-the-gartner-hype-cycle-for-emerging- 8 technologies-2019/ Graph DBMS https://db-engines.com/en/ranking_categories 9 Linked Open Data https://lod-cloud.net/ 10 Knowledge Graph • Network representation for knowledge; • Heterogeneous data integration; • Constrained by an ontology or data schema; • AI applications. 11 Knowledge Graph • Terminological Component • C: Concepts • RC:Concept Relations • TC: Relationships between concepts • TC->A: Attributive relationships • A: Attributes • V: Attribute Values • Assertional Component: • E: Entities • RE: Entity Relations • TE: Relationships between entities • TE->C: Instantiation relationships • TE->V: Attributive relationships 12 13 Related Terms • Knowledge base: Set of sentences (assertions) expressed in a language called a knowledge representation language (semantics and syntax); • (Graph) Database: Storage and data organization; • Ontology: Formal explicit description of concepts in a domain of discourse. It provides sharable and reusable knowledge. • Knowledge Based System: Keeps a knowledge base and performs reasoning. Noy, N. and McGuinness, D. Ontology development 101: A guide to creating your first ontology. 2001. 14 Ehrlinger, L. and Wöß, W. Towards a definition of knowledge graphs. SEMANTiCS. 2016. Related Fields Artificial Intelligence Knowledge Natural Machine Representation Language Learning and Reasoning Processing Data Management Data Information Data Modeling Integration Retrieval 15 General-purpose KGs Bio & Medical KGs NELL Common-sense KGs & NLP 16 Product Graph & E-commerce KGs Big Techs • Apple Siri Knowlege Graph • Amazon Product Graph • Facebook Graph • Google Knowledge Graph • IBM Watson • Microsoft Satori 17 Knowledge Graphs # Entities # Triples # Concepts # Relations DBpedia 4 298 433 411 885 960 736 2 819 Freebase 49 947 799 3 124 791 156 53 092 70 902 OpenCyc 41 029 2 412 520 116 822 18 028 Wikidata 18 697 897 748 530 833 302 280 1 874 YAGO 5 130 031 1 001 461 792 569 751 106 Färber, M. and Rettinger, A. Which Knowledge Graph Is Best for Me? ArXiv. 2019. 18 Knowledge Graphs Accuracy 1 Interlinking Trustworthiness 0.8 0.6 Licensing 0.4 Consistency 0.2 0 Accessibility Relevancy Interoperability Completeness Ease of Timeliness understanding DBpedia Freebase OpenCyc Wikidata YAGO Färber, M. and Rettinger, A. Which Knowledge Graph Is Best for Me? ArXiv. 2019. 19 Färber, M. et al. Linked data quality of dbpedia, freebase, opencyc, wikidata, and yago. Semantic Web. 2018. DBpedia • Crowd-sourced community effort to extract structured content from Wikmedia projects; • Served as Linked Data; • Ontology, including persons, places, creative works, organizations, species, and diseases; • 125 languages; • Links to YAGO and Wikipedia. Lehmann, J et al. DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia.SWJ. 2012. https://dbpedia.org/ Structure in Wikipedia • Title • Abstract • Infoboxes • Geo-coodinates • Categories • Images • Links: • other language versions • other Wikipedia pages • to the web; • redirections; • disambiguations. 21 YAGO • YAGO (Yet Another Great Ontology). • Derived from Wikipedia, WordNet, and Geonames. • 95% accuracy on sample facts. • Temporal Dimension: • Before, after, during, and overlaps • Location Dimension: • northOf, eastOf, southOf, westOf, nearby, and locatedIn. • Multilingual facts. Rebele, T. et al. YAGO - A Multilingual Knowledge Base from Wikipedia, Wordnet, and Geonames. ISWC. 2016. https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/ https://gate.d5.mpi-inf.mpg.de/webyago3spotlxComp/SvgBrowser/ 23 Structured Semi-structured Non-Structured Data Data Data Ex.: Relational Databases. Ex.: XML and JSON Ex.: Text, audio, image, documents. and video files. Knowledge Base Construction Ex.: Entity and relation extraction. Correctness Completeness Knowledge Graph Freshness Refinement Completion and Correction Applications 24 Example Text fragment Marie Curie was born on November 7, 1867, in Warsaw, Poland. She was a french naturalized, physicist and chemist who conducted pioneering research in the fields of radioactivity. Extracted triples (Marie Curie, date-of-birth, 07/11/1867) (Marie Curie, born-in, Poland) (Marie Curie, nationality, French) (Marie Curie, profession, Physicist) (Marie Curie, profession, Chemist) (Marie Curie, research-field, Radioactivity) 25 Polish Chemist born-in Physicist profession profession Marie Curie date-of-birth research-field nationality “07/11/1867” Radioactivity French 26 Challenges Coverage/Completeness Have we got the information we need? Correctness Is information accurate? Freshness Is information up to date? Gao, Yuqing. Building a Large-scale, Accurate and Fresh Knowledge Graph. KDD Tutorial 2018. 27 Applications Applications • Conversational Agents; • Data integration; • Fact checking and fake news detection; • Question Answering; • Recommendation Systems; • Search Engines. 29 Search Engines 30 Recommendation Systems Cast Tom Away Hanks Drama Back Robert Forrest To The Zemeckis Future Gump User SciFi Saving Steven Jurassic Private Spielberg Ryan Park Action Blade Harrison Runner Ford starred-in Movies the director-of Recommended user has genre movies. watched. Wang, H et al. Exploring High-Order User Preference on the Knowledge Graph for Recommender 31 Systems. ACM TOIS. 2019. Conversational Agents Natural Jonh Doe: @bot How much money Language do I have in my account? Processing User Query has Generation Account Balance isA Context JD has Management $ 0,99 Account ChatBot owns Domain Knowledge Jonh Doe Natural Language Bot: @johndoe You have $0,99 in Generation your bank account. 32 Question Answering Which films directed by Steven Spielberg won the Oscar Award? Film directed film Steven won isA Spielberg dep nsubj directed ? ? ... ? nsubj det nmod:by won Oscar Award Which Oscar Award Steven Spielberg Zheng, W et al. Question Answering Over Knowledge Graphs: Question Understanding Via Template Decomposition. PVLDB. 2019. 33 https://nlp.stanford.edu/software/lex-parser.shtml Fact Checking and Fake News Detection Barack Obama Columbia University Association of American Universities Canada Barack Obama secretly practices Islam. Stephen Harper Calagary Naheed Neshi Islam Ciampaglia, G. L. Computational Fact Checking from Knowledge Networks. PLOS ONE. 2015. 34 Personalized Medicine iASiS: Big Data to Support Precision Medicine and Public Health Policy. 2017. 35 Vidal, M-E et al. Semantic Data Integration of Big Biomedical Data for Supporting Personalised 36 Medicine. Springer. 2019. Literome • Extract genomic knowledge from PubMed articles; • Knowledge pertinent to genomic medicine: • Directed genic interactions such as pathways; • Genotype–phenotype associations. Poon, H. Literome: PubMed-scale genomic knowledge base in the cloud. Bioinformatics. 2014. 37 Project HANOVER • Machine reading for accelerating “Curations as a service” (CaaS) in precision medicine: • Molecular Tumor Board: • Cancer is a thousand diseases driven by disparate genetic mutations. • Real-world evidence: • FDA-approved drug takes over a decade and costs more than $2 billion. • Clinical trial matching: • 20% of clinical trials fail due to insufficient patients. https://hanover.azurewebsites.net/ 38 ResearchSpace • Semantic Web / Linked Data application; • Museums, libraries, archives: knowledge or memory institutions • CIDOC-CRM: • Conceptual Reference Model Oldman, D. and Tanase, D. Reshaping the Knowledge Graph by Connecting Researchers, Data and Practices in ResearchSpace. ISWC. 2018. 39 https://www.researchspace.org. Oldman, D. and Tanase, D. Reshaping the Knowledge Graph by Connecting Researchers, Data and Practices in ResearchSpace. ISWC. 2018. 40 https://www.researchspace.org.
Recommended publications
  • Ontotext Platform Documentation Release 3.4 Ontotext
    Ontotext Platform Documentation Release 3.4 Ontotext Apr 16, 2021 CONTENTS 1 Overview 1 1.1 30,000ft ................................................ 2 1.2 Layered View ............................................ 2 1.2.1 Application Layer ...................................... 3 1.2.1.1 Ontotext Platform Workbench .......................... 3 1.2.1.2 GraphDB Workbench ............................... 4 1.2.2 Service Layer ........................................ 5 1.2.2.1 Semantic Objects (GraphQL) ........................... 5 1.2.2.2 GraphQL Federation (Apollo GraphQL Federation) ............... 5 1.2.2.3 Text Analytics Service ............................... 6 1.2.2.4 Annotation Service ................................ 7 1.2.3 Data Layer ......................................... 7 1.2.3.1 Graph Database (GraphDB) ........................... 7 1.2.3.2 Semantic Object Schema Storage (MongoDB) ................. 7 1.2.3.3 Semantic Objects for MongoDB ......................... 8 1.2.3.4 Semantic Object for Elasticsearch ........................ 8 1.2.4 Authentication and Authorization ............................. 8 1.2.4.1 FusionAuth ..................................... 8 1.2.4.2 Semantic Objects RBAC ............................. 9 1.2.5 Kubernetes ......................................... 9 1.2.5.1 Ingress and GW .................................. 9 1.2.6 Operation Layer ....................................... 10 1.2.6.1 Health Checking .................................. 10 1.2.6.2 Telegraf ....................................... 10
    [Show full text]
  • A Question Answering System for Chemistry
    A Question Answering System for Chemistry Preprint Cambridge Centre for Computational Chemical Engineering ISSN 1473 – 4273 A Question Answering System for Chemistry Xiaochi Zhou1, Daniel Nurkowski4, Sebastian Mosbach1;2, Jethro Akroyd1;2, Markus Kraft1;2;3 released: January 25, 2021 1 Department of Chemical Engineering 2 CARES and Biotechnology Cambridge Centre for Advanced University of Cambridge Research and Education in Singapore Philippa Fawcett Drive 1 Create Way Cambridge, CB3 0AS CREATE Tower, #05-05 United Kingdom Singapore, 138602 3 School of Chemical 4 CMCL Innovations and Biomedical Engineering Sheraton House Nanyang Technological University Castle Park, Castle Street 62 Nanyang Drive Cambridge, CB3 0AX Singapore, 637459 United Kingdom Preprint No. 266 Keywords: Question Answering system, Natural language processing, Knowledge Graph Edited by Computational Modelling Group Department of Chemical Engineering and Biotechnology University of Cambridge Philippa Fawcett Drive Cambridge, CB3 0AS United Kingdom CoMo E-Mail: [email protected] GROUP World Wide Web: https://como.ceb.cam.ac.uk/ Abstract This paper describes the implementation and evaluation of a proof-of-concept Ques- tion Answering system for accessing chemical data from knowledge graphs which offer data from chemical kinetics to chemical and physical properties of species. We trained a question type classification model and an entity extraction model to interpret chemistry questions of interest. The system has a novel design which applies a topic model to identify the question-to-ontology affiliation. The topic model helps the sys- tem to provide more accurate answers. A new method that automatically generates training questions from ontologies is also implemented. The question set generated for training contains 80085 questions under 8 types.
    [Show full text]
  • Discussion #1 | University of Texas at Austin | August 13, 2018 Facilitated By: Itza A
    Introduction to Linked Data Discussion #1 | University of Texas at Austin | August 13, 2018 Facilitated by: Itza A. Carbajal, LLILAS Benson Latin American Metadata Librarian Hello! My name is Itza A. Carbajal I will be facilitating this discussion series on Linked Data I am the Latin American Metadata Librarian for a Post Custodial project at LLILAS Benson Have questions? Email me at: [email protected] Like twitter? You can find me at: @archiviststan Housekeeping ◎ Discussions are meant to highlight collective wisdom of the group ◎ Attendance to all discussion meetings not required, but recommended ◎ Take home practices are not required, but encouraged to further an individual’s understanding ◎ Readings are not required, but should be considered as a method for self education or for sharing with others Link to LIVE syllabus: http://bit.ly/SYLLABUSUTLD Link to reading materials: http://bit.ly/READUTLD 1. Discussion #1 Topics Topics to Discuss ◎ Semantic Web ◎ Linked data principles ◎ RDF and Triple statements Related Topics not covered in discussions ◎ Linked Open Data ◎ Examples of linked open data sets ◎ Linked Data platform 2. First, the Semantic Web Semantic Web (project) ◎ Derives from the concept of semantics defined as the study of meanings ◎ Extension to the current World Wide Web ◎ Also known as Web 3.0 where the web can now “read-write-execute” ◎ Changes the current web of information to a web of data including data inside the web and outside ◎ Core functions include semantic markup ○ Semantic markup - data interchange formats accessible to humans and machine ◎ Focuses on adding meaning/context to information giving machines and humans the ability to communicate and cooperate ◎ Relies on machine-readable metadata to expresses that meaning/context ◎ No formal definition, still ongoing and constantly maturing 3.
    [Show full text]
  • Data Platforms Map from 451 Research
    1 2 3 4 5 6 Azure AgilData Cloudera Distribu2on HDInsight Metascale of Apache Kaa MapR Streams MapR Hortonworks Towards Teradata Listener Doopex Apache Spark Strao enterprise search Apache Solr Google Cloud Confluent/Apache Kaa Al2scale Qubole AWS IBM Azure DataTorrent/Apache Apex PipelineDB Dataproc BigInsights Apache Lucene Apache Samza EMR Data Lake IBM Analy2cs for Apache Spark Oracle Stream Explorer Teradata Cloud Databricks A Towards SRCH2 So\ware AG for Hadoop Oracle Big Data Cloud A E-discovery TIBCO StreamBase Cloudera Elas2csearch SQLStream Data Elas2c Found Apache S4 Apache Storm Rackspace Non-relaonal Oracle Big Data Appliance ObjectRocket for IBM InfoSphere Streams xPlenty Apache Hadoop HP IDOL Elas2csearch Google Azure Stream Analy2cs Data Ar2sans Apache Flink Azure Cloud EsgnDB/ zone Platforms Oracle Dataflow Endeca Server Search AWS Apache Apache IBM Ac2an Treasure Avio Kinesis LeanXcale Trafodion Splice Machine MammothDB Drill Presto Big SQL Vortex Data SciDB HPCC AsterixDB IBM InfoSphere Towards LucidWorks Starcounter SQLite Apache Teradata Map Data Explorer Firebird Apache Apache JethroData Pivotal HD/ Apache Cazena CitusDB SIEM Big Data Tajo Hive Impala Apache HAWQ Kudu Aster Loggly Ac2an Ingres Sumo Cloudera SAP Sybase ASE IBM PureData January 2016 Logic Search for Analy2cs/dashDB Logentries SAP Sybase SQL Anywhere Key: B TIBCO Splunk Maana Rela%onal zone B LogLogic EnterpriseDB SQream General purpose Postgres-XL Microso\ Ry\ X15 So\ware Oracle IBM SAP SQL Server Oracle Teradata Specialist analy2c PostgreSQL Exadata
    [Show full text]
  • Constructing Reference Semantic Predictions from Biomedical Knowledge Sources
    Constructing Reference Semantic Predictions from Biomedical Knowledge Sources Demeke Ayele1, J. P. Chevallet2, Million Meshesha1, Getnet Kassie1 (1) Addis Ababa University, Addis Ababa, Ethiopia (2) University of Grenoble, Grenoble, France {demekeayele, meshe8, getnetmk}@gmail.com, jean- [email protected] ABSTRACT Semantic tuples are core component of text mining and knowledge extraction systems in biomedicine. The practical success of these systems significantly depends on the correctness and quality of the extracted semantic tuples. The quality and correctness of the semantic predictions can be measured against a benchmark semantic structure. In this article, we presented an approach for constructing a reference semantic tuple structure based on the existing biomedical knowledge sources in which the evaluation is based on the UMLS knowledge sources. In the evaluation, 7400 semantic triples are extracted from UMLS knowledge sources and the semantic predictions are constructed using the proposed approach. In the semantic triples, 87 concepts are found redundantly classified and 207 pair of semantic triples showed hierarchically inconsistent. 128 are found to be non-taxonomically inconsistent. The quality of the semantic triple is also judged using expert evaluators. The Cohen's kappa coefficient is used to measure the degree of agreement between two evaluators and the result is promising (0.9). Construire des prévisions de référence sémantique à partir de sources de connaissances biomédicales Les "tuples sémantiques" forment un élément essentiel à la fouille de texte et aux systèmes d'extraction de connaissances dans le domaine biomédical. Le succès en pratique des systèmes exploitant ces informations sémantique, dépend fortement de l'exactitude et de la qualité des tuples sémantiques.
    [Show full text]
  • Distributed Semantic Sensor Networks How to Use Semantics and Knowledge Distribution to Integrate Sensor Data of Disparate Data Sources
    2007 Distributed Semantic Sensor Networks How to use semantics and knowledge distribution to integrate sensor data of disparate data sources. In this research project we focused on sensor networks and how semantic web technologies and knowledge sharing can make integration of distributed sensor data possible. To achieve this we have first identified the main challenges to address to allow for data integration. A distributed semantic sensor network framework is proposed that uses semantic web techniques and knowledge sharing to address these challenges. We propose that knowledge sharing in a sensor network is a key aspect of data integration in a dynamic environment, since it allows the network to handle changes in the environment. This project is innovative in that it proposes new ways to handle semantic integration in a distributed environment, by using question federation and data conversion on semantically annotated data and questions. It contributes to current research in that our framework enables data integration of distributed sensor data. Masters Student M.J. van der Veen, Department of Artificial Intelligence, Rijksuniversiteit Groningen, Groningen Supervisors Bart Verheij, Department of Artificial Intelligence, Rijksuniversiteit Groningen Arnoud de Jong, TNO ICT & Telecommunicatie, Groningen 1 People A masters student at the Department of Artificial Intelligence at the Rijksuniversiteit Groningen in The Netherlands. This thesis is the final work in his masters studies. His main interests besides writing this thesis is doing sports, coaching, travelling and making music . Masters Student: Maarten van der Veen A tenured lecturer/researcher (in Dutch: universitair docent) at the University of Groningen, Department of Artificial Intelligence and a member of the ALICE institute.
    [Show full text]
  • Benchmarking RDF Query Engines: the LDBC Semantic Publishing Benchmark
    Benchmarking RDF Query Engines: The LDBC Semantic Publishing Benchmark V. Kotsev1, N. Minadakis2, V. Papakonstantinou2, O. Erling3, I. Fundulaki2, and A. Kiryakov1 1 Ontotext, Bulgaria 2 Institute of Computer Science-FORTH, Greece 3 OpenLink Software, Netherlands Abstract. The Linked Data paradigm which is now the prominent en- abler for sharing huge volumes of data by means of Semantic Web tech- nologies, has created novel challenges for non-relational data manage- ment technologies such as RDF and graph database systems. Bench- marking, which is an important factor in the development of research on RDF and graph data management technologies, must address these challenges. In this paper we present the Semantic Publishing Benchmark (SPB) developed in the context of the Linked Data Benchmark Council (LDBC) EU project. It is based on the scenario of the BBC media or- ganisation which makes heavy use of Linked Data Technologies such as RDF and SPARQL. In SPB a large number of aggregation agents pro- vide the heavy query workload, while at the same time a steady stream of editorial agents execute a number of update operations. In this paper we describe the benchmark’s schema, data generator, workload and re- port the results of experiments conducted using SPB for the Virtuoso and GraphDB RDF engines. Keywords: RDF, Linked Data, Benchmarking, Graph Databases 1 Introduction Non-relational data management is emerging as a critical need in the era of a new data economy where heterogeneous, schema-less, and complexly structured data from a number of domains are published in RDF. In this new environment where the Linked Data paradigm is now the prominent enabler for sharing huge volumes of data, several data management challenges are present and which RDF and graph database technologies are called to tackle.
    [Show full text]
  • Applying the Semantic Web to Computational Chemistry
    Applying the Semantic Web to Computational Chemistry Neil S. Ostlund, Mirek Sopek Chemical Semantics Inc. , Gainesville, Florida, USA {ostlund, sopek}@chemicalsemantics.com Abstract. Chemical Semantics is a new start-up devoted to bringing the Semantic Web to computational chemistry, with a future goal to cover chemical results of other kinds, including experimental. We have created a prototype test bed that enables computational chemists to publish the results of their computations (for example: structure, energies and wave functions of Ab Initio calculations) on servers (powered by semantic triple stores) holding RDF data. In addition, scientists will be able to search across results produced by themselves and other researchers using SPARQL queries and faceted search built on top of it. The semantics of computational chemistry oriented RDF data is defined by an ontology identified as “GC” (Gainesville Core). Gainesville Core defines concepts such as: theoretical models used in computations, molecular systems (as collections of atoms and molecules), molecular arrangements which describe system evolution in time or in phase space and so on. The poster demonstrates the project during its early stage, including a description and demonstration of the prototype portal created by Chemical Semantics. This paper is the concise description of the poster content. Keywords: computational chemistry, semantic web, linked data, Gainesville Core, ontology Chemical Semantics portal main targets The Chemical Semantic portal, now in its prototype version, has the following long term goals: 1. Interoperable PUBLISHING of Computational Chemistry calculations using Linked Data principles 2. Enhanced search across all published results with basic reasoning enabled by the new ontology 3. FEDERATION of published data with existing web-based chemical datasets 4.
    [Show full text]
  • Relext: Relation Extraction Using Deep Learning Approaches for Cybersecurity Knowledge Graph Improvement
    RelExt: Relation Extraction using Deep Learning approaches for Cybersecurity Knowledge Graph Improvement Aditya Pingle, Aritran Piplai, Sudip Mittal, Anupam Joshi James Holt, and Richard Zak University of Maryland, Baltimore County, Baltimore, MD 21250, USA Laboratory for Physical Sciences fadiping1, apiplai1, smittal1, [email protected] fholt, [email protected] Abstract—Security Analysts that work in a ‘Security Opera- cybersecurity reports, CVE [21], CWE [22], National Vul- tions Center’ (SoC) play a major role in ensuring the security nerability Datasets [28], and any cybersecurity text publicly of the organization. The amount of background knowledge they available. Various SIEM systems, fetch, process, and store have about the evolving and new attacks makes a significant difference in their ability to detect attacks. Open source threat much of the cyber threat intelligence available through OSINT intelligence sources, like text descriptions about cyber-attacks, sources. Knowledge Graphs (KG) specifically to store cyber can be stored in a structured fashion in a cybersecurity knowl- threat intelligence has been used by many cybersecurity in- edge graph. A cybersecurity knowledge graph can be paramount formatics systems. These cybersecurity knowledge graphs can in aiding a security analyst to detect cyber threats because it not only store but are also be able to retrieve data and answer stores a vast range of cyber threat information in the form of semantic triples which can be queried. A semantic triple complex queries asked by the security analysts. contains two cybersecurity entities with a relationship between The dependence of various cybersecurity informatics sys- them. In this work, we propose a system to create semantic tems on various knowledge representation schemes makes triples over cybersecurity text, using deep learning approaches it imperative that we develop systems that improve these to extract possible relationships.
    [Show full text]
  • Graphdb-Free.Pdf
    GraphDB Free Documentation Release 8.5 Ontotext Jun 17, 2019 CONTENTS 1 General 1 1.1 About GraphDB...........................................2 1.2 Architecture & components.....................................2 1.2.1 Architecture.........................................2 1.2.1.1 RDF4J.......................................3 1.2.1.2 The Sail API....................................4 1.2.2 Components.........................................4 1.2.2.1 Engine.......................................4 1.2.2.2 Connectors.....................................5 1.2.2.3 Workbench.....................................5 1.3 GraphDB Free............................................5 1.3.1 Comparison of GraphDB Free and GraphDB SE......................6 1.4 Connectors..............................................6 1.5 Workbench..............................................6 2 Quick start guide 9 2.1 Run GraphDB as a desktop installation...............................9 2.1.1 On Windows........................................ 10 2.1.2 On Mac OS......................................... 10 2.1.3 On Linux.......................................... 10 2.1.4 Configuring GraphDB................................... 10 2.1.5 Stopping GraphDB..................................... 11 2.2 Run GraphDB as a stand-alone server................................ 11 2.2.1 Running GraphDB..................................... 11 2.2.1.1 Options...................................... 11 2.2.2 Configuring GraphDB................................... 12 2.2.2.1 Paths and network settings...........................
    [Show full text]
  • Remote Sensing
    Remote Sens. 2015, 7, 9473-9491; doi:10.3390/rs70709473 OPEN ACCESS remote sensing ISSN 2072-4292 www.mdpi.com/journal/remotesensing Article Improving the Computational Performance of Ontology-Based Classification Using Graph Databases Thomas J. Lampoltshammer 1,2,* and Stefanie Wiegand 3 1 School of Information Technology and Systems Management, Salzburg University of Applied Sciences, Urstein Süd 1, Puch, Salzburg 5412, Austria 2 Department of Geoinformatics (Z_GIS), University of Salzburg, Schillerstrasse 30, Salzburg 5020, Austria 3 IT Innovation Centre, University of Southampton, Gamma House, Enterprise Road, Southampton SO16 7NS, UK; E-Mail: [email protected] * Author to whom correspondence should be addressed; E-Mail: [email protected]; Tel.: +43-50-2211 (ext. 1311); Fax: +43-50-2211 (ext. 1349). Academic Editors: Ioannis Gitas and Prasad S. Thenkabail Received: 31 March 2015 / Accepted: 17 July 2015 / Published: 22 July 2015 Abstract: The increasing availability of very high-resolution remote sensing imagery (i.e., from satellites, airborne laser scanning, or aerial photography) represents both a blessing and a curse for researchers. The manual classification of these images, or other similar geo-sensor data, is time-consuming and leads to subjective and non-deterministic results. Due to this fact, (semi-) automated classification approaches are in high demand in affected research areas. Ontologies provide a proper way of automated classification for various kinds of sensor data, including remotely sensed data. However, the processing of data entities—so-called individuals—is one of the most cost-intensive computational operations within ontology reasoning. Therefore, an approach based on graph databases is proposed to overcome the issue of a high time consumption regarding the classification task.
    [Show full text]
  • An Approach for Knowledge Graph Construction from Spanish Texts
    ISSN 1870-4069 An Approach for Knowledge Graph Construction from Spanish Texts Andrea G. Garcia Perez1, Ana B. Rios Alvarado1, Tania Y. Guerrero Melendez1, Edgar Tello Leal1, Jose L. Martinez Rodriguez2 1 Autonomous University of Tamaulipas, Mexico 2 CINVESTAV Unidad Tamaulipas, Mexico [email protected], farios, tyguerre, [email protected], [email protected] Abstract. Knowledge Graphs (KGs) represent valuable data sources for the Educational and Learning field, allowing computers and peo- ple to process and interpret information in an easier way. In addition KGs are useful to get information by software systems on any data re- source (person, place, organization, among others). KGs are represented through triples composed of entities and semantic relations commonly obtained from textual resources. However, exploiting triples from text is complicated due to linguistic variations, where Spanish has been slightly approached. This paper presents a method for constructing KGs from textual resources in Spanish. The method is composed of four stages: 1) obtaining of documents, 2) identification of entities and their relations, 3) association of entities to linked data resources, and 4) the construction of a schema for the KGs. The experiments were performed over a set of doc- uments in a general and computer science domain. Our method showed encouraging results for the precision in the identification of entities and semantic relationships for the KGs constructed from unstructured texts. Keywords: knowledge graph, construction of graph from texts. 1 Introduction In recent years the Web has become a global repository that represents a source of knowledge in a wide range of domains and languages, where the information is shared and stored in different formats.
    [Show full text]