Editor: Michiel van Genuchten VitalHealth Software IMPACT [email protected]

Editor: Les Hatton Oakwood Computing Associates [email protected]

Drawing Conclusions from Linked Data on the Web The EYE Reasoner

Ruben Verborgh and Jos De Roo

This issue’s installment examines a software program reasoning about the world’s largest knowledge source. Ruben Verborgh and Jos De Roo describe how a small open source project can have a large impact. This is the fourth open source product discussed in the Impact department and the rst written in the logic programming language Prolog. —Michiel van Genuchten and Les Hatton

THE WEB is the world’s largest source If you look closely at Figure 1, you’ll of knowledge for people—and ma- notice that the link type itself (the prop- chines. In the beginning, those machines erty) is also a URL. So, if a machine were mostly search engine crawlers doesn’t understand what “knows” that extracted keywords from natural- means, it can look it up by following language texts. But now, the Web of- that URL. This principle is crucial to fers them something far more powerful: linked data: if you don’t know some- linked data. thing, look it up. Which Thomas are Linked data goes back to the essence we talking about? What does “knows” of the Web and information itself, by mean? Follow the URL to nd out. representing each piece of data as a link If you follow the URL for this par- between two things. For example, Fig- ticular “knows” (http://xmlns.com/foaf ure 1 shows a triple stating that Thomas /0.1/knows), you’ll learn about the na- Edison “knows” . Unlike ture of this relationship. First, using most hyperlinks between Edison and “knows” means the involved subject Tesla, this one carries a speci c mean- and object are people. So even if we ing. Yet linked data’s real bene t goes don’t know Thomas or Nikola, we know deeper: Edison and Tesla are repre- they’re people (as opposed to pets or car- sented by their Web address or URL. So, toon gures). Second, this “knows” indi- if you want to know more about Edison cates reciprocity, so Nikola also knows or Tesla, you can follow their URLs. Thomas. As humans, we can derive this Therefore, linked data is linked on two without even being aware of who Nikola levels: each triple links two concepts, or Thomas are. and those concepts link to more infor- Such pieces of derived knowledge mation about themselves. seem human-speci c, but linked data

0740-7459/15/$31.00 © 2015 IEEE MAY/JUNE 2015 | IEEE SOFTWARE 23 IMPACT

links that object to the subject. http://dbpedia.org/resource/Thomas_Edison Subject So, if we give EYE the Edison– knows–Tesla triple, the description http://xmlns.com/foaf/0.1/knows Predicate of “knows,” and the previous rule, http://dbpedia.org/resource/Nikola_Tesla Object EYE will derive that Tesla also knows Edison. In isolation, this might not look FIGURE 1. At the heart of linked data are triples: predicates link a subject to an object. spectacular. After all, we gave EYE This triple states that “knows” Nikola Tesla. all the knowledge it needed. How- ever, the rules can cascade at a large scale, so it becomes interest- to turn that data into knowl- ing if we give EYE tons of knowl- Software Machines edge and concrete actions. edge—say, millions of triples—and Continuing the previous it derives just the facts we want. For Generic reasoning engine example, let’s see how EYE instance, we might ask for “contem- Notation3 P-code moves from data to conclu- poraries of Nikola Tesla,” and by sions. We can describe the the (derived) fact that Tesla knows Euler abstract machine knows property with existing Edison, EYE could find Edison as Prolog VM code concepts such as domain, range, an answer. In a different use case, and SymmetricProperty. Many rea- we might feed EYE with medical Prolog virtual machine soners have built-in knowl- information of millions of patients Assembly code edge; they would know what to identify adverse drug effects that those concepts mean and how previously were hidden. EYE is used Central processing unit to apply them. The draw- regularly to solve such real-world back, of course, is that only clinical applications. built-in concepts can cre- FIGURE 2. The EYE stack offers a generic ate new knowledge. So, EYE EYE’s Internals reasoning engine on top of interchangeable virtual was designed to have the least Algorithmically speaking, EYE is machines. amount of inherent knowl- a theorem prover. Users set a goal, edge; it’s extensible through and EYE tries to reach it by applying rules so that new knowledge logical rules similar to what we men- also lets software reasoners arrive at can be created. For example, the tioned previously, mostly working the same conclusions. This column EYE webpage offers rules that model backward from the goal. To evade discusses how the EYE reasoner can the SymmetricProperty concept as follows: endless loops, the algorithm avoids do just that and how industry is al- needlessly repeating previous work ready using this. { through Euler path detection, simi- ?property rdf:type owl:SymmetricProperty. lar to Leonhard Euler’s Königsberg Reasoning on the ?subject ?property ?object. bridge problem. EYE interprets each Semantic Web with EYE } logical rule P ⇒ C (where P is a pre- Since 2006, EYE (available un- => condition and C is a consequent) as der the W3C license at http:// { P AND NOT(C) ⇒ C, so that rules eulersharp.sourceforge.net) has been ?object ?property ?subject. execute only when they can generate tackling such reasoning challenges }. new triples. on a large scale, thereby forming a A key characteristic of EYE’s ar- part of the bigger Semantic Web vi- Perhaps surprisingly, this rule is chitecture is portability, because sion. In this vision, machines per- also a (special) triple: antecedent–­ interoperability is crucial to the Se- form tasks on the Web for people, then–consequent. It explains that, mantic Web. EYE components are combining linked data with con- if a symmetric property links a interchangeable (see Figure 2). EYE cepts such as reasoning and proof subject to an object, then it also runs in a Prolog virtual machine,

24 IEEE SOFTWARE | WWW.COMPUTER.ORG/SOFTWARE | @IEEESOFTWARE IMPACT

which is compiled to assembly code and runs directly on the CPU. The 2,500 core of EYE, the Euler Abstract Machine, is compatible with at least 2,000 two major Prolog engines (YAP and SWI-Prolog). This machine accepts N3 (Notation3) P-code, which is a 1,500 Prolog representation resulting from parsing RDF (Resource Description 1,000

Framework) triples and N3 rules. No. of download s The entire stack thereby becomes a 500 generic reasoning engine, which is extensible with any kind of domain- 0 specific rules. Because EYE can out- 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 put N3 P-code to a file, users can Year create specific reasoning-engine in- stances that have a certain rule set preloaded for a particular domain. FIGURE 3. EYE has been downloaded 30,000 times, at an average of 200 downloads EYE also exhibits external com- per month. patibility: it accepts exchangeable N3 and Turtle RDF documents from any source on the Web. Us- predominant and only implemen- domain modeled in RDF (and N3 ers can ask EYE to generate a proof tation, aptly named EYE (first for for rules). In particular, if a domain explaining how the given goal is Euler YAP [Prolog] Engine, then for was modeled with existing core on- reached. Such proofs use a publicly Euler Yet Another Proof Engine). tological concepts from the Semantic available, interoperable vocabulary, Since then, more than 150 updates Web, such as those from the RDFS so other parties can follow and of EYE with new features and bug (RDF Schema) and OWL vocabular- understand the line of reasoning. fixes have been released; the sched- ies, EYE can derive new triples right Most important, this mechanism ule was recently adapted to a three- away because it can reuse the exist- allows independent proof valida- month cycle. The number of down- ing N3 rules for these concepts. tion, which contributes to one of loads from SourceForge has steadily In particular, EYE has several the forms of trust on the Seman- increased, with an average of 200 active applications in the medical tic Web. If a certain party comes per month and a total of 30,000 (see sector.1 One example is the SALUS to a conclusion, any other party Figure 3). EYE is also available as a (Scalable, Standard Based Interop- can thus verify why that conclu- public Web service and a Docker im- erability Framework for Sustainable sion is valid. Furthermore, EYE is age (with 500 downloads so far). Proactive Post Market Safety Stud- compatible with the W3C’s Rule Jos De Roo develops and main- ies) project, for which Agfa Health- Interchange Format (RIF), so the tains EYE, and 30 people in its com- care is developing a semantic in- rules that constitute deductions and munity support it through bug re- teroperability service for postmarket proofs can be exchanged even be- ports, which have totaled more than drug safety studies. Different data yond the N3 language. 200 so far. These reports are usually sources each employ different on- addressed within days. EYE’s cur- tologies, and EYE lets users trans- EYE’s Development and Use rent code base consists of roughly form queries and data to and from EYE originated around 1999, the 10,000 lines of Prolog source code, CREAM (Clinical Research Entities beginning of the Semantic Web era, which amounts to 100,000 lines of Patterns: Advanced Model). This as a program called Euler. After Prolog virtual-machine code. works through rules for mapping several iterations in other program- Being a generic reasoner, EYE is each ontology to CREAM. The Paris ming languages, the Prolog incarna- applicable in a range of contexts. In public hospital system is employing tion started in 2006, becoming the essence, it can tackle any problem EYE for similar use cases.

MAY/JUNE 2015 | IEEE SOFTWARE 25 IMPACT

available in RDF format. EYE’s role RELATED WORK IN REASONERS there is clear: ensuring that linked data in one vocabulary can be eas- A few alternatives to the EYE reasoner exist—most notably, the cwm, Jena, ily transformed into another, thanks and Fuxi reasoners. However, EYE offers far superior performance. For in- to explicit relations between those stance, EYE solves the Deep Taxonomy Benchmark problem (http://eulersharp vocabularies. Because these building .sourceforge.net/2003/03swap/dtb-note) for 100,000 triples in 4.8 seconds, blocks are mostly in place, the EYE whereas cwm needs nine days and Jena goes out of memory. Researchers project will focus on unifying logic have reported similar drastic performance improvements with the RESTdesc and explainable reasoning. Composition Benchmark (http://eulersharp.sourceforge.net/2003/03swap Unifying logic designates a logic /rcb-note) and Basic MONADIC Benchmark (http://eulersharp.sourceforge framework that can be shared across .net/2003/03swap/bmb-note). different agents on the Web. So far, Three other products described previously in IEEE Software’s Impact de- Semantic Web logic has been frag- partment are similar to EYE. RealPlayer is also an open source product that mented. N3 hasn’t been standard- has seen industrial applications.1 It’s much larger (millions of lines of code). In ized, but the Turtle syntax on which 2010, the community comprised 150,000 engineers, many of whom worked it’s based was  nally standardized in mobile phone companies that were part of the Helix community. YAWL (Yet in 2014. In practice, not everything Another Work ow Language), another open source product, was also written can be expressed elegantly on the by a small team of engineers and is comparable in size and installed base.2 level of N3 itself. Some basic blocks, Bayesian networks are similar to EYE in terms of scienti c grounding, the such as mathematical addition or availability of a free version, size, and volume.3 string comparison, are best exposed through built-ins, special predicates that aren’t necessarily implemented References in N3. To communicate with each 1. L. Bouchard, “Multimedia Software for Mobile Phones,” IEEE Software, vol. 27, no. 3, 2010, other, agents must agree on a pre- pp. 8–10. de ned set of built-ins with a rig- 2. M.J. Adams, A.H.M. ter Hofstede, and M. La Rosa, “Open Source Software for Work ow Man- agement: The Case of YAWL,” IEEE Software, vol. 28, no. 3, 2011, pp. 16–19. orously speci ed meaning. If they 3. N. Fenton and M. Neil, “Decision Support Software for Probabilistic Risk Assessment Using don’t, results and corresponding Bayesian Networks,” IEEE Software, vol. 31, no. 2, 2014, pp. 21–26. proofs will be untrustworthy. Al- though the RIF standard includes such a set, adoption has been mini- mal. EYE and other reasoners must strive to improve this. Explainable reasoning indicates Even though the use of reason- composition, data, (regular) rules, the possibility of agents producing ing isn’t always obvious, EYE is at and services work seamlessly to- and exchanging proofs. Any con- the heart of several applications. gether. So, regular ontological rea- clusion reached by a reasoner must Ghent University and iMinds are us- soning can take place at the same be independently veri able to en- ing EYE’s proof capabilities to solve time as service composition, creating able real-world applications. For the problem of semantic service com- service connections that were previ- instance, for  nancial or safety cal- position.2 Each service’s function- ously infeasible. culations, all the involved parties ality is expressed in the RESTdesc For a look at technologies similar want to ensure they can trust the language, consisting of N3 rules, to EYE, see the sidebar. result. And, if these proofs are used so that EYE can reach a goal with in work ow contexts, they re ect on them—even though they’re not static EYE’s Future an entire agent-driven process. EYE data rules. Agfa Healthcare experi- The Semantic Web world is still wants to play a role in such an agent- ments with RESTdesc rules for se- under development, but the linked enabled Web, and the current proof- mantic clinical work ow composi- data principles have already inspired enabled reasoning functionality is tion and execution. In this kind of many people to make their data the  rst step toward that.

26 IEEE SOFTWARE | WWW.COMPUTER.ORG/SOFTWARE | @IEEESOFTWARE IMPACT

n the future, reasoners could

have a large impact on software. PURPOSE: The IEEE Computer Society is the world’s largest association of computing A few decades ago, software professionals and is the leading provider of technical information in the field. I MEMBERSHIP: Members receive the monthly magazine Computer, discounts, and took over many jobs from hardware: calculations that used to be hard- opportunities to serve (all activities are led by volunteer members). Membership is open to all IEEE members, affiliate society members, and others interested in the wired became soft-wired as lines of computer field. code. Although some highly opti- mized calculations still happen in COMPUTER SOCIETY WEBSITE: www.computer.org Next Board Meeting: 1–5 June 2015, Atlanta, GA, USA hardware, most are now performed with special-purpose software on EXECUTIVE COMMITTEE President: Thomas M. Conte generic-purpose hardware. Simi- President-Elect: Roger U. Fujii; Past President: Dejan S. Milojicic; Secretary: larly, reasoners will start rewiring Cecilia Metra; Treasurer, 2nd VP: David S. Ebert; 1st VP, Member & Geographic software for specific situations. EYE Activities: Elizabeth L. Burd; VP, Publications: Jean-Luc Gaudiot; VP, Professional already does this when performing & Educational Activities: Charlene (Chuck) Walrad; VP, Standards Activities: Don rule-based Web service composition. Wright; VP, Technical & Conference Activities: Phillip A. Laplante; 2015–2016 IEEE Director & Delegate Division VIII: John W. Walz; 2014–2015 IEEE Director & In the end, this opens the door to Delegate Division V: Susan K. (Kathy) Land; 2015 IEEE Director-Elect & Delegate automatically customized software Division V: Harold Javid processes. What software once did BOARD OF GOVERNORS for hardware, reasoning might one Term Expiring 2015: Ann DeMarle, Cecilia Metra, Nita Patel, Diomidis Spinellis, day do for software. Phillip A. Laplante, Jean-Luc Gaudiot, Stefano Zanero Term Expriring 2016: David A. Bader, Pierre Bourque, Dennis J. Frailey, Jill I. Gostin, Atsuhiro Goto, Rob Reilly, Christina M. Schober Term Expiring 2017: David Lomet, Ming C. Lin, Gregory T. Byrd, Alfredo Benso, Forrest Shull, Fabrizio Lombardi, Hausi A. Muller References 1. E. Papageorgiou et al., “Application of EXECUTIVE STAFF Probabilistic and Fuzzy Cognitive Ap- Executive Director: Angela R. Burgess; Director, Governance & Associate Executive proaches in Semantic Web Framework for Director: Anne Marie Kelly; Director, Finance & Accounting: John G. Miller; Medical Decision Support,” Computer Director, Information Technology Services: Ray Kahn; Director, Membership: Eric Methods and Programs in Biomedicine, vol. 112, no. 3, 2013, pp. 590–598. Berkowitz; Director, Products & Services: Evan M. Butterfield; Director, Sales & 2. R. Verborgh et al., “Capturing the Func- Marketing: Chris Jensen tionality of Web Services with Functional COMPUTER SOCIETY OFFICES Descriptions,” Multimedia Tools and Applications, vol. 64, no. 2, 2012, pp. Washington, D.C.: 2001 L St., Ste. 700, Washington, D.C. 20036-4928 365–387. Phone: +1 202 371 0101 • Fax: +1 202 728 9614 • Email: [email protected] Los Alamitos: 10662 Los Vaqueros Circle, Los Alamitos, CA 90720 Phone: +1 714 821 8380 • Email: [email protected] Membership & Publication Orders RUBEN VERBORGH is a postdoctoral re­searcher in semantic hypermedia at the Multimedia Lab, Phone: +1 800 272 6657 • Fax: +1 714 821 4641 • Email: [email protected] a research group of iMinds and Ghent University. Asia/Pacific: Watanabe Building, 1-4-2 Minami-Aoyama, Minato-ku, Tokyo 107- Contact him at [email protected]. 0062, Japan • Phone: +81 3 3408 3118 • Fax: +81 3 3408 3553 • Email: tokyo.ofc@ computer.org JOS DE ROO is a researcher at Agfa Healthcare. Contact him at [email protected]. IEEE BOARD OF DIRECTORS President & CEO: Howard E. Michel; President-Elect: Barry L. Shoop; Past President: J. Roberto de Marca; Director & Secretary: Parviz Famouri; Director & Treasurer: Jerry Hudgins; Director & President, IEEE-USA: James A. Jefferies; Director & President, Standards Association: Bruce P. Kraemer; Director & VP, Educational Activities: Saurabh Sinha; Director & VP, Membership and Geographic Activities: Wai-Choong Wong; Director & VP, Publication Services and Products: Sheila Hemami; Director & VP, Technical Activities: Vincenzo Piuri; Director & Delegate Division V: Susan K. (Kathy) Land; Director & Delegate Division VIII: Selected CS articles and columns John W. Walz

are also available for free at revised 27 Jan. 2015 http://ComputingNow.computer.org.

MAY/JUNE 2015 | IEEE SOFTWARE 27