Metadata inference

Course "Empirical Software Engineering"

University of Koblenz-Landau
Department of Computer Science
Ralf Lämmel
Software Languages Team

© 2013 101companies and Ralf Lämmel

Overall research context

Objective

[Figure: your arbitrary program, passed through a prism, is split into languages, technologies, concepts, and features. Prism image: https://commons.wikimedia.org/wiki/File:Prism-rainbow.svg]

Raw research questions

How to automatically detect ...
• used software languages,
• used software technologies,
• relevant software concepts,
• implemented system features?

... perhaps specifically in the context of the 101companies project

What’s the problem?

Too many languages. Too many technologies. Too many concepts. Too little time.

[Word cloud: Sesame, XPATH, TXL, JPA, Jena, Rose, JDBC, EMF.gen, jDOM, JAXB, Jersey, RDF(S), Jena, UTF8, ODM, MOF, XSD, JeanBeans, Teneo, UML, VLDB, Stratego, BNF, xerces, GWT, SLE2010, Json, OCL, sax, RDFS, saxon, OWL, OWL, Ecore, Rest, MySQL, ORACLE, RDF, Jean, OMG, XMI, JMI, JMF, EMF, XSD, ArgoUML, xalan, RDFa, ODBC, SparQL, XMLSpy, Yacc, LALR, Prolog, XSLT, JAXP, Protegé, SBVR, DOM, Java, ATOM, ER, CFG, SQL, DDL, Dragan, SQL, Antlr, QVT, TENEO, XLST, Awk, DTD, sed, Saxon, TCS, grep, ASCII, XSD, XQuery]

Research challenges

• New 101contributions with unforeseen technologies and concepts and new ways of implementing features.
• Different platforms and repositories: Apache, .NET Framework, gem, Maven repository, hackageDB, ...
• Lack of an established ontology: sets of tags or categories used by the different platforms and repositories.
• Overall: scalability, generality

Let’s focus on software chrestomathies.

Software chrestomathies

What’s a software chrestomathy?

• Quick definition
  ‣ A collection of software systems (‘programs’)
    ✦ designed as an aid in learning a CS subject
    ✦ exercising diverse languages & technologies
• Examples (more or less)
  ‣ 99 Bottles of Beer
  ‣ Rosetta Code
  ‣ HollingBerries
  ‣ 101companies

The 101companies chrestomathy

A collection of human-resources management systems: code + documentation

Company X: Swing + JDBC

Company Y: SWT + Hibernate

Company Z: GWT + MongoDB

...

• Total salaries
• Increase salaries
• Cut salaries
• Edit employee data
• Import / export company data

“101” covers many languages

• Java • JSON • Text • WSDL • C# • Ecore • Markdown • Rascal • VB.NET • PHP • Prolog • Javascript • Python • PHP4 • F# • 101meta • XML • PHP5 • Ruby • GIF • SQL • CSS • C++ • Haskell • .sh • XMI • Cobol • XHTML • PNG • AspectJ • (make) • .properties • ATL • Smalltalk • (Ant) • .ini • XSD • JAR • HTML • Scala • Erlang • ...

“101” covers many technologies

• make • SOA • Java RMI • Ant • Hibernate • JPA • JDBC • Maven • Java collections • EMFCompare • zip • sbt • VS • 101explorer • jEdit • JAXB • java.util • Mediawiki • jdom • xjc • java.io • hackage • w3c.dom • EF • java.lang.reflect • cabal • dom4j • xsd.exe • LINQ • ghci • xom • XmlSerializer • System.Xml • ghc • SAX • ANTLR • System.Xml.Linq • Swing • javac • ADO • AWT • ...

One particular implementation

Part of the documentation

Traceability between code and documentation

The traceability challenge for a general software product

The traceability challenge for a software chrestomathy

The traceability challenge

Metamodels at hand: 101repo vs. 101wiki

The 101meta language for rule-based metadata specification

• Check condition on files
• Associate metadata with files or fragments
  ‣ Languages
  ‣ Technologies
  ‣ Concepts
  ‣ Tools for validation, fragment selection, ...

(“Metadata inference”)

Links established by 101meta rules


101meta rules for ANTLR

Wanted! 101meta rules for ANTLR

When is a file ...
• ... the ANTLR library?
• ... an ANTLR grammar?
• ... an ANTLR-generated parser?
• ... a source that imports ANTLR?

(See more rules online.)

What’s ANTLR?

http://www.antlr.org

ANTLR-style grammar for companies:

    company :
      'company' STRING '{' department* '}' EOF;

    department :
      'department' STRING '{'
        ('manager' employee)
        ('employee' employee)*
        department*
      '}';

    employee :
      STRING '{'
        'address' STRING
        'salary' FLOAT
      '}';

Grammars were hot in the 70s? You are so wrong!

http://101companies.org/wiki/Theme:ANTLR

ANTLR theme: ANTLR-centric grammarware theme

• yapg: An ANTLR-based generator for text-to-object mappings
• antlrAcceptor: An ANTLR-based acceptor for textual syntax
• antlrLexer: Lexer-based processing with ANTLR
• antlrObjects: Object/Text mapping for Java with ANTLR
• antlrParser: Processing textual syntax with semantic actions of ANTLR
• antlrTrees: Parsing text to trees and walking them with ANTLR
• gra2mol: Grammar to model transformation with Grammar2Model
• xtext: An XText- and Eclipse-based DSL editor

A 101meta rule to find the ANTLR library

    {
      "basename" : "#^antlr-(.*)\\.jar$#",
      "metadata" : {
        "partOf" : "ANTLR",
        "comment" : "The ANTLR library, Version $1"
      }
    }

• "basename": the “condition” of the rule
• "partOf": the expressed relationship
• "metadata": the assigned metadata
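A rule of this shape could be evaluated roughly as in the following Python sketch. It is not the 101meta implementation. It assumes that a value wrapped in `#...#` is a regular expression, that any other value is compared literally, and that `$1` in the comment stands for the first capture group; the function names are hypothetical.

```python
import re
from typing import Optional

# The rule from the slide, written as a Python dict.
ANTLR_LIBRARY_RULE = {
    "basename": "#^antlr-(.*)\\.jar$#",
    "metadata": {"partOf": "ANTLR", "comment": "The ANTLR library, Version $1"},
}


def match_name(spec: str, name: str) -> Optional[re.Match]:
    """Match a name against a spec.

    Assumption: '#...#' delimits a regular expression; anything else
    is treated as a literal name.
    """
    if len(spec) >= 2 and spec.startswith("#") and spec.endswith("#"):
        return re.match(spec[1:-1], name)
    return re.match(re.escape(spec) + r"$", name)


def apply_basename_rule(rule: dict, basename: str) -> Optional[dict]:
    """Return the rule's metadata if the basename matches, else None."""
    m = match_name(rule["basename"], basename)
    if m is None:
        return None
    metadata = dict(rule["metadata"])
    if "comment" in metadata:
        # Assumption: $1, $2, ... refer to capture groups of the condition.
        metadata["comment"] = re.sub(
            r"\$(\d+)", lambda g: m.group(int(g.group(1))), metadata["comment"]
        )
    return metadata
```

For a file named `antlr-3.2.jar` the sketch yields the `partOf` relationship together with a comment that carries the captured version string; for an unrelated JAR it yields nothing.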

A 101meta rule to find an ANTLR grammar

    {
      "suffix" : ".g",
      "metadata" : {
        "inputOf" : "ANTLR",
        "comment" : "An ANTLR grammar"
      }
    }

• "suffix": the “condition” of the rule
• "inputOf": the expressed relationship
• "metadata": the assigned metadata

A 101meta rule to find a generated parser

    {
      "basename" : "#^.*Parser\\.java$#",
      "content" : "// \\$ANTLR.*\\.g",
      "metadata" : [
        {
          "outputOf" : "ANTLR",
          "comment" : "An ANTLR-generated parser"
        },
        { "concept" : "Parser" }
      ]
    }

• "content": condition on the text of the file
• { "concept" : "Parser" }: metadata for a software concept
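The combination of a name condition and a content condition might be checked as sketched below in Python. The sketch assumes that both conditions must hold, that `content` is a regular expression searched anywhere in the file text, and that `metadata` may be a single unit or a list of units. The sample header line is made up for illustration, and the function names are hypothetical.

```python
import re

# The rule from the slide, written as a Python dict.
PARSER_RULE = {
    "basename": "#^.*Parser\\.java$#",
    "content": "// \\$ANTLR.*\\.g",
    "metadata": [
        {"outputOf": "ANTLR", "comment": "An ANTLR-generated parser"},
        {"concept": "Parser"},
    ],
}


def rule_applies(rule: dict, basename: str, text: str) -> bool:
    """Check the name condition and the content condition (assumed conjunction)."""
    name_regex = rule["basename"][1:-1]  # drop the '#' delimiters
    if re.match(name_regex, basename) is None:
        return False
    # Assumption: the content pattern may match anywhere in the file.
    return re.search(rule["content"], text) is not None


def metadata_units(rule: dict) -> list:
    """Normalize the metadata of a rule to a list of units."""
    metadata = rule["metadata"]
    return metadata if isinstance(metadata, list) else [metadata]


# Hypothetical file contents for illustration only.
GENERATED = "// $ANTLR 3.2 Company.g\npackage org.softlang.parser;\n"
HANDWRITTEN = "package org.softlang.parser;\npublic class MyParser {}\n"
```

With these samples, only a file that is both named like a parser and carries the generator's marker comment receives the two metadata units.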

A rule to find a source importing ANTLR

    {
      "suffix" : ".java",
      "predicate" : "technologies/Java_platform/javaImport.sh",
      "args" : [ "org.antlr.runtime" ],
      "metadata" : {
        "dependsOn" : "ANTLR",
        "comment" : "A source that imports ANTLR"
      }
    }

• "predicate": programmatic condition
• "args": Java package to be tested for
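The slides do not show what `javaImport.sh` does internally or how it is invoked. As a hypothetical stand-in, the following Python sketch shows one way such a programmatic condition could test whether a Java source imports a given package. It is purely textual, so an import statement inside a comment would also count.

```python
import re


def imports_package(java_source: str, package: str) -> bool:
    """Return True if the source has an import from the given package.

    Hypothetical stand-in for a predicate; it covers plain, static, and
    wildcard imports, including imports from subpackages.
    """
    pattern = re.compile(
        r"^\s*import\s+(?:static\s+)?"
        + re.escape(package)
        + r"(?:\.[\w*]+)+\s*;",
        re.MULTILINE,
    )
    return pattern.search(java_source) is not None


# Hypothetical sources for illustration only.
WITH_ANTLR = "package p;\nimport org.antlr.runtime.*;\nclass C {}\n"
WITHOUT_ANTLR = "package p;\nimport java.util.List;\nclass C {}\n"
```

A predicate keeps the rule language small: anything that cannot be expressed as a name or content pattern is delegated to a program that receives the arguments listed in the rule.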

Metadata management


• Metadata declaration with rules, as demonstrated
• Metadata assignment with a rule engine
• Metadata exploration with a browser (as seen in a second)

Metadata assignment

101worker continuously walks over 101repo to construct data about which files match which 101meta rules. In effect, each file is associated with its matches.

http://data.101companies.org/resources/contributions/antlrObjects/org/softlang/parser/CompanyParser.java.summary.json
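The walk over a repository can be sketched in Python as below. This is an illustration under stated assumptions, not the 101worker implementation: only name-based conditions (`suffix`, `basename`) are handled, the result is a plain mapping from relative file path to matched metadata, and the actual layout of the `.summary.json` files is not reproduced. `RULES`, `assign_metadata`, and `demo` are hypothetical names.

```python
import os
import re
import tempfile

# Two of the rules shown earlier, written as Python dicts.
RULES = [
    {"basename": "#^antlr-(.*)\\.jar$#", "metadata": {"partOf": "ANTLR"}},
    {"suffix": ".g", "metadata": {"inputOf": "ANTLR"}},
]


def rule_matches(rule: dict, basename: str) -> bool:
    """Check the name-based conditions of a rule (all must hold)."""
    if "suffix" in rule and not basename.endswith(rule["suffix"]):
        return False
    if "basename" in rule:
        spec = rule["basename"]
        if len(spec) >= 2 and spec.startswith("#") and spec.endswith("#"):
            if re.match(spec[1:-1], basename) is None:
                return False
        elif spec != basename:
            return False
    return True


def assign_metadata(root: str, rules: list) -> dict:
    """Walk the tree under root and collect the metadata of matching rules."""
    matches = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            units = [r["metadata"] for r in rules if rule_matches(r, name)]
            if units:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                matches[rel.replace(os.sep, "/")] = units
    return matches


def demo() -> dict:
    """Build a tiny throwaway tree and run the rules over it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "lib"))
        for rel in ("Company.g", "README.txt", os.path.join("lib", "antlr-3.2.jar")):
            with open(os.path.join(root, rel), "w") as handle:
                handle.write("")
        return assign_metadata(root, RULES)
```

Calling `demo()` associates the grammar and the library JAR with their metadata and leaves the README without any match.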

Metadata exploration

Disclaimer: This is not even beta.


Platforms

HackageDB http://hackage.haskell.org


The Apache Software Foundation http://projects.apache.org/

What’s that?

PyPI - the Python Package Index https://pypi.python.org/

Maven Central Repository http://search.maven.org/

What’s that?

RubyGems http://rubygems.org/

Microsoft Developer Network http://msdn.microsoft.com/

Thanks! Questions?
