<<

Public Defence, January 18, 2013

Contributions to the Construction of Extensible Semantic Editors

Emma Söderberg

Doctoral Dissertation, 2012

Department of Computer Science Lund University 2/20 Orientation Today’s programming editors

Text editor Semantic editor

gedit, Notepad, .. Eclipse, NetBeans, ...

Error feedback Background: Code browsing Refactoring - Programming: From text to semantic editor Name completion - Not all languages have semantic editors ... Problems: - Construction is time-consuming and complex - Maintenance may be difficult (extensions) Challenge: How can we make it easier to construct and maintain semantic editors? 3/20 Approach Generate services from specification

Specification: RAGs

- Formalism, specify semantics Services - Declarative, easily modularized - JastAddJ, JModelica, JastAdd

RAGs – Reference Attribute Grammars

- Grammar – defines program model, Program abstract syntax tree (AST) Model Programmer - Attributes – computed properties of AST nodes (types, scopes, ..) - References: Attribute values

This dissertation: Extend compiler with editor module 4/20 Contributions

Paper I

Paper II Specification Paper III

Paper IV Robustness Paper V

Performance Paper VI Paper VII 5/20 Contributions

Paper I

Paper II Specification Paper III

Paper IV Robustness Paper V

Performance Paper VI Paper VII 6/20 Specification The JastAdd Editor Framework Framework Problem: Browsing Completion How to add an editor to an existing compiler? Errors Views Approach: - Editor Framework - JastAdd: semantics - Eclipse: graphical components - Predefined generic services Compiler Errors Views

- RAG-based compiler extension Browsing Completion

Results: Editor - Two demonstrators (size in LOC): - JastAdd: compiler 29, 200, editor 4, 300 (1, 100) - PicoJava: compiler 210, editor 600 (420)

Conclusions - Modularly defined editor services - Compiler reuse (name analysis, type analysis, ...) [Paper I] 7/20 Specification

Example Service: a = 0; Here "a = 0" is a Dead assignments b = 0; dead assignment Problem: a = b + 5; because the value a = a + b; is not used and it How to specify flow analysis services? return a; could be removed.

Liveness (textbook) RAGs

Let n be a node and succ[n] the syn Set CFGNode.in() circular [empty()] = set of successors for the node n: use().union(out().compl(def())); in[n] = use[n] ∪ (out[n] \ def[n]) coll Set CFGNode.out() circular [empty()] with add; S Stmt contributes in() to CFGNode.out() for each pred(); out[n] = s∈succ[n] in[s] Expr contributes in() to CFGNode.out() for each pred();

JastAddJ Conclusions: 1.4 Java 1.5 10,400 LOC 5,000 LOC - Textbook-like definitions - Flow analysis added modularly ControlFlow 1.4 ControlFlow 1.5 with few LOC. 450 LOC 20 LOC - Precision/performance on par DataFlow with Soot. 60 LOC DeadAssign [ ] 25 LOC Paper II 8/20 Contributions

Paper I Editor framework Paper II Flow analysis Specification Paper III

Paper IV Robustness Paper V

Performance Paper VI Paper VII 9/20 Contributions

Paper I Editor framework Paper II Flow analysis Specification Paper III

Paper IV Robustness Paper V

Performance Paper VI Paper VII 10/20 Performance Faster evaluation from scratch Reference Attributes a = b. b = d e = .. c = d d = e

Attribute Instance Background: At attribute evaluation Call Graphs no caching - attribute dependencies → call graph I a b - no caching - multiple evaluations → very slow I I I - full caching - at most one evaluation → faster I I

cI dI eI Problem: Memory/performance costs II IIII I I I New idea: Selective caching aII bII I - based on profiling full caching I - skip caching of some attributes a b I I I Results: 20% speedup and 38% memory reduction I I - Compared to full caching cI dI eI I I - JastAddJ Java compiler I I - Java benchmarks I aII bII I [Paper III] 11/20 Performance Incremental evaluation

Problem: How to efficiently update the program model after edits? Edits State of the art:

if(c<10 - Hand-coded solutions, complex, error-prone c++; } for(int Challenge: c += Program - Automatically update model, RAGs Editor Feedback Model Earlier work: - Optimal automatic updates for AGs - No handling of references

New results: - Dynamic algorithm for RAGs. - Build dynamic dependency graph during evaluation - Use graph to uncache affected attributes after edits [Paper IV] 12/20 Performance Example Incremental evaluation comparison int a;b int c; a = 42;

Progused Prog

Block org Block org Single used Single used used upt BlockSt org BlockSt upt upt org Block used Blockll[a] upt org Comb ll[a] Comb ll[a] upt org upt ll[a] Comb org Assign upt ll[a] Comb Assign org used ll[a] Singleupt DeclSt IdUse Single DeclSt IdUse dec ll[a] DeclSt Decl DeclSt Decl "c" used "c" type dec ll[a] Decl Type type Decl Type decl "a" "a" "a" lo[a] type type Type Type "a" type "int" type "int" "int" AGs "int" RAGs

Legend edited AST edge Name AST node affected Attribute dep. name Attribute instance examined Attribute dep. skipped (examined) ".." Token with value ".." [Paper V] 13/20 Contributions

Paper I Editor framework Paper II Flow analysis Specification Paper III Efficient caching Paper IV Robustness Paper V Incremental Performance evaluation Paper VI Paper VII 14/20 Contributions

Paper I Editor framework Paper II Flow analysis Specification Paper III Efficient caching Paper IV Robustness Paper V Incremental Performance evaluation Paper VI Paper VII 15/20 Robustness How to handle erronoues input?

Editor Program Model Problem: class A{ Outline Scope errors cause void m() { x int a Parser recovery to fail int b; Folding }

Idea: Error Recovery: Err Completion - Use layout for recovery Error Productions Automatic Recovery - Aid existing recovery with preprocessor class C{ Island grammars: void m() { - Islands: Interesting New Algorithm: // ... - Water: Uninteresting } Bridge parsing void n() { Bridge Parsing // ... - Reefs: Patterns for recovery } } - Bridges – Scopes

0 class.. { 1 void.. { 2 //.. 1 } 1 void.. { 2 //.. 1 } 0 } [Paper VI] 16/20 Robustness Bridge Parsing Algorithm

[Paper VI] 17/20 Robustness Results from adding bridge parsing

Antlr – a well-known LL-based parser generator

250

plain ANTLR 200 with Bridge Parsing

150

100

50 Distance to ideal AST 0

0 10 20 30 40 Test cases

[Paper VI] 18/20 Robustness Bridge Parsing ideas in JSGLR

Collaboration: TU Delft

Problem: Context Free Grammars Provide error recovery for Scannerless GLR (SGLR)

SGLR: LL LR GLR - Generalized LR: Arbitrary CFG - Scannerless – include tokens in the grammar - Language composition, e.g., Java-SQL and enum - JSGLR – implementation of SGLR in Java Recovery Quality

Method: Eclipse/Java Recovery using island grammars, JSGLR layout and bridge parsing 0 20 40 60 80 100 Results: Excellent % of Tests Good Recovery quality on par with the Eclipse Java parser. Poor Failed [Paper VII] Bridge Parser Parts 19/20 Contributions

Paper I Editor framework Paper II Flow analysis Specification Paper III Efficient caching Paper IV Robustness Paper V Incremental Performance evaluation Paper VI Paper VII Bridge parsing Layout-sensitive recovery 20/20 Contributions

Paper I Editor framework Paper II Flow analysis Specification Paper III Efficient caching Paper IV Robustness Paper V Incremental Performance evaluation Paper VI Paper VII Bridge parsing Layout-sensitive recovery