QGE - an Attribute Grammar Based System to Assess Grammars Quality
Universidade do Minho
Escola de Engenharia
Departamento de Informática
Master Course in Computing Engineering

João Carlos Alves Cruz

QGE - An Attribute Grammar based System to assess Grammars Quality

Master dissertation
Supervised by: Pedro Rangel Henriques
Co-supervised by: Daniela da Cruz

Braga, October 27, 2015

PARECER (SUPERVISORS' STATEMENT)

This statement declares that the student João Carlos Alves Cruz has completed, as expected, the writing of his dissertation report. The document was reviewed by the supervisors, who attest to its scientific validity and to the fulfilment of the objectives set for this stage. It is further stated that the master's activities of the student João Carlos Alves Cruz are proceeding within the initially planned schedule and deadlines.

SUPERVISOR: Pedro Rangel Henriques
CO-SUPERVISOR: Daniela da Cruz

ACKNOWLEDGEMENTS

ABSTRACT

The development and support of a software system is a complex task; software engineering is therefore responsible for supplying a collection of techniques and scientific methods to cope with that complexity. Analysis, modeling/specification, design, and maintenance, among others, are activities included in this field of engineering.

This master's work proposes an approach for applying these techniques and methods to languages, more precisely, to grammars. Emphasis is put on the grammar development, implementation, and maintenance process, aiming for better results, in terms of efficiency and quality, concerning both the generated language and its processor.
For a given grammar, gaps were identified in the study of its quality, more specifically in terms of ease of learning (reading and understanding), ease of derivation and maintenance, and efficiency, both in recognizing the sentences of the language and in generating the language processor. A set of metrics is proposed to overcome those gaps and improve the quality of grammars. This process is based on: i) selecting a set of well-defined software metrics for the quantitative measurement of the quality of grammars; ii) deciding how each metric relates to the characteristics that affect the quality of a grammar.

The solution presented here aims at developing a tool, based on Attribute Grammars, to assess the quality of grammars automatically. Implementing the grammar quantification process makes it possible to obtain and discuss results for a given attribute grammar, and also to draw conclusions about the language defined by that grammar, owing to the symbiotic relationship between the two concepts. Another important aspect, once the results of this process have been read, is the direct manipulation of grammars, turning them into a higher-quality software product validated by well-defined software engineering techniques.

Keywords: grammar engineering, attribute grammars, software metrics, software quality and maintenance

RESUMO

The construction and support of a software system is a complex process, so software engineering is responsible for providing a collection of techniques and scientific methods suited to that complexity. It includes activities such as analysis, modeling/specification, design, implementation, and maintenance, among others.
This master's thesis proposes an approach for applying these techniques and methods to languages, more precisely to grammars, in a branch of software engineering known as Grammar Engineering. The proposal focuses on the process of developing, implementing, and maintaining grammars, with the aim of improving results in terms of efficiency and quality, for both the language defined by the grammar and its processor.

A survey of the grammar engineering field identified gaps in studies of grammar quality, more specifically in terms of ease of learning (reading and understanding), ease of derivation and maintenance, and efficiency, both in recognizing sentences of the language and in generating a processor for that language. A solution is proposed to overcome those gaps through an evaluation process based on rigorous measurements of certain parameters. This process is based on: i) the selection of a set of well-defined software metrics for the quantitative assessment of grammar quality; ii) the choice of how each metric relates to the characteristics that influence the quality of a grammar.

The solution presented here aims at developing a tool, based on Attribute Grammars, to assess the quality of grammars automatically. Implementing the quantitative grammar evaluation process makes it possible to obtain and discuss results for a given attribute grammar, and also to draw conclusions about the language defined by that grammar, owing to the symbiotic relationship between the two concepts.
Another important aspect, once the results of this process have been read, is the direct manipulation of grammars, turning them into a higher-quality software product validated by well-defined software engineering techniques.

Keywords: grammar engineering, attribute grammars, software metrics, software quality

CONTENTS

1 INTRODUCTION
  1.1 Grammar Engineering
  1.2 Motivation
  1.3 Aims/Objectives
  1.4 Document Structure
2 STATE OF THE ART
  2.1 The art of Measurement
    2.1.1 Measurement in Software Engineering
    2.1.2 The Scope of Metrics
    2.1.3 The Theory behind measurement
    2.1.4 A framework for software measurement
  2.2 Quality of Languages
  2.3 Quality of Grammars
    2.3.1 Formal Grammar Definitions
    2.3.2 Assessing Grammar Quality
    2.3.3 Metrics for Grammars
3 THE PROBLEM AND ITS CHALLENGES
  3.1 Tool Description
4 GQE - GRAMMAR QUALITY EVALUATOR
  4.1 A Tool for Metric Evaluation
  4.2 GQE Architecture
  4.3 Grammar Objects
    4.3.1 Context-Free Grammar
    4.3.2 Attribute Grammar
  4.4 ANTLR Linking Component
  4.5 Splitter Expander
  4.6 GQE Interface
  4.7 Execution Flow
5 GQE MAIN RESULTS
  5.1 CFG analysis: Lisp Language
  5.2 AG analysis: Lisp Language
6 CONCLUSIONS
  6.1 Conclusions
  6.2 Future work
A GRAMMAR OF ANTLR META-LANGUAGE
B THE ANTLR TOOL
  B.1 Generated Code
  B.2 ANTLR Grammar Format
    B.2.1 Lexicon
    B.2.2 Structure
    B.2.3 Parser Rules
    B.2.4 Actions and Attributes
    B.2.5 Lexer Rules
C NATURAL LANGUAGES TOOLS
  C.1 Perl Module Lingua::IdSplitter
  C.2 WordNet
    C.2.1 Relations
    C.2.2 Options
  C.3 Perl Module WordNet::QueryData

LIST OF FIGURES

Figure 1 Measurement for promotion
Figure 2 Example aspects of a productivity model
Figure 3 The ISO 9126-1 software quality model
Figure 4 Empirical relations for the attribute "height"
Figure 5 Derivation Tree of a production
Figure 6 Dependency Graph between Symbols of a grammar that specifies List language
Figure 7 Activity Diagram
Figure 8 Grammar Quality Evaluator Logo
Figure 9 Structural diagram with the four main components of GQE.
Figure 10 Structural diagram of the Grammar Objects component.
Figure 11 Data structure for productions of the grammar from Example 1.
Figure 12 Structural diagram of the ANTLR linking component
Figure 13 Structural diagram of the SplitterExpander component.