Feature-Based Comparison of Language Transformation Tools
Total Page:16
File Type:pdf, Size:1020Kb
Feature-Based Comparison of Language Transformation Tools 1Muhammad Ilyas, 2Saad Razzaq, 3Fahad Maqbool, 4Qaiser Abbas, 5Wakeel Ahmad, 6Syed M Adnan 1,2,3,4Department of Computer Science and IT, University of Sargodha, Sargodha, Pakistan 5,6Department of Computer Science, University of Wah, Wah Cantt, Pakistan [email protected], Abstract Code transformation is the best option while switching from farmer to next technology. Our paper presents a comparative analysis of code transformation tools based on 18 different factors. These factors are Classes, pointers, Access Specifiers, Functions and Exceptions, etc. For this purpose, we have selected varyCode, Telerik, Multi-online converter, and InstantVB. Source Language considered for this purpose is C sharp (C#) and the target language is Visual Basics (VB). Results show that VaryCode is best among the four tools as its converted programs throw fewer errors and require minor changes while running the program. Key words: Code transformation, Multiconverter, Stylistic Feature, Code Smell, Author style. more readable for the developer. Due to the 1. INTRODUCTION rapid improvement in programming languages, companies want to migrate their Source to source compilation is a process of software to new updated/upgraded code. converting source code written in one high- Developing code from scratch each time is level language to itself or other high-level difficult because it is a complex and languages [1]. It is a refactoring process that cumbersome activity. The automatic is helpful when programs that have to transformation of the code is always very refactor are outside the control of the favorable and speeds up the work. original implementer. The purpose of Several techniques have been used for transpilation is to convert legacy code to a this purpose such that automatic source to newer version of a particular language. Such source error compensation of floating-point as converting a program from one dialect to programs [17], generating database access another in the same programming language code from domain models [18], and [2, 25]. Translation of code from one to generating pseudo-code from source code another language is also necessary for [19], etc. understanding code according to expertise in a particular language. This makes code much The objectives of feature-based comparison third is Language Conversion that converts are to help in making newer versions of tools from one programming language into efficient and more accurate as well as to another. highlight the areas where a particular tool is There are many purposes for code lagging. For this purpose, we have selected transformation such as performance four source to source compilers: Telerik improvements [11], memory optimization [20], VaryCode [21], Multi-online Converter [12], parallelization, and vectorization. Due [22], and Instant VB [23]. We have chosen to the rapid growth of the Internet, one of the C# as the source language and Visual Basic main reasons for code transformation is the (VB) has the target language. We have migration of a legacy system into a web- focused on eighteen different features for enabled environment [3]. Code comparison, some of these features are transformation [13] also helps advisory tools Classes, Functions, Access Specifiers and that guide the developers to parallelize the Exceptions, etc. The paper is organized as real-world problems. These tools have faced follows: related work is described in Section problems due to the large size of programs 2, feature-based comparison of language and high code complexity. That’s why these transformation tools are presented in Section tools are unable to provide meaningful 3, and the conclusion is described in Section parallelization hint to the developers. Then 4. code transformation overcomes this problem by simplifying the code. 2. RELATED WORK A source to source compilation is a process Cfir Aguston et al. Proposed the approach to of translating a high-level programming overcome the complexity of code issue language to itself or another high-level called Skeletonization which automatically programming language [1]. One of the transforms the complex code into a much purposes of the source-to-source compilation simpler structure. The proposed algorithm is translating legacy code to use the transforms the constructs such as covers upcoming version of the underlying pointers, nested conditional statements, programming language. It is a refactoring nested loops, etc. This algorithm transforms process that is helpful when the programs to pointers into integer indexes and replaces C refactor are outside the control of the struct references with references to arrays. original implementer such as to convert a Source level compiler performs analysis on program from legacy API to the new API, or skeletonized code for parallelizing the code. when the program size makes it impossible The generated code is not equivalent to the to refactor it by hand. original code, it suggests possible parallelization patterns to the developers. Malton [2] defines three conversion tasks: First is Dialect Conversion which converts a Another tool that is concerned with program from one dialect to another in the performance improvements is GPU S2S same programming language. This is used which automatically transforms the C when a new version of the compiler is used. sequential code into CUDA (Compute The second is API Migration used to convert Unified Device Architecture) code [11]. This the program into a new set of APIs. The contains three modules: Directive recognition, Parsing module, and CUDA code generation. GPU S2S takes C code The concept of source-to-source Translator with directives as input and translates it into can be viewed as a generation of Database the CUDA code. Experiments showed that Access code from Domain Models. N. Y. generated code has significant improvements Khelifi et al. presented his Idea on Database as compared to C original code, which leads Access code generation from Domain to performance enhancement. Models. As a source, they used RSL (Requirements Specification Language). The G. Dimitroulakos and C. Lezos et al. process of code generation consists of two presented a source to source compiler main steps. The first step is related to MEMSCOPT for Dynamic code analysis transformation rules having translational and loop transformations and assisting the semantics for domain vocabulary constructs optimization of the memory hierarchy of a of RSL. The second step constitutes of digital hardware system. MEMSCOPT is a MOLA (Model Transformation Language) command-line application and is extended algorithms for implementing by providing a Graphical User Interface transformations. The validity of results is (GUI) in C#. It takes a C code file as input assured by using the framework of the and performs an analysis such that ReDSeeDS (Requirements-Driven Software monitoring the loop. It also performs a Development System) tool suite. The dynamic code translation. MEMSCOPT resultant transformations produced applies 7 types of transformations such that consistent and quality code that can be loop extends, loop shift, loop reversal, loop utilized directly for the implementation of interchange, loop fusion, loop fission, loop the data access layer [18]. normalization, loop reorder, loop switching, loop scope move forward and loop scope A Pseudogene is a tool for converting move backward [12]. pseudo code from Source code using SMT (Statistical Machine Translation), worked on J. Cronsioe et al proposed another source to by H. Fudaba et al. The tool performs the source compiler that performs automatic transformation of Python code to English or transformations such that optimization of Japanese. The proposed tool uses T2SMT loop structures on multi-core platforms. The (Tree-to-String machine translation) method. proposed research is concerned with a Input to this method is a parse tree through scientific application written in FORTRAN which Tokenization and parsing are named BigDFT (Density Functional performed. The tree is broken up for Theory). BigDFT is defined by its heavy use translation using translation patterns for of convolution operators on large arrays. conversion into pseudo-code. Proposed PIPs and BOAST the two S2S applications techniques are one of the best application of have been used with MagicFilter to optimize SMT for translating the code into natural BigDFT. It is found that PIP’s generic language [19]. transformation logic is not always suitable for adapted optimizations there is another B. S. K. Vorobyov et al worked on source to technique BOAST for more suitable source Translator from Datalog to SQL for transformations over multi-core platforms Static Program Analysis. The process starts [15]. with the processing of facts and rules by a lexer and parser. An intermediate representation of the Datalog program is are inserted in this step. In step 2 type constructed. The intermediate representation conversion and data types are analyzed, as a is translated to SQL queries by using simple result, Java incompatible types are removed. syntactic translation schemes. The proposed In step 3 C/C++ code is translated into Java work cannot be matched with the code, compiled with any Java compiler, and performance of industrial-strength tools [16]. verified. Also, it doesn’t support many L. Thévenoux et al presented his work on features such as external libraries, assembly automatic source to source Error code, and goto statements. Experiments