
International Journal of Pure and Applied Mathematics Volume 117 No. 17 2017, 15-21 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu Special Issue ijpam.eu

LLVM COMPILER

R. Bansal 1, Jasmine Norman 2, Mangayarkarasi R 3, Vanitha M 4, Chandramouliswaran S 5
1,2,3,4,5 School of Information Technology and Engineering, VIT University, Vellore, [email protected]

Abstract: The compiler is aimed at providing native compilation for micro PCs such as the Raspberry Pi, Arduino and various other IoT-related micro PCs that are yet to make a mark in the market, like the new Android Things project. The tools we will be using are C/C++ based. LLVM itself is written in C++, and our language will stick to this same pattern, since the conveniences of OOP and the STL (C++'s standard library) make for fewer lines of code. In addition to C, both Lex and Bison have their own syntax. The grammar of the language will be kept very similar to C/C++, thus making it feasible for new devices.

1. Introduction

The Low Level Virtual Machine, also known as the LLVM compiler, is mainly used to build compiler front ends and back ends. Basically, the LLVM compiler is used to optimize programs that can be written in different languages. LLVM itself is written in C++. The idea behind LLVM is program optimization at every stage, from compile time to idle time. LLVM provides a fully language-independent platform. Many different languages can be used for different tasks, and there is no need to worry about which language is used, which is why the whole workflow becomes smooth and flexible when LLVM is used. The input programs or different kinds of systems can be written in any language that has an LLVM front end. This flexibility is a benefit because the whole environment becomes free from language-related obstacles and can be set up in all kinds of working situations. LLVM lets programs run in the desired way, so that the code runs smoothly and in a correctly optimized manner. That is why it gives good results for program execution, by working on the optimization of programs from compile time to run time.

2. Literature Survey

The LLVM system architecture [1] is designed to address these issues in traditional systems. Briefly, the static compilers in the LLVM system compile the source code down to a low-level representation that includes high-level information: the LLVM Virtual Instruction Set [2]. This enables the static compiler to perform substantial optimizations at compile time, while still communicating high-level information to the linker. At link time, the program is combined into a single unit of LLVM virtual instruction set code, and is interprocedurally optimized [3]. Once the program has been completely optimized, machine code is generated and a native executable is produced. This executable is native machine code, yet it also includes a copy of the program's LLVM bytecode for later stages of optimization.

The LLVM run-time optimizer simply monitors the execution of the running program, gathering profile information. When the run-time optimizer determines that it can improve the performance of the program through a transformation, it may do so through two routes: either direct modification of the already optimized machine code or new code generation from the attached LLVM bytecode [4]. In either case, the LLVM bytecode provides important high-level control flow, data flow, and type information that is useful for aggressive run-time optimizations.

Some transformations are too expensive to perform at run time, even given an efficient representation to work with. For these transformations, the run-time optimizer gathers profile information, serializing it to disk. When idle time is detected on the user's computer, an offline reoptimizer is used to perform aggressive profile-driven optimizations on the application. The offline optimizer is identical in power to the link-time optimizer. The difference is that the offline optimizer uses profile and interprocedural analysis information to improve the program, whereas the link-time optimizer must make do without profile information.

Note that this system gathers profile information in the field, which gives the most accurate information possible, and does not interfere with the development process at all [5]. The use of the LLVM virtual instruction set allows work to be offloaded from link time to compile time, speeding up incremental recompilations. Additionally, because all of the components operate on the same representation, they can share implementations of transformations.

3. Proposed Model

At a high, abstract level, a compiler is a combination of three to four main components that pass data from one component to the next in a pipeline fashion. This paper describes a compiler being designed for a new language that will have a grammatical structure similar to popular languages such as C/C++. The figure above shows the compiler pipeline and also lists the tools that we will be using to create the compiler for our language.

In summary, the compiler design involves 3 steps. We will dive deeper into these later in the paper:
a. Lexical Analysis with Lex/Flex: This mainly translates to splitting the input data into a set of tokens (numbers, brackets, braces, keywords, identifiers etc.) that a compiler can understand, as is done with every other language.
b. Semantic Parsing with Yacc/Bison: In this step we will generate an AST [9] while parsing the tokens and feed it to Yacc/Bison.
c. Assembly using LLVM: This is where LLVM comes in, and we will be using it to generate the machine/byte code for every node that we have.

The grammar/syntax for our language will be kept similar to the C language because of its familiarity in the developer world.

3.1 Lexical Analysis

In this step all we will be doing is breaking down our input statements into a set of known list items that we have established as the tokens for our language/compiler. For the first few iterations and versions of the language the number of tokens has been kept to a minimum: numbers (floats and integers), parentheses, mathematical operators and braces. The definition of the tokens is given below. In this first step, essentially all we are doing is defining the tokens that will be accepted by the compiler (tokens make up the syntactical structure of the language). In addition to that, we will also be ignoring blank spaces and other whitespace using the tokens file.
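The tokenizing step can be illustrated with a small hand-written scanner. This is only a sketch under stated assumptions: the paper generates its scanner with Lex/Flex from a tokens file, whereas the code below is plain C++ written by hand, and the names `TokenKind`, `Token` and `tokenize` are hypothetical. It covers the minimal token set listed above (integers, floats, parentheses, braces and mathematical operators) and skips whitespace.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Token kinds for the minimal token set (hypothetical names).
enum class TokenKind { Integer, Float, Operator, LParen, RParen, LBrace, RBrace, Unknown };

struct Token {
    TokenKind kind;
    std::string text;
};

// Helper: true if the character at position i is a decimal digit.
static bool isDigitAt(const std::string& s, std::size_t i) {
    return i < s.size() && std::isdigit(static_cast<unsigned char>(s[i])) != 0;
}

// Split the input into tokens, ignoring blanks, tabs and newlines.
std::vector<Token> tokenize(const std::string& input) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < input.size()) {
        if (std::isspace(static_cast<unsigned char>(input[i]))) {
            ++i;  // whitespace never produces a token
            continue;
        }
        if (isDigitAt(input, i)) {
            // Digits, optionally followed by '.' and more digits.
            const std::size_t start = i;
            bool isFloat = false;
            while (isDigitAt(input, i)) ++i;
            if (i < input.size() && input[i] == '.') {
                isFloat = true;
                ++i;
                while (isDigitAt(input, i)) ++i;
            }
            tokens.push_back({isFloat ? TokenKind::Float : TokenKind::Integer,
                              input.substr(start, i - start)});
            continue;
        }
        // Single-character tokens; anything else is reported as Unknown.
        TokenKind kind = TokenKind::Unknown;
        switch (input[i]) {
            case '+': case '-': case '*': case '/': kind = TokenKind::Operator; break;
            case '(': kind = TokenKind::LParen; break;
            case ')': kind = TokenKind::RParen; break;
            case '{': kind = TokenKind::LBrace; break;
            case '}': kind = TokenKind::RBrace; break;
            default: break;
        }
        tokens.push_back({kind, std::string(1, input[i])});
        ++i;
    }
    return tokens;
}
```

Calling `tokenize("(1 + 2.5) * 3")` yields seven tokens, with `2.5` classified as a float and the spaces dropped. A Flex specification would express the same rules declaratively as regular expressions paired with actions.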


3.2 Semantic Parsing

This step mainly consists of defining an Abstract Syntax Tree (AST) [10] for the language and then feeding our AST to Yacc/Bison. The AST that is to be defined for the language will need to contain all sorts of expressions that the language is expected to support, namely variable declarations, method calls, references and method declarations, and represent each of them as a separate node in our Abstract Syntax Tree. Following is a snippet of node.h, giving a brief idea of how an AST is represented in terms of C code.
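A minimal sketch of what such node definitions could look like is given here. It is not the paper's node.h: the class names (`Node`, `NInteger`, `NIdentifier`, `NBinaryOperator`, `NMethodCall`) and the `toString` helper are assumptions made for illustration, and only a few expression kinds are covered. Each kind of expression becomes its own node type, and a tree is formed by nodes owning their children.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Base class for every AST node (hypothetical name).
// toString() renders the subtree as an s-expression for inspection.
struct Node {
    virtual ~Node() = default;
    virtual std::string toString() const = 0;
};

// Integer literal, e.g. 42.
struct NInteger : Node {
    long long value;
    explicit NInteger(long long v) : value(v) {}
    std::string toString() const override { return std::to_string(value); }
};

// Reference to a named variable.
struct NIdentifier : Node {
    std::string name;
    explicit NIdentifier(std::string n) : name(std::move(n)) {}
    std::string toString() const override { return name; }
};

// Binary expression such as lhs + rhs; owns both operands.
struct NBinaryOperator : Node {
    char op;
    std::unique_ptr<Node> lhs, rhs;
    NBinaryOperator(char o, std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {
        assert(lhs && rhs);  // both operands must be present
    }
    std::string toString() const override {
        return "(" + std::string(1, op) + " " + lhs->toString() + " " +
               rhs->toString() + ")";
    }
};

// Method call with a callee name and a list of argument expressions.
struct NMethodCall : Node {
    std::string callee;
    std::vector<std::unique_ptr<Node>> args;
    NMethodCall(std::string c, std::vector<std::unique_ptr<Node>> a)
        : callee(std::move(c)), args(std::move(a)) {}
    std::string toString() const override {
        std::string out = "(call " + callee;
        for (const auto& arg : args) out += " " + arg->toString();
        return out + ")";
    }
};
```

In a Yacc/Bison grammar, the action attached to each rule would construct one of these nodes, so that parsing `print(2 + x)` would produce an `NMethodCall` whose single argument is an `NBinaryOperator`.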


Other than the expressions/nodes showcased in the above snippet, there is also support for the following:
a) Variable declaration
b) Extern declaration
c) Function declaration
d) Return statement

Once the nodes are created we can then feed the nodes and tokens to Bison with the -d flag (since the token declaration and definition are in different places) in order to give a definition to our tokens.

4. Assemble the AST using LLVM

In the final step we will be taking our AST and turning it into byte-code/machine-code, which directly translates to taking every single available semantic node and turning it into the corresponding set of machine-level instructions. That is where LLVM comes to the rescue and helps us with most of our work, since LLVM abstracts most of the actual instructions into something that is similar to an AST. This is also where the codegen method comes into play, and the following is a snippet of how the codegen method is defined for LLVM's usage for NMethodCall (the method-call node).
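The recursive shape of such a codegen method can be sketched without linking against LLVM. The code below is an illustration under stated assumptions, not the paper's implementation: instead of calling the LLVM C++ API it appends LLVM-IR-style text lines to a hypothetical `CodeGenContext`, it treats every value as `i64`, and all class and member names are invented for the example. Each node generates code for its children first and then emits one instruction that uses their results.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Collects emitted instructions and hands out fresh temporaries (hypothetical).
struct CodeGenContext {
    std::vector<std::string> instructions;
    int nextTemp = 0;
    std::string freshTemp() { return "%t" + std::to_string(nextTemp++); }
};

struct Node {
    virtual ~Node() = default;
    // Emits instructions into ctx and returns the operand holding the result.
    virtual std::string codeGen(CodeGenContext& ctx) const = 0;
};

// An integer literal needs no instruction; it is used directly as an operand.
struct NInteger : Node {
    long long value;
    explicit NInteger(long long v) : value(v) {}
    std::string codeGen(CodeGenContext&) const override { return std::to_string(value); }
};

struct NBinaryOperator : Node {
    char op;
    std::unique_ptr<Node> lhs, rhs;
    NBinaryOperator(char o, std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {
        assert(lhs && rhs);
    }
    std::string codeGen(CodeGenContext& ctx) const override {
        const std::string l = lhs->codeGen(ctx);  // children first
        const std::string r = rhs->codeGen(ctx);
        // Any operator other than + - * is treated as signed division here.
        const char* opcode = op == '+' ? "add" : op == '-' ? "sub"
                           : op == '*' ? "mul" : "sdiv";
        const std::string result = ctx.freshTemp();
        ctx.instructions.push_back(result + " = " + opcode + " i64 " + l + ", " + r);
        return result;
    }
};

struct NMethodCall : Node {
    std::string callee;
    std::vector<std::unique_ptr<Node>> args;
    NMethodCall(std::string c, std::vector<std::unique_ptr<Node>> a)
        : callee(std::move(c)), args(std::move(a)) {}
    std::string codeGen(CodeGenContext& ctx) const override {
        // Generate every argument, then emit a single call instruction.
        std::string argList;
        for (const auto& arg : args) {
            if (!argList.empty()) argList += ", ";
            argList += "i64 " + arg->codeGen(ctx);
        }
        const std::string result = ctx.freshTemp();
        ctx.instructions.push_back(result + " = call i64 @" + callee + "(" + argList + ")");
        return result;
    }
};
```

For the call `foo(2 + 3, 4)`, the context ends up holding `%t0 = add i64 2, 3` followed by `%t1 = call i64 @foo(i64 %t0, i64 4)`. An LLVM-backed version would keep the same traversal but return LLVM values and create the instructions through LLVM's IR-building classes rather than strings.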

5. Conclusion and Future Work

In the above we described in detail the tools that were used to create the new language and the features that the compiler and language support. In the future, linking support may be integrated, and the work will also take into account the various processor architectures that LLVM IR has come to support over time, enabling us to support new micro PCs such as the Raspberry Pi, Arduino and Android Things, and other products that may end up being used in IoT.

References

[1] Lattner, Chris, and Vikram Adve. "LLVM: A compilation framework for lifelong program analysis & transformation." Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization. IEEE Computer Society, 2004.

[2] Jeffery, Andrew. "Using the LLVM compiler infrastructure for optimised, asynchronous dynamic translation in QEMU." Master's thesis, University of Adelaide, Australia (2009).

[3] Lattner, Chris. "The LLVM compiler system." Bossa Conference on Open Source, Mobile Internet and Multimedia, Recife, Brazil. 2007.

[4] Lattner, Chris. "LLVM and Clang: Next generation compiler technology." The BSD Conference. 2008.

[5] Lattner, Chris Arthur. LLVM: An infrastructure for multi-stage optimization. Diss. University of Illinois at Urbana-Champaign, 2002.

[6] Zakai, Alon. "Emscripten: an LLVM-to-JavaScript compiler." Proceedings of the ACM international conference companion on Object oriented programming systems languages and applications companion. ACM, 2011.

[7] Tamboli, Teja, Thomas H. Austin, and Mark Stamp. "Metamorphic code generation from LLVM bytecode." Journal of Computer Virology and Hacking Techniques 10.3 (2014): 177-187.

[8] Lattner, Chris, and Vikram Adve. "The LLVM instruction set and compilation strategy." CS Dept., Univ. of Illinois at Urbana-Champaign, Tech. Report UIUCDCS (2002).

[9] Neamtiu, Iulian, Jeffrey S. Foster, and Michael Hicks. "Understanding source code evolution using abstract syntax tree matching." ACM SIGSOFT Software Engineering Notes 30.4 (2005): 1-5.

[10] Wan, Bo, et al. "MCAST: An abstract-syntax-tree based model compiler for circuit simulation." Custom Integrated Circuits Conference, 2003. Proceedings of the IEEE 2003. IEEE, 2003.

[11] P. Maragathavalli, S. Vanathi, "A Study On Enhancement Of System Security In Openflow Structure Utilizing Software Defined Networking", International Innovative Research Journal of Engineering and Technology, vol. 02, no. 04, pp. 125-129, 2017.

[12] T. Padmapriya and V. Saminadan, "Inter-cell Load Balancing technique for multi-class traffic in MIMO-LTE-A Networks", International Journal of Electrical, Electronics and Data Communication (IJEEDC), ISSN: 2320-2084, vol. 3, no. 8, pp. 22-26, Aug 2015.

[13] S. V. Manikanthan and K. Baskaran, "Low Cost VLSI Design Implementation of Sorting Network for ACSFD in Wireless Sensor Network", CiiT International Journal of Programmable Device Circuits and Systems, Print: ISSN 0974-973X & Online: ISSN 0974-9624, Issue: November 2011, PDCS112011008.

[14] Rajesh, M., and J. M. Gnanasekar. "Congestion control in heterogeneous wireless ad hoc network using FRCC." Australian Journal of Basic and Applied Sciences 9.7 (2015): 698-702.
