
UG3 Compiling Techniques Coursework
January 2015

About this practical

This handout provides you with the details of your coursework for the UG3 Compiling Techniques course. The coursework comprises two individually marked parts that are weighted 40% and 60%, respectively, and together they count for 25% of the marks for the course.

Both of the exercises are individual exercises, not to be undertaken in a group. Discussions, however, with your fellow students regarding technical approaches to individual problems of the coursework as well as infrastructure-related questions are highly encouraged.

Description of this practical

Your task is to write a compiler for a simple subset of C, small-C. It is compulsory to use ANTLR and ASM for generating the compiler. In the first stage the compiler must check that a source program is well-formed and in either case terminate gracefully. For the second (final) stage it should generate code (bytecode) for the Java Virtual Machine (JVM). The generated code should be reasonably short to solve the problem. Note that some legitimate C programs will not be legitimate small-C programs.

The languages involved

The implementation language must be Java. The syntax of the subset of C you must handle, small-C, is given in the Appendix. One of your first tasks is to translate the syntax diagrams to ANTLR input. In doing so, you must be careful not to change the language. You must also handle C comments.

small-C is simply typed: it only has variables of type integer or character. A string is a (possibly empty) sequence of characters (excluding newline) between double quotes, e.g. "hello world!". A character is a C character of the form 'c' where c is a single keyboard stroke, or \n, \t, \\, \' or EOF (end-of-file). You don't have to handle all possible C character definitions, for example '\045'. A number is a non-empty sequence of digits.
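The string, character and number definitions above can be written down as regular expressions before being translated to ANTLR lexer rules. The following is only a sketch for cross-checking test lexemes: the class SmallCTokens and its method names are hypothetical, the string pattern assumes that no double quote may appear inside a string, and EOF is assumed to be written as a bare keyword, which is one possible reading of the handout.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the handout: plain Java regular
// expressions mirroring the small-C lexical classes described above.
final class SmallCTokens {
    // A possibly empty run of non-newline characters between double quotes.
    // Assumption: a double quote cannot appear inside the string.
    static final Pattern STRING = Pattern.compile("\"[^\"\\n]*\"");

    // 'c' with a single ordinary character, or one of the escapes
    // \n, \t, \\, \'. Assumption: EOF is accepted as a bare keyword.
    static final Pattern CHARACTER =
        Pattern.compile("'(?:\\\\[nt\\\\']|[^'\\\\\\n])'|EOF");

    // A non-empty sequence of digits.
    static final Pattern NUMBER = Pattern.compile("[0-9]+");

    // True only if the whole lexeme matches the pattern.
    static boolean matches(Pattern p, String lexeme) {
        return p.matcher(lexeme).matches();
    }

    public static void main(String[] args) {
        // Print the classification of a few sample lexemes.
        System.out.println(matches(STRING, "\"hello world!\""));
        System.out.println(matches(CHARACTER, "'\\n'"));
        System.out.println(matches(CHARACTER, "'\\045'"));
        System.out.println(matches(NUMBER, "2015"));
    }
}
```

A lexeme such as '\045' is rejected by this sketch, which matches the remark that not all C character definitions have to be handled.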
An ident is a non-empty sequence of characters [a-zA-Z0-9_] beginning with an alphabetic character.

A variable must be declared (exactly once) before it is used. All variables get the initial value 0 or '\000'.

The binary subtraction, addition, multiplication, division and modulus all associate to the left. So 5 - 3 - 2 evaluates to 0, not 4. Unlike usual C, they can only be applied to integers, not characters.

Conditions in while and if statements are interpreted as follows: 0 stands for false, any other value stands for true. An else binds to the closest previous unbound then.

The #include lines should be ignored by your compiler; they are included so that the usual C compiler will also work on the input. The read and readc procedures read an integer and character respectively, while output and outputc output an integer and character respectively.

You must handle non-recursive procedures and functions. You should check that each variable or procedure or function is defined before it is called.

Infrastructure

• ANTLR http://www.antlr.org
• ASM http://asm.ow2.org/

Useful references

• The Java Virtual Machine Specification http://java.sun.com/docs/books/vmspec/
• ANTLR Reference Manual http://www.antlr.org/doc/index.html
• ASM User Guide http://download.forge.objectweb.org/asm/asm-guide.pdf
• Wikipedia Entry for Small-C http://en.wikipedia.org/wiki/Small-C
• Wikipedia Entry for Bytecode http://en.wikipedia.org/wiki/Bytecode
• Apache ANT Manual http://ant.apache.org/manual/

Part 1: Front end (40%)

The first part of your coursework is to develop a front end of a compiler for the small-C language. Initially, you will develop a lexer and parser for the small-C language based on the ANTLR lexer and parser generation tool. A specification of the small-C language can be found in the appendix of this document. In a second step, you will extend your parser specification with action rules for the construction of an abstract syntax tree suitable for further processing. Finally, the AST will be written out to a plain text file (or any other human-readable format suitable for representation of graphs) enabling you to inspect the AST for any given small-C program.

Your front end should accept correct small-C programs and construct an abstract syntax tree before this tree is written to a file. Furthermore, your front end should reject incorrect small-C programs (lexical and syntactical checking) and provide the user with meaningful error messages before terminating gracefully. Perfecting error recovery, however, is not the main goal of this exercise and you should focus on the correctness of your Small-C grammar implementation and the AST construction.

Hints:

• Focus on the pure lexer and parser before you approach the construction of the AST. Once you have implemented a basic ANTLR specification for the given grammar you can then extend this with the necessary annotations for the construction of the AST.
• By default, ANTLR generates flat abstract syntax trees (= lists as degenerated trees). However, with a few extensions to your grammar file you can get ANTLR to generate proper trees. Make use of this facility.
• Please use ANTLR v.3 or v.4. Don't use any older version of the ANTLR tool.
• It is advisable to use a build tool such as ANT for your project.
• There are many ANTLR grammars available for other programming languages. You are encouraged to read them and use them as templates for your own work!
• Write a couple of small Small-C test programs to exercise the various aspects of your front end. Include correct and incorrect Small-C programs and check whether they are correctly accepted or rejected. Also inspect the generated text file and compare the AST to what you would expect here. Submit your test programs along with your other code.
• Document your code.
• State what you have provided in a separate README file. This should include all features that you have implemented, tests that you have performed and features that you have found to be incomplete/incorrect, but haven't managed to fix.

Part 2: Back end (60%)

The second part of your coursework is to develop a back end of a compiler for the Small-C language targeting the Java Virtual Machine. Initially, you will need to perform some semantic checks on the abstract syntax tree generated by your front end developed in part 1. Semantic analysis is dependent on context information, hence you will need to develop a symbol table storing information about variables, functions and types.

In the second stage of this part of the practical, you will generate JVM bytecode. You do not need to deal with low-level issues in bytecode generation, but will make use of the ASM library. Essentially, this library will provide you with a high-level API to bytecode generation. Traversing the abstract syntax tree you will generate (simple) code for each visited node using the information stored in the symbol tables.

Hints:

• If you prefer not to re-use your front end developed in part 1 you will be provided with a binary version of a front end for the Small-C language.
• It is advisable to maintain separate symbol tables for variables, functions, and types.
• It is sufficient to check the following properties during the semantic analysis stage: identifier declared before it is used, compatibility of types in expressions, legal destination of assignments.
• If in doubt about how to translate a Small-C construct to JVM bytecode, write a small Java program that contains this feature, compile it and inspect the generated bytecode. The javap Java Class File Disassembler is very useful for this task!
• Read the ASM manual. Use existing examples as templates for your own work.
• All your generated code for a Small-C program can go into a single (Java) class. Each Small-C function corresponds to a (static) Java method. The Small-C int and char types map to the corresponding Java types.
• As before, document all your code.
• Similarly, provide a README file stating what you have implemented and document the tests that you have performed. Include your test files.

Deadlines and submission

The deadline for completion of part 1 of the practical exercise is Friday, February 13th, 2015 at 4:00pm, and the deadline for completion of part 2 is Friday, March 20th, 2015 at 4:00pm.

Please submit your source code (and the ANTLR specification for the small-C parser) using the submit command, e.g.

submit ct cw1 .... (for part 1)
submit ct cw2 .... (for part 2)

Additional information on the electronic submission system can be found on the Informatics DICE machines using the man submit command.

Please submit only source code: no compiled class files.
There is no need to submit the files generated by the ANTLR lexer and parser generator. Don't tar/zip/... your code before submission; the submit command allows you to submit an entire directory.

Assessment procedure

Your Java submissions will be tested by being compiled and executed. A submission which uses proprietary (e.g. Microsoft) Java extensions and fails to compile with a version of the JDK will lose credit. Your submission will be assessed on the correctness and clarity of your Java code and use of the tools infrastructure. You should follow good object-oriented programming practice by encapsulating information where it is appropriate to do so and providing a well-defined interface for other application programmers. Your ANTLR grammar specification will be assessed on its correctness and clarity.

Queries and clarification

If you have any questions or uncertainties about this practical exercise please contact your lecturer, Christophe Dubach (christophe.dubach@ed.ac.uk) or Björn Franke (bfranke@inf.ed.ac.uk, IF 1.04), by email or in person.