Type-Safe Java Bytecode Modification Library Based on Javassist
TALLINN UNIVERSITY OF TECHNOLOGY
Faculty of Information Technology
Institute of Computer Science
Chair of Theoretical Informatics

Type-safe Java bytecode modification library based on Javassist
Bachelor thesis

Student: Sven Laanela
Student code: IAPB020471
Advisor: Pavel Grigorenko

Tallinn 2015

I declare that this thesis is the result of my own research except as cited in the references. The thesis has not been accepted for any degree and is not concurrently submitted in candidature of any other degree.

Signature
Name: Sven Laanela
Date: May 26, 2015

Abstract

Dynamic (runtime) Java bytecode modification enables various frameworks to greatly enhance Java programs. They can provide logging, caching, dependency injection, object-relational mapping, etc. without cluttering the application source code, or even when the source code is not available. Existing Java bytecode modification libraries do not provide compile-time type safety or IDE-assisted code completion capabilities with regard to the class under modification. This thesis investigates the problem of type-safe Java bytecode modification, building on top of Javassist, a Java bytecode modification library, and taking into consideration the most common Java bytecode modification use cases. Scenarios implemented in Javassist are described, and an alternative approach that implements those scenarios in a type-safe manner is proposed. An implementation library is presented as a proof of concept for the proposed approach to Java bytecode modification. An analysis of the applicability of the approach is provided, outlining both the benefits and the trade-offs when compared to Javassist.

Annotation

Dynamic (i.e. runtime) Java bytecode modification makes it possible to substantially enhance Java applications without cluttering the source code. It allows logging, caching, inter-component dependencies, object-relational configuration and more to be added to a program without changing the application source code, so the code stays focused on solving the core problem. Existing solutions provide neither compile-time type safety based on the class being modified nor automatic code completion in development tools, not even when the class being modified is visible to the compiler (on the classpath). This thesis studies type-safe Java bytecode modification on the basis of Javassist, one of the most widely used Java bytecode modification libraries, and takes the most frequently used Java bytecode transformation use cases as its starting point. These scenarios are presented both on the basis of Javassist and using an alternative approach that preserves compile-time type safety. To demonstrate the applicability of the proposed solution to Java bytecode modification, the thesis also includes a prototype implementation that solves the highlighted use cases. In addition, the benefits and drawbacks of the proposed solution are compared with Javassist.

Contents

1 Introduction
  1.1 Background and motivation
  1.2 Goal
  1.3 Outline of the thesis
2 Prerequisites
  2.1 Type systems
    2.1.1 Advantages of static typing
  2.2 Javassist
    2.2.1 Special identifiers
    2.2.2 Method body manipulation
    2.2.3 Adding new fields and methods
    2.2.4 Method and Constructor instrumentation
    2.2.5 Modifying access modifiers
    2.2.6 Type-safety of Javassist API
  2.3 Java Annotation Processing
    2.3.1 Annotation Processing Rounds
    2.3.2 Available APIs and Capabilities
3 Implementation
  3.1 Mirror class generation
    3.1.1 Namespacing and registry
    3.1.2 Constructor/method/field generation considerations
    3.1.3 Superclasses without a default constructor
    3.1.4 Inner classes
    3.1.5 Synthetic methods/constructors
    3.1.6 Library/class versions
  3.2 Extension class capabilities
    3.2.1 Method modification
    3.2.2 Static methods
    3.2.3 Field modification
    3.2.4 Constructor modification
    3.2.5 Adding new methods
    3.2.6 Adding new fields
    3.2.7 Adding new constructors
    3.2.8 Modifying access modifiers
    3.2.9 Method and Constructor instrumentation
  3.3 Compile-time type safety checking
  3.4 Wiring code generation
    3.4.1 Extension class transformation
    3.4.2 Original class transformation
    3.4.3 Extension class lifecycle
    3.4.4 Field modification
    3.4.5 Method modification
    3.4.6 Constructor modification
    3.4.7 Added interfaces
    3.4.8 Added superclasses
    3.4.9 Added fields
    3.4.10 Added methods
    3.4.11 Added constructors
    3.4.12 Transformation implementation
  3.5 Runtime lookup and execution of the wiring code
4 Analysis
  4.1 Benefits
  4.2 Trade-offs
  4.3 Summary
5 Related work
  5.1 ObjectWeb ASM
  5.2 CGLib
  5.3 AspectJ
  5.4 Apache BCEL
6 Conclusion
  6.1 Future work
Appendix A Patches.java
Appendix B Modify.java
Appendix C Transformer.java
Appendix D TypesafeBytecodeModificationProcessor.java
Appendix E MirrorClassGenerator.java
Appendix F ExtensionClassValidator.java
Appendix G WiringClassGenerator.java
Appendix H cbp.vtl

1 Introduction

1.1 Background and motivation

Java today is not only a mature programming language but also a platform with a well-established ecosystem. It allows developers to build scalable and distributed applications for almost any domain. Java programs are compiled into bytecode, which is executed at runtime by the Java Virtual Machine (JVM). Java bytecode is intended to be platform-independent and secure [1]. The bytecode is the driving force that makes the Java platform rich in libraries, frameworks and various JVM-based languages. Many useful frameworks and libraries rely on dynamic (runtime) bytecode manipulation to enhance and change the behaviour of Java programs.
Some examples are object-relational mapping frameworks (Hibernate, etc.), the aspect-oriented approach for adding logging and caching (AspectJ), dependency injection, application profiling (constructing method execution traces), dynamic code updates and many more. There are a number of bytecode generation and manipulation tools: ASM [2], Javassist [3], BCEL [4], cglib [5].

The present work uses Javassist [3], a Java library for Java class-file bytecode modification. It allows the developer to programmatically change the structure of a class without requiring extensive knowledge of the Java Virtual Machine Specification [6]. The Javassist API includes constructs for manipulating the class structure and hierarchy, for example javassist.CtClass, javassist.CtField and javassist.CtMethod for modifying the class itself or the fields and methods contained therein. The declaration, definition and access modifiers can be altered by invoking methods such as ctField.setModifiers(int modifiers), ctMethod.setBody(String body), ctClass.addField(CtField field) and ctClass.addMethod(CtMethod method) on the Ct* object.
For example, given a class SampleClass with a private field instanceField and a private method instanceMethod(String input) that uses this instance variable, exposed through the public method publicMethod(String input):

    public class SampleClass {
        private final String instanceField = "InstanceField";

        private final String instanceMethod(String input) {
            return input + instanceField;
        }

        public final String publicMethod(String input) {
            return instanceMethod(input);
        }
    }

The Javassist code for getting a handle on the javassist.CtClass object representing this class during runtime would look like the following:

    public static CtClass getClass(String className) throws NotFoundException {
        ClassPool classPool = ClassPool.getDefault();
        return classPool.get(className);
    }

Subsequently, getting a handle on the javassist.CtMethod object representing a method of this class during runtime would look like the following:

    public static CtMethod getMethod(String className, String methodName,
            String... argClassNames) throws NotFoundException {
        CtClass theClass = getClass(className);
        // getClasses is a helper that is not shown here; it is assumed to
        // resolve each class name to its CtClass via getClass.
        CtClass[] paramClasses = getClasses(argClassNames);
        return theClass.getDeclaredMethod(methodName, paramClasses);
    }

And finally, modifying the body of the instanceMethod method to add a null check for the input would be done by:

    public void addNullCheckToInstanceMethod() throws Exception {
        CtMethod ctMethod = JavassistHelper.getMethod(
                SampleClass.class.getName(), "instanceMethod", String.class.getName());
        ctMethod.insertBefore(
                "if($1 == null){" +
                " return null" +
                "}");
    }

Looking over the above examples and running them with Javassist, the following drawbacks of the API become apparent:

First, supplying Java code as a String argument to the body modification methods means that its syntax is checked only at runtime rather than at compile time. For example, the above example is missing a semicolon at the end of the return null statement, which leads to a CannotCompileException being thrown when the addNullCheckToInstanceMethod() method is run.
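
The effect can be reproduced without Javassist. The following is a minimal sketch that uses the JDK's own javax.tools compiler API as a stand-in for Javassist's embedded compiler; the class StringSnippetCheck, its compiles helper and the wrapper class Snippet are hypothetical names introduced only for this illustration. It shows that javac accepts any String literal, so the missing semicolon goes unnoticed until the snippet itself is compiled while the program is running.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.util.List;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

/** Hypothetical demo class; not part of Javassist or of the thesis library. */
final class StringSnippetCheck {

    /** In-memory source file, so that no .java file has to be written to disk. */
    private static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Wraps the statement in a method body and reports whether it compiles. */
    static boolean compiles(String statement) {
        String source = "final class Snippet { String m(String s) { "
                + statement + " return s; } }";
        // The system compiler is only available on a JDK, not on a bare JRE.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("a JDK is required to run this sketch");
        }
        try {
            // Generated class files go to a temporary directory.
            String outputDir = Files.createTempDirectory("snippet").toString();
            // Collect diagnostics instead of printing them to stderr.
            DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
            return compiler.getTask(null, null, diagnostics,
                    List.of("-d", outputDir), null,
                    List.of(new StringSource("Snippet", source))).call();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Both arguments are valid String literals as far as javac is concerned.
        System.out.println(compiles("if (s == null) { return null }"));  // prints false
        System.out.println(compiles("if (s == null) { return null; }")); // prints true
    }
}
```

The demo class itself compiles cleanly in both cases; only the call at runtime reveals which snippet is malformed, which mirrors the situation with the String argument of insertBefore.
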
To avoid these issues, extra care must be taken to verify that the Java statement supplied as the argument is correct and that there are no typos in the code.

Second, the type-safety of the code is also enforced only at runtime; consequently, method calls and field accesses on the class itself and its contributing classes can also result in a CannotCompileException being thrown. Calling stringVariable.indexof('c'); on a variable of type String is syntactically correct, but since String declares the method as indexOf(int ch), with a capital O, no method named indexof can be resolved and the code fails. Furthermore, care must be taken to either supply the fully-qualified class name of a contributing class used from the modified method body, or add an import statement by calling