Compressing Java Class Files Java Class Files

Compressing Java Class Files Java Class Files

Compressing Java Class files Compressing Java Class files Java Class files • Compiled Java source programs generates 1999 ACM SIGPLAN Conference on (lots of) class files Programming Language Design and • Architecture neutral Implementation • Standard distribution format – Java programs can be compiled to native William Pugh executables, but I won’t talk about that Dept. of Computer Science Univ. of Maryland Class file contents Java ARchive (jar) files • class files contain lots of symbolic • A collection of of class files and other information resources (e.g., images) – For javac, only 21% of uncompressed class file • same format as zip archives is bytecode – individual files can be compressed with zlib • Information for linking • includes manifest • Allows code compiled against an old library – Information such as code signatures – to be linked with a new library – so long as dependent functionality still there Executing Java programs Download individual class over the net files as needed • Download and install program • TCP set-up costs for each class file – compact archive format – unless you use persistent http connection • Execute as downloaded • No compression – class files as needed, or • Get just the class files you need – compact and progressive archive format – some class files are needed only for verification – entire class file is needed if you need only one method • This approach isn’t used in practice – for non-trivial applications William Pugh, Univ. of Maryland 1 Compressing Java Class files A Wire format for Download jar archive class files? • 1 TCP connection • Bandwidth most important • zlib compression of individual class files • Decompression time relatively important – about a factor of 2 savings – Compression time not very important • may download class files that are not used • Progressive • entire jar archive must be downloaded • Not random access before any class files can be accessed • Can translate into jar archive or class files – or load directly into JVM Debugging information Cleanup • Java class files often contain debugging • When comparing my format to jar files information – Clean up first – source file • Remove debugging information – line number • Garbage collect constant pool – local variables • Sort constant pool • Will not include debugging information in – improves compression wire format • Exclude non-class files from archive – could do so; would compress fairly well Effects of Clean up Easy Java class file wire j0r jar sjar no yes yes compressed? format: Collective Zip no yes no debugging info? yes no yes Cleaned up? • In standard Jar archive, Hanoi 86 57 46 – files are compressed individually icebrowserbean 226 125 116 • gzip/zlib finds repeating patterns javafig_dashO 269 136 131 javafig 357 198 170 • lots of patterns repeat between but not as jmark20 309 189 173 much within class files _213_javac 516 274 226 ImageEditor 454 359 257 • Generate jar file without individual tools 1,557 950 737 compression visaj 2,189 1,524 1,157 swingall 3,265 2,193 1,657 • Compress the entire resulting jar file rt 8,937 5,726 4,652 William Pugh, Univ. of Maryland 2 Compressing Java Class files Effectiveness of A closer look at collective zip class file contents Size of Collective Zip swingall javac Original Collective as % of Total size 3,265 516 Benchmark Size Zip Original excluding jar overhead 3,010 485 Hanoi 46 31 67% Field definitions 36 7 Method definitions 97 10 IBM Host on demand 98 85 87% Code 768 114 ICE Browser 105 88 84% Other 72 12 JavaFig 171 144 84% Constant pool 2,037 342 tools 737 513 70% Utf8 entries 1,704 295 visaj 1,157 703 61% if shared 372 56 swingall 1,657 998 60% if shared and factored 235 26 JDK 1.2 runtime 4,652 2,820 61% What did we just learn? Beating collective zip is hard • A substantial part of class files consists of • A lot of the things you could do constant pools – e.g., share constant pool entries – Bytecode is the only other substantial • are already done by a collective zip component • Most of the space in constant pools is taken • You can work very hard up by Utf8 entries – and find that you don’t beat collective zip by • Sharing Utf8 entries across class files is a much huge win Compressing uniform Drawbacks to sharing streams • Increases # of constant pool entries • class files are jumbles of different types – How do we encode a reference? – Utf8 encodings, bytecodes, constant pool • For most class files, less than 255 entries entries – can encode in a single byte • Most compression algorithms work better if given a more uniform stream – separate out class files into streams for each type of information – compress each stream individually William Pugh, Univ. of Maryland 3 Compressing Java Class files Compressing bytestreams Encoding references • Zlib (and most compression algorithms) are • How do you encode a reference to an object designed to work on bytestreams (e.g., a constant pool entry) you may have • How do you compress a stream of shorts? seen before? – Standard serialization mixes types – so that most references are encoded in 1 byte – Could use separate streams for high and low • Overload id’s based on type bytes – In almost all cases, know the type of the object – Use variable length encoding being referenced • Hope that most entries can be encoded in a single byte Encoding references Move to front queue (continued) • Tried several schemes • Maintain a list of all the objects seen • One that worked best was a move-to-front previously queue • To encode an object seen previously – Suggested by Ernst et al. – encode its position (1 for first entry) – Long history in compression literature – move it to the front of the list • To encode an object not seen previously – encode 0 – put it at front of list Implementation of Factoring? Move-to-front queues • Use a modified skip-list • The string “java.awt” occurs in the Utf8 – links record distance they travel encoding of many class names • In decoder, a move-to-front operation on • Method and field signatures contain element k requires O(log k) time separate Utf8 encoding of class names – regardless of total number of elements in list • String f(String s) is recorded as having type • In encoder, requires O(log n) time (Ljava/lang/String;)Ljava/lang/String; – L is to differentiate between references and primitive types William Pugh, Univ. of Maryland 4 Compressing Java Class files Reorganize class file Compressing bytecodes • Factor information to avoid as much • Separating out opcodes from operands helps redundancy as possible • Use separate streams for : – packageNames – opcodes – simpleClassNames – different register types – classNames – branch offsets – method type (array of classnames) – integer constants – ... – constant pool references (already separate) Compression results Compression ratios collective packed/ j0r.gz Jazz Packed Benchmark jar size zip packed Czip/ jar jar Hanoi 46 31 14 67% 30% 100% IBM Host on demand 98 85 44 87% 45% 80% ICE Browser 105 88 36 84% 35% JavaFig 171 144 64 84% 37% 60% Cinderella 625 - 171 27% tools 737 513 204 70% 28% 40% Lotus eSuite Sheet 1,101 - 549 50% 20% visaj 1,157 703 238 61% 21% Size as % of jar file Lotus eSuite Chart 1,387 - 633 46% 0% swingall 1,657 998 338 60% 20% Mockingbird 2,350 - 506 22% 1 10 100 1,000 10,000 Reservation System 3,067 - 736 24% Size of jar file (KBytes) JDK 1.2 runtime 4,652 2,820 1,069 61% 23% Download times Execution Times Jar Pjar Pack • For swingall.jar 1000.0 – Jar format: 1,657 KBytes – compressed size: 338 Kbytes 100.0 – decompression time (Ultra 5 333Mhz) • to memory: 3 secs 10.0 • to jar file: ~ 14 secs – time to load classes : 5.8 secs Download time (secs) • time to define, resolve and verify 1.0 1 10 100 1,000 Download speed (KBytes/sec) William Pugh, Univ. of Maryland 5 Compressing Java Class files Decoder size & security Providing Jar functionality • Decoder is about 35Kbytes • Jar archives contain more than class files – Could be downloaded – images, text files, resources • not useful for small archives – manifest (signatures, …) – Could be installed as extension • Add a stream of non-class files • Decoder either needs permission to write to – a zip archive, without individual compression a temporary file or permission to create a but with overall compression class loader – Can do this under 1.2 security model Complication for signatures Related work - lots! • Compression and decompression changes a • Used few ideas that hadn’t been considered class file previously – by renumbering the constant pool • Compression of executable code • Signatures from source class files won’t – Ernst, Evans, Fraser, Lucco and Proebsting, work on decoded class files PLDI97 • Decompress once, sign decompressed class • Compression of Java Classfiles files, use those signatures – Nigel Horspool et al. – decompression is deterministic Jax from IBM Combining Jax and Pack jar size Jax'd & • Java Application eXtraction Benchmark (Kbytes) Jax'd Packed Packed – available from www.alphaworks.ibm.com Hanoi 46 46% 30% 15% • Extracts just the classes and methods IBM Host on demand 98 84% 45% 37% ICE Browser 105 87% 35% 32% needed by application JavaFig 171 79% 37% 31% • Very useful if your application uses small Cinderella 625 65% 27% 17% part of a large library Lotus eSuite Sheet 1,101 35% 50% 11% Lotus eSuite Chart 1,387 43% 46% 14% – eliminates need to ship entire library Mockingbird 2,350 13% 22% 4% Reservation System 3,067 58% 24% 14% William Pugh, Univ. of Maryland 6 Compressing Java Class files Combining Jax with Packing Software release Jax'd

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    7 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us