Techniques for Real-System Characterization of Java Virtual Machine Energy and Power Behavior
Gilberto Contreras, Margaret Martonosi
Department of Electrical Engineering, Princeton University
1 Why Study Power in Java Systems?
• The Java platform has been adopted in a wide variety of devices
• Java servers demand performance; embedded devices require low power
• Performance is important, but power, energy, and thermal issues are equally important
• How do we study and characterize these requirements in a multi-layer platform?
2 Power/Performance Design Issues
Java Application
Java Virtual Machine
Operating System
Hardware
3 Power/Performance Design Issues
Java Application
Java Virtual Machine: Garbage Collection, Class Loader, Runtime Compiler, Execution Engine
Operating System
Hardware
• How do the various software layers affect the power/performance characteristics of the hardware?
• Where should time be invested when designing power- and/or thermally aware Java virtual machines?
4 Outline
• Approaches for Energy/Performance Characterization of Java virtual machines
• Methodology
    • Breaking the JVM into sub-components
    • Hardware-based power/performance characterization of JVM sub-components
• Results
    • Jikes & Kaffe on Pentium M
    • Kaffe on Intel XScale
• Conclusions
5 Power & Performance Analysis of Java
Simulation Approach
√ Flexible: easy to model non-existent hardware
x Simulators may lack comprehensiveness and accuracy
x Thermal studies require tens-of-seconds granularity; accurate simulators are too slow
Hardware Approach
√ Able to capture full-system characteristics and effects
√ Data gathering runs at speeds comparable to the hardware
x Only applicable to existing hardware
6 Hardware-based Characterization
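[Figure: measurement setup. The virtual machine tracks its current code region with a component ID: Class Loader (ID: 0001), Garbage Collector (ID: 0010), Compiler (ID: 0100), Execution Engine (ID: 1000); the scheduler is also shown. On the hardware side, a DAQ records the component ID on channels CH0 to CH3 and CPU and memory power measurements on CH- and CH+.]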
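The attribution step implied by this setup can be sketched as follows. This is a minimal illustration, not the authors' tooling: it assumes the DAQ trace has already been decoded into (component ID, power) pairs taken at a fixed sampling period, and the function name, component names, and sample values are hypothetical. Only the four ID values come from the slide.

```python
from collections import defaultdict

# One-hot component IDs as shown on the slide; the names are illustrative.
COMPONENT_IDS = {
    0b0001: "class_loader",
    0b0010: "garbage_collector",
    0b0100: "compiler",
    0b1000: "execution_engine",
}

def energy_per_component(samples, sample_period_s):
    """Sum energy in joules per component.

    samples: iterable of (component_id, power_watts), one pair per DAQ sample.
    sample_period_s: assumed fixed time between samples, in seconds.
    """
    energy = defaultdict(float)
    for component_id, power_watts in samples:
        name = COMPONENT_IDS.get(component_id, "unknown")
        # Energy of one sample = power held for one sampling period.
        energy[name] += power_watts * sample_period_s
    return dict(energy)

# Made-up samples, for illustration only.
samples = [(0b1000, 10.0), (0b1000, 12.0), (0b0010, 6.0), (0b0001, 8.0)]
energy = energy_per_component(samples, sample_period_s=0.5)
total = sum(energy.values())
shares = {name: joules / total for name, joules in energy.items()}
```

Dividing each component's energy by the total gives the kind of percentage breakdown shown in the energy distribution charts.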
7 Two Virtual Machines
                        Jikes RVM                          Kaffe JVM
Design goal             High performance                   Flexibility and portability
Architecture support    High-end processors                High-end to embedded
Garbage collection      Multiple collectors                Mark-and-sweep
Runtime optimizations   Runtime compiler with different    Just-in-time compiler
                        optimization levels
8 Two Platforms
                        Pentium M (P6)                       Intel XScale
Platform type           High-performance mobile computers    High-end handheld devices
Configuration           1.6 GHz, 512 MB RAM                  400 MHz, 32 MB RAM
Theoretical max power   31 W                                 1.4 W
JVM used                Jikes RVM, Kaffe                     Kaffe
9 Outline
• Approaches for Energy/Performance Characterization of Java virtual machines
• Methodology
    • Breaking the JVM into sub-components
    • Power/performance hardware-based characterization of JVM sub-components
• Results
    • Jikes & Kaffe on Pentium M
    • Kaffe on Intel XScale
• Conclusions
11 Jikes Energy Distribution on P6
[Figure: stacked bars of energy usage (0% to 100%) versus heap size (MB) for Jikes with the SemiSpace garbage collector. Components: app, gc, cl, base, opt_comp. Benchmarks: db, fop, jess, jack, javac, compress.]
• JVM: up to 60% of the total energy
• GC: on average 37% of the total energy for SPECjvm98
12 Jikes Energy-Delay Product on P6
[Figure: energy-delay product, EDP (J*sec, 0 to 3500), versus heap size (MB) for the SemiSpace, MarkSweep, GenMS, and GenCopy collectors. Benchmarks: db, javac, jack, jess, compress.]
• Jikes: heap size has a significant impact on energy efficiency
• EDP decreases across heap sizes because application execution time decreases
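The energy-delay product on this slide is energy multiplied by execution time, which is why a shorter run improves EDP so strongly. The sketch below is a worked example with made-up numbers, not measured data; the function and variable names are hypothetical, and it assumes energy can be approximated as average power times execution time.

```python
def energy_delay_product(avg_power_watts, exec_time_s):
    """Return EDP in joule-seconds: (power * time) * time."""
    energy_joules = avg_power_watts * exec_time_s
    return energy_joules * exec_time_s

# Illustrative values only: same average power, half the execution time.
slow_run = energy_delay_product(avg_power_watts=10.0, exec_time_s=20.0)
fast_run = energy_delay_product(avg_power_watts=10.0, exec_time_s=10.0)
ratio = slow_run / fast_run
```

At constant average power, halving execution time cuts EDP by a factor of four, because time enters the product twice.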
13 Jikes Power Consumption on P6
[Figure: power (Watts, 0 to 18) versus heap size (MB) for the app, gc, and cl components of Jikes with the GenCopy garbage collector. Benchmarks: db, jess, jack, javac, fop, compress.]
• Average power for the JVM varies little across heap sizes
• The garbage collector is a high energy consumer but draws low power
14 Jikes Peak Power on P6
[Figure: peak power (Watts, 0 to 20) versus heap size (MB) for the app, gc, and cl components of Jikes. Benchmarks: db, jess, jack, javac, fop, compress.]
• The execution engine has the highest peak power
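The distinction between average and peak power per component can be sketched as below. The slides do not state how peak power was defined, so this example assumes it is the largest single power sample attributed to a component. The function name and the sample values are hypothetical.

```python
from collections import defaultdict

def power_stats(samples):
    """Return {component: (average_watts, peak_watts)}.

    samples: iterable of (component_name, power_watts) pairs.
    Peak is taken as the maximum single sample (an assumption).
    """
    by_component = defaultdict(list)
    for component, power_watts in samples:
        by_component[component].append(power_watts)
    return {
        component: (sum(values) / len(values), max(values))
        for component, values in by_component.items()
    }

# Made-up samples, for illustration only.
samples = [("app", 12.0), ("app", 18.0), ("gc", 8.0), ("gc", 10.0), ("gc", 9.0)]
stats = power_stats(samples)
```

A component can therefore have low average and peak power yet still consume a large share of energy if it runs for a long time.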
15 Jikes versus Kaffe: Energy Distribution on P6
[Figure: side-by-side stacked bars of energy distribution (0% to 100%) per benchmark on the Pentium M. Jikes components: app, gc, cl, base_comp, opt_comp. Kaffe components: app, gc, cl, jit.]
• Kaffe: high application energy caused by long execution times
• Kaffe: on average, 8% of total energy goes to the virtual machine
16 Kaffe Across Platforms
[Figure: side-by-side stacked bars of Kaffe energy distribution (0% to 100%) per benchmark on the Pentium M and the Intel XScale. Components: app, gc, cl, jit.]
• XScale: no classes are included in the binary
• XScale: GC represents only 6% of the total energy consumed
17 Conclusions
Methodology
• The complexity of the Java virtual machine calls for a more in-depth power/energy analysis
• Hardware-based characterization of the virtual machine’s sub-components allows long execution times and trustworthy measurements
Lessons learned
• On both platforms, JVM energy overhead is considerable
• Jikes: the GC is a low-power but high-energy consumer (37% of total energy on average)
• For Kaffe on XScale, the class loader becomes a high-energy consumer (18% for measured benchmarks)
18 Thank you!