The Jrpm System for Dynamically Parallelizing Java Programs


Michael K. Chen, Stanford University ([email protected])
Kunle Olukotun, Stanford University ([email protected])

© 2003 IEEE. Published in the Proceedings of ISCA-30, 9-11 June 2003, San Diego, CA, USA.

Abstract

We describe the Java runtime parallelizing machine (Jrpm), a complete system for parallelizing sequential programs automatically. Jrpm is based on a chip multiprocessor (CMP) with thread-level speculation (TLS) support. CMPs have low sharing and communication costs relative to traditional multiprocessors, and TLS simplifies program parallelization by allowing us to parallelize optimistically without violating correct sequential program behavior. Using a Java virtual machine with dynamic compilation support coupled with a hardware profiler, speculative buffer requirements and inter-thread dependencies of prospective speculative thread loops (STLs) are analyzed in real time to identify the best loops to parallelize. Once sufficient data has been collected to make a reasonable decision, selected loops are dynamically recompiled to run in parallel.

Experimental results demonstrate that Jrpm can exploit thread-level parallelism with minimal effort from the programmer. On four processors, we achieved speedups of 3 to 4 for floating point applications, 2 to 3 on multimedia applications, and between 1.5 and 2.5 on integer applications. Performance was achieved by automatic selection of thread decompositions by the hardware profiler, intra-procedural optimizations on code compiled dynamically into speculative threads, and some minor programmer transformations for exposing parallelism that cannot be performed automatically.

1. Introduction and Overview

As the limits of instruction-level parallelism (1 – 10s of instructions) with a single thread of control are approached [44], we must look elsewhere for architectural improvements that can speed up sequential program execution. Coarser grained parallelism, like fine-grained thread-level parallelism (10 – 1,000s of instructions), is a potential area for exploration. This type of parallelism is not exploited today due to limitations of current multiprocessor architectures and parallelizing compilers.

Traditional symmetric multiprocessors (SMPs) [13][22] have been most effective at exploiting coarse-grained parallelism (>1,000s of instructions). The high cost of inter-processor communication in these systems, generally >100s of cycles, eliminates any potential speedups for fine-grained parallel tasks that either have true dependencies or closely shared cache lines.

Modern compilers that perform array dependence analysis can parallelize Fortran-like numerical applications on traditional multiprocessors [2][5][20][38]. Unfortunately, numerous challenges have made automatic compiler parallelization of general integer programs difficult. Analyzing pointer aliasing, control flow, irregular array accesses, and dynamic loop limits, as well as handling inter-procedural analysis, complicates static dependence analysis [3]. These difficulties introduce imprecision into dependence relations, limit the accuracy of parallelism estimates, and force conservative synchronization to safely handle potential dependencies.

Our paper describes the Java runtime parallelizing machine (Jrpm), a complete system that automatically parallelizes sequential programs using thread-level parallelism. This system takes advantage of recent developments that now enable a different approach to automatic parallelization for small multiprocessors (2 – 8 CPUs). The key components of this system are: a chip multiprocessor that provides low-latency inter-processor communication, thread speculation support that allows us to parallelize optimistically, a hardware profiler for identifying parallel loops from program execution, and a virtual machine environment where dynamic optimizations can be performed without modifying source binaries.

• Chip multiprocessor – Jrpm is based on the Hydra chip multiprocessor (CMP) [32]. Decreasing feature size and increasing transistor counts now allow chip multiprocessors to be a reality [6][14][24][42]. Chip multiprocessors combine several CPUs onto one die with a tightly coupled memory interface. In this configuration, inter-processor sharing and communication costs are significantly less than in traditional multiprocessors. The reduced communication costs make it possible to take advantage of fine-grained thread-level parallelism.

• Thread-level speculation – Hydra includes support for thread-level speculation (TLS) [10][21][29][40]. TLS allows a sequential program to be divided into arbitrary chunks of code, or threads, to be executed in parallel. TLS hardware ensures memory accesses between threads maintain the original sequential program order.

Traditional multiprocessors must synchronize conservatively to preserve correctness when static analysis cannot determine with complete certainty that a dependency does not exist. For non-floating-point applications, this makes it difficult to find regions that can be parallelized safely and still achieve good speedups. With TLS, it is possible to parallelize aggressively rather than conservatively. Since sequential ordering is guaranteed for parallel execution of any program decomposition under TLS, the problem of program compilation is reduced to finding regions of program execution with significant parallelism. The sketch below shows the kind of loop this enables.
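To make the TLS bullet concrete, here is a hypothetical Java loop of our own (not code from the paper) with data-dependent, pointer-based updates that static dependence analysis cannot disambiguate. Under TLS, each iteration could be issued speculatively on its own CPU; the hardware buffers speculative writes and squashes and re-executes an iteration only when a logically earlier iteration actually writes a location it has read, so the result always matches sequential execution.

```java
import java.util.HashMap;
import java.util.Map;

// A loop with only *potential* loop-carried dependencies: whether two
// iterations touch the same histogram bucket depends on the input data,
// so a static compiler must assume a conflict and keep the loop
// sequential. A TLS machine can run the iterations optimistically.
public class HistogramLoop {
    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3};
        Map<Integer, Integer> counts = new HashMap<>();
        // A prospective speculative thread loop (STL): one iteration
        // per speculative thread.
        for (int value : data) {
            // Read-modify-write through a pointer-based structure --
            // exactly the kind of access (aliasing, irregular indexing)
            // the introduction lists as defeating static analysis.
            counts.merge(value, 1, Integer::sum);
        }
        System.out.println(counts);
    }
}
```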
• Hardware profiler – Static parallelizing compilers often have insufficient information to analyze dynamic dependencies effectively. Dynamic analysis to find parallelism complements a TLS processor's ability to parallelize optimistically and use hardware to guarantee correctness. Tracer for Extracting Speculative Threads (TEST) [9] is hardware support in Jrpm that analyzes sequential program execution in real time to find the best regions to parallelize. This system provides accurate estimates of dynamic dependency behavior, thread size, and buffering requirements that are needed for selecting good decompositions and that would be difficult to derive statically. Estimates show the tracer requires minimal hardware additions to our CMP and causes only minor slowdowns to programs during analysis. (A software analogue of these measurements is sketched below.)
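TEST itself is hardware, and its mechanisms are detailed in [9]; the following is only a software model, under our own simplifying assumptions, of two of the statistics the text names: inter-thread dependency behavior (here, the shortest distance in iterations between a store and a later load of the same address) and speculative buffering requirements (here, the peak number of stores in one iteration).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical software analogue of a speculative-thread tracer: while a
// candidate loop runs sequentially, record which iteration last stored
// each address, derive RAW dependence distances from later loads, and
// track per-iteration store volume as a proxy for buffer pressure.
public class DependenceTracer {
    private final Map<Long, Integer> lastStoreIteration = new HashMap<>();
    private int minRawDistance = Integer.MAX_VALUE;
    private int storesThisIteration = 0;
    private int maxStoresPerIteration = 0;

    void onLoad(long address, int iteration) {
        Integer producer = lastStoreIteration.get(address);
        if (producer != null && producer < iteration) {
            // A load in iteration i of a value stored in iteration j < i
            // observes a dependence distance of i - j.
            minRawDistance = Math.min(minRawDistance, iteration - producer);
        }
    }

    void onStore(long address, int iteration) {
        lastStoreIteration.put(address, iteration);
        storesThisIteration++;
    }

    void onIterationEnd() {
        maxStoresPerIteration = Math.max(maxStoresPerIteration, storesThisIteration);
        storesThisIteration = 0;
    }

    public static void main(String[] args) {
        DependenceTracer t = new DependenceTracer();
        for (int i = 0; i < 4; i++) {        // toy trace of a[i] = a[i-2] + 1
            t.onLoad(100 + 8L * (i - 2), i); // load a[i-2]
            t.onStore(100 + 8L * i, i);      // store a[i]
            t.onIterationEnd();
        }
        System.out.println("min RAW distance = " + t.minRawDistance
                + ", max stores/iteration = " + t.maxStoresPerIteration);
    }
}
```

A loop whose shortest observed distance exceeds the number of speculative CPUs would rarely trigger a violation, while a large per-iteration store count warns that speculative buffers may overflow; both are quantities that, as the text notes, would be difficult to derive statically.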
• Virtual machine – The Java virtual machine (JVM) [28] acts as an abstraction layer to hide dynamic analysis and thread-level speculation from the program. Virtual machines like the JVM and Microsoft's .NET VM have become commercially popular recently for supporting platform-independent applications. Binaries are distributed as portable, platform-independent bytecodes [28] that are dynamically compiled into native instructions of the underlying architecture and run within the protected environment of the VM. This virtualization allows us to seamlessly support a new execution model without modifying the source binaries.

A program compiled with instructions annotating local variables and thread decompositions is first executed as a sequential program on a single Hydra processor. The trace hardware collects statistics in real time for the prospective decompositions. Once sufficient data has been collected by the trace hardware, regions predicted to have the largest speedup and most coverage are dynamically recompiled into speculative threads (a toy version of this selection criterion is sketched at the end of this section).

This dynamic parallelization system has other potential benefits:

• Reduced programmer effort – Explicit coding of parallelism is better suited for coarser grained and distinct tasks. Manually identifying fine-grained parallel decompositions can be a time-consuming effort for programs without obvious critical sections.

• Portability – The code maintains platform independence because the binaries are not modified explicitly for thread speculation.

• Retargetability – Decompositions can be tailored dynamically for specific hardware or data. Different decompositions may be chosen for future CMPs with more CPUs, larger speculative buffers, and different cache configurations. In some applications, decompositions can also be optimized for specific data set sizes.

• Simplified analysis – Compared to traditional parallelizing compilers, the Jrpm system relies on more hardware for TLS and profiling support, but reduces the complexity of the analysis required to extract exposed thread-level parallelism.

Simulations demonstrate Jrpm's ability to exploit thread-level parallelism automatically with minimal effort from the programmer. On our CMP with four CPUs, we achieve speedups of 3 to 4 on floating point applications, 2 to 3 on multimedia applications, and between 1.5 and 2.5 on integer applications. Selection of appropriate thread decompositions by the hardware profiler, low-level optimizations on code dynamically compiled into speculative threads, and some manual programmer transforms for exposing parallelism when it cannot be found automatically are all key to getting good performance.

[Figure 1. Overview of the Jrpm system. Components shown: a Java bytecode application, the JIT compiler with CFG/DFG and annotation instructions, the profile analyzer, the Java VM, native code and native TLS code, and the TEST profiler and TLS support on the Hydra CMP.]
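The phrase "largest speedup and most coverage" suggests an Amdahl-style score. The sketch below is our illustration only; the profile numbers are made up, and the actual Jrpm cost model is not reproduced here. A candidate's benefit is the fraction of total runtime it would eliminate: coverage * (1 - 1/speedup).

```java
import java.util.Comparator;
import java.util.List;

// Illustrative ranking of candidate speculative thread loops (STLs):
// prefer decompositions whose predicted speedup, weighted by how much
// of total execution time they cover, saves the most time overall.
public class DecompositionSelector {
    record Candidate(String loop, double predictedSpeedup, double coverage) {
        // Fraction of total runtime eliminated if this loop is
        // parallelized: coverage * (1 - 1/speedup), per Amdahl's law.
        double benefit() {
            return coverage * (1.0 - 1.0 / predictedSpeedup);
        }
    }

    public static void main(String[] args) {
        // Hypothetical profile data for three candidate loops.
        List<Candidate> candidates = List.of(
                new Candidate("matmul.inner", 3.8, 0.60),
                new Candidate("parse.tokens", 1.6, 0.25),
                new Candidate("init.buffers", 2.9, 0.02));
        candidates.stream()
                .sorted(Comparator.comparingDouble(Candidate::benefit).reversed())
                .forEach(c -> System.out.printf("%-14s benefit = %.3f%n",
                        c.loop(), c.benefit()));
    }
}
```

Under this toy score, a modest speedup with high coverage (parse.tokens) outranks a large speedup on a loop that barely runs (init.buffers), which matches the selection behavior the text describes.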