Motivation Systematic Mapping Threats to Validity Concluding Remarks

A Systematic Mapping Study on High-level Language Virtual Machines

Vinicius H. S. Durelli, Katia R. Felizardo, and Marcio E. Delamaro

Computer Systems Department, University of São Paulo (ICMC-USP), 13560-970 – São Carlos – SP – Brazil {durelli,katiarf,delamaro}@icmc.usp.br

October 17, 2010

Agenda

1 Motivation
    Research on HLL VMs
    First Step Towards Filling in Such a Gap

2 Systematic Mapping
    Overview
    Steps
    Data Extraction and Mapping

3 Threats to Validity

4 Concluding Remarks

Research on High-level Language Virtual Machines

Many contemporary high-level languages have execution environments based on high-level language virtual machines (HLL VMs).

There is a large body of literature on virtual machines for high-level languages.

As a research area matures, the number of published results increases sharply, so it becomes essential to summarize the area and provide an overview of it.

To the best of our knowledge there are no comprehensive studies focusing on an overview of this research area and its most investigated subjects.

Motivation: First Step Towards Filling in Such a Gap

To fill this gap, we need to ascertain the nature, extent, and quantity of published research papers.

Contributions:

1 Areas that have been most subjected to investigation.
    Side effect: Areas that require further research.

2 The relevant publication forums.

3 HLL VM implementations that are the most widely used within the academic community.

Evidence-based Paradigm

Definition → Systematic Mapping: a methodology that involves searching the literature in order to aggregate and categorize primary studies, thereby yielding a synthesized view of the research area under consideration [Petersen et al., 2008].

Advantages:

The search approach and the inclusion and exclusion criteria are defined in a research protocol and reported as an outcome.

Side effects: Transparent; Replicable; Updatable.

Systematic Mapping Process: Overview

Figure: The systematic mapping process [Petersen et al., 2008].

Research Questions

Research questions must embody the mapping study purpose.

RQ1: Which functionalities, features, or characteristics of HLL VMs have been most investigated?

RQ2: Which are the mainstream HLL VM implementations within the academic community?

Search for Primary Studies

Search String → a combination of the following keywords and acronyms: virtual machine, VM, high-level language virtual machine, and HLL VM.

We used the search string on the following electronic databases:

ACM Digital Library, EngineeringVillage, IEEE Xplore, Springer Lecture Notes in Computer Science (LNCS), and ScienceDirect.

No limits were placed on date of publication.
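The slide lists the keywords but not the operators used to combine them. As a minimal sketch, assuming the terms were joined with a boolean OR and multi-word terms were quoted as phrases (both are assumptions, as is the helper name `build_search_string`), the query could be assembled like this:

```python
# Keywords and acronyms taken from the slide. The OR combination and the
# phrase quoting are assumptions, not the authors' documented query syntax.
KEYWORDS = [
    "virtual machine",
    "VM",
    "high-level language virtual machine",
    "HLL VM",
]


def build_search_string(keywords):
    """Join keywords with OR, quoting multi-word terms as exact phrases."""
    terms = [f'"{k}"' if " " in k else k for k in keywords]
    return " OR ".join(terms)


search_string = build_search_string(KEYWORDS)
print(search_string)
```

Each electronic database has its own query syntax, so a string like this would likely need adapting per database.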

Screening: Inclusion Criteria

The inclusion criteria devised and applied are:

if several papers reported similar studies, only the most recent was selected;

papers describing more than one study had each study individually evaluated;

each paper has to describe at least a prototypical implementation of the proposed improvement and mention the HLL VM implementation that was modified.

Screening: Exclusion Criteria (i)

The exclusion criteria devised and applied are:

papers that do not present studies pertaining to HLL VMs, e.g., papers describing research on system VMs;

studies describing improvements that consist solely of modifying the intermediate language of the HLL VM under consideration;

studies whose proposed enhancements do not require changes to the underlying HLL VM, e.g., papers describing features implemented atop HLL VMs;

Screening: Exclusion Criteria (ii)

studies whose target HLL VM is either co-designed (i.e., composed of both software and hardware portions) or implemented entirely in hardware;

technical reports, documents that are available in the form of either abstracts or presentations (i.e., elements of “grey” literature), and secondary literature reviews (i.e., mapping studies).
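The screening criteria above can be read as a predicate over candidate studies. The following is only a sketch under stated assumptions: the `Study` record, its field names, and the two example entries are hypothetical, and the criteria about similar studies and multi-study papers are left out because they need judgment that a boolean flag cannot capture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Study:
    """Hypothetical record for one candidate study (all fields illustrative)."""
    title: str
    about_hll_vm: bool            # False for, e.g., research on system VMs
    has_prototype: bool           # at least a prototypical implementation
    modified_vm: Optional[str]    # name of the HLL VM that was modified
    only_changes_il: bool         # solely modifies the intermediate language
    changes_underlying_vm: bool   # False for features implemented atop the VM
    hardware_or_codesigned: bool  # co-designed or hardware-only HLL VM
    grey_or_secondary: bool       # report, abstract, slides, or secondary review


def passes_screening(s: Study) -> bool:
    """Apply the inclusion criterion, then every exclusion criterion."""
    included = s.has_prototype and s.modified_vm is not None
    excluded = (
        not s.about_hll_vm
        or s.only_changes_il
        or not s.changes_underlying_vm
        or s.hardware_or_codesigned
        or s.grey_or_secondary
    )
    return included and not excluded


# Two made-up candidates, used only to exercise the predicate.
kept = Study("GC prototype", True, True, "Jikes RVM", False, True, False, False)
dropped = Study("System VM paper", False, True, None, False, False, False, False)
print(passes_screening(kept), passes_screening(dropped))
```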

Final Set of Selected Primary Studies

Electronic Database     Number
ACM Digital Library     1554
EngineeringVillage      1395
IEEE Xplore             309
Springer LNCS           640
ScienceDirect           1123
Total                   5021
Candidates              142
Final set               128

Table: Papers retrieved from each electronic database, total of candidate studies, and the final set.
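As a quick consistency check on the table, the sketch below recomputes the total from the per-database counts and derives the share of retrieved papers that survived screening. The variable names are illustrative; the numbers are the ones reported in the table.

```python
# Counts copied from the table above.
retrieved = {
    "ACM Digital Library": 1554,
    "EngineeringVillage": 1395,
    "IEEE Xplore": 309,
    "Springer LNCS": 640,
    "ScienceDirect": 1123,
}
candidates = 142
final_set = 128

total = sum(retrieved.values())

# Fractions of the retrieved papers kept at each stage.
candidate_rate = candidates / total
final_rate = final_set / total
kept_from_candidates = final_set / candidates

print(f"total retrieved: {total}")
print(f"candidates / total: {candidate_rate:.1%}")
print(f"final set / total: {final_rate:.1%}")
print(f"final set / candidates: {kept_from_candidates:.1%}")
```

The per-database counts sum to the reported total of 5021, and roughly 2.5% of the retrieved papers end up in the final set.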

Keywording

The aim of this step is to devise our own classification scheme and categories for the selected primary studies.

Certain sections of each study are read to find keywords and concepts that reflect its contribution.

Resulting Categories

Categories

Optimization
Garbage Collection (GC)
Debugging
Memory Leak Tolerance (MLT)
New Language Construct (NLC)
Profiling
Aspect-Oriented Programming (AOP)
Embedded Systems (ES)
Security
Real-Time
Distributed Computing (DC)
Fault Tolerance (FT)
Resource Sharing among HLL VMs (RSVM)
Testing

Resulting Category Frequencies

Category        Frequency
Optimization    34
GC              33
ES              24
Real-Time       13
Profiling       9
Debugging       8
AOP             8
DC              7
FT              4
NLC             4
Security        3
RSVM            2
MLT             2
Testing         2

Figure: Frequency of studies in each category*.

* Certain studies were grouped in more than one category.

Most Researched Subjects Evolution

According to our results, these are the “trendy” subjects:

[Line chart: yearly frequency (0 to 7) of publications in the Optimization, GC, and ES categories, 1998 to 2009.]

Figure: Year-wise distribution of publications on the most investigated categories.

Distribution of Primary Studies by Electronic Database

Electronic Database     Number   Share
ACM Digital Library     62       48.0%
EngineeringVillage      38       30.0%
Springer LNCS           16       13.0%
IEEE Xplore             12       9.0%
ScienceDirect           0        0.0%

Distribution of Primary Studies by Publication Type

Publication Type    Number   Share
Conference          46       36.0%
Journal             31       24.0%
Symposium           25       20.0%
Book Chapter        18       14.0%
Workshop            8        6.0%
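The percentages on this slide follow from the counts by simple division. As an illustrative sketch (the variable names are assumptions; the counts are the ones reported for each publication type), the shares can be recomputed as follows:

```python
# Number of primary studies per publication type, as reported on the slide.
counts = {
    "Conference": 46,
    "Journal": 31,
    "Symposium": 25,
    "Book Chapter": 18,
    "Workshop": 8,
}

total = sum(counts.values())

# Share of each type, rounded to whole percentage points.
shares = {kind: round(100 * n / total) for kind, n in counts.items()}

# The most common type and whether it exceeds half of all studies.
top_kind = max(counts, key=counts.get)
has_majority = counts[top_kind] > total / 2

for kind, pct in shares.items():
    print(f"{kind}: {counts[kind]} studies, {pct}%")
print(f"largest share: {top_kind}; above 50%: {has_majority}")
```

Under these counts the five types add up to the 128 selected studies. Conferences hold the largest share, and no single publication type exceeds half of the studies.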

Map: Year-Wise Distribution (detailed)

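Map of the number of studies per category (rows) and publication year (columns). The alignment of counts to years was lost; each row lists its non-empty cell counts.

RSVM: 1 1
MLT: 1 1
GC: 1 1 4 1 2 4 3 5 6 3 3
Real-Time: 1 1 1 3 1 4 1 1
DC: 2 3 1 1
Optimization: 1 3 2 2 4 4 5 3 6 4
Security: 1 1 1
Profiling: 1 3 2 3
Testing: 1 1
FT: 1 2 1
ES: 2 1 5 7 2 2 2 3
Debugging: 1 1 1 2 1 1 1
NLC: 2 1 1
AOP: 1 1 2 1 2 1

Years (columns): 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010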
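Summing each row of the year-wise map gives the number of studies per category, which can be compared with the bar values of the category-frequency figure (34, 33, 24, 13, 9, 8, 8, 7, 4, 4, 3, 2, 2, 2). The sketch below does only that; it makes no assumption about which year each count belongs to, and the variable names are illustrative.

```python
# Non-empty cell counts per category, copied from the year-wise map.
rows = {
    "RSVM": [1, 1],
    "MLT": [1, 1],
    "GC": [1, 1, 4, 1, 2, 4, 3, 5, 6, 3, 3],
    "Real-Time": [1, 1, 1, 3, 1, 4, 1, 1],
    "DC": [2, 3, 1, 1],
    "Optimization": [1, 3, 2, 2, 4, 4, 5, 3, 6, 4],
    "Security": [1, 1, 1],
    "Profiling": [1, 3, 2, 3],
    "Testing": [1, 1],
    "FT": [1, 2, 1],
    "ES": [2, 1, 5, 7, 2, 2, 2, 3],
    "Debugging": [1, 1, 1, 2, 1, 1, 1],
    "NLC": [2, 1, 1],
    "AOP": [1, 1, 2, 1, 2, 1],
}

# Total number of studies per category.
totals = {category: sum(cells) for category, cells in rows.items()}

# Categories ranked from most to least investigated.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

for category, n in ranked:
    print(f"{category}: {n}")
print(f"sum over categories: {sum(totals.values())}")
```

The row sums reproduce the bar values, and their total exceeds the 128 selected studies, which is consistent with some studies being grouped in more than one category.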

Map: The Most-Widely Used HLL VM Implementations

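Map of the number of studies per category (rows) and HLL VM implementation (columns). The alignment of counts to implementations was lost; each row lists its non-empty cell counts.

RSVM: 2
MLT: 2
GC: 1 3 1 1 3 4 1 2 1 1 1 12 1
Real-Time: 3 2 1 2 2 1
DC: 2 3 1 1
Optimization: 1 1 2 1 3 3 6 1 3 12 1
Security: 1 1 1
Profiling: 1 1 2 1 5
Testing: 1 1
FT: 1 1 1 1
ES: 2 3 5 1 1 2 5 1 1 1
Debugging: 1 1 1 2 3
NLC: 1 2 1
AOP: 1 2 4 1

HLL VM implementations (columns, original order not recoverable): J9, CVM, CLR, KVM, IVM, ORP, VM*, JITS, OVM, OCVM, TclVM, MONO, VMKit, CEJVM, JeRTy, GForth, Maxine, CACAO, JamVM, Jamaica, HotSpot, SICStus, SableVM, JESSICA2, Harmony, Exact VM, Jikes RVM, SimpleRTJ, Steamloom, IBM's J2ME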

Threats to Validity

We cannot rule out threats from a quality assessment perspective.

Because we wanted to be as inclusive as possible, we selected studies without assigning any quality scores.

Another threat is whether we have properly identified and selected all relevant publications.

Whether our resulting classification scheme and categories are coherent also represents a threat to validity.

Concluding Remarks

The mapping study results, although not entirely surprising (some may argue), can be used to support several claims that are frequently made but not scientifically backed up.

Our mapping study reveals that the majority of research into HLL VMs focuses on optimizing these execution environments, improving their memory management capabilities, and tailoring them to resource-constrained settings.

As for the publication types, conference publications account for the largest share of the studies (36%).

Another contribution of this paper is the map we have created.

References

K. Petersen, R. Feldt, S. Mujtaba, and M. Mattsson. Systematic Mapping Studies in Software Engineering. 12th International Conference on Evaluation and Assessment in Software Engineering (EASE), pages 71–80, 2008.

J. E. Smith and R. Nair. The Architecture of Virtual Machines. Computer, 38(5):32–38, 2005.

J. E. Smith and R. Nair. Virtual Machines: Versatile Platforms for Systems and Processes. Morgan Kaufmann, 656 pages, 2005.
