Java-To-Javascript Translation Via Structured Control Flow Reconstruction of Compiler IR
Total Page:16
File Type:pdf, Size:1020Kb

Load more
Recommended publications
-
Differential Fuzzing the Webassembly
Master’s Programme in Security and Cloud Computing Differential Fuzzing the WebAssembly Master’s Thesis Gilang Mentari Hamidy MASTER’S THESIS Aalto University - EURECOM MASTER’STHESIS 2020 Differential Fuzzing the WebAssembly Fuzzing Différentiel le WebAssembly Gilang Mentari Hamidy This thesis is a public document and does not contain any confidential information. Cette thèse est un document public et ne contient aucun information confidentielle. Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Technology. Antibes, 27 July 2020 Supervisor: Prof. Davide Balzarotti, EURECOM Co-Supervisor: Prof. Jan-Erik Ekberg, Aalto University Copyright © 2020 Gilang Mentari Hamidy Aalto University - School of Science EURECOM Master’s Programme in Security and Cloud Computing Abstract Author Gilang Mentari Hamidy Title Differential Fuzzing the WebAssembly School School of Science Degree programme Master of Science Major Security and Cloud Computing (SECCLO) Code SCI3084 Supervisor Prof. Davide Balzarotti, EURECOM Prof. Jan-Erik Ekberg, Aalto University Level Master’s thesis Date 27 July 2020 Pages 133 Language English Abstract WebAssembly, colloquially known as Wasm, is a specification for an intermediate representation that is suitable for the web environment, particularly in the client-side. It provides a machine abstraction and hardware-agnostic instruction sets, where a high-level programming language can target the compilation to the Wasm instead of specific hardware architecture. The JavaScript engine implements the Wasm specification and recompiles the Wasm instruction to the target machine instruction where the program is executed. Technically, Wasm is similar to a popular virtual machine bytecode, such as Java Virtual Machine (JVM) or Microsoft Intermediate Language (MSIL). -
The Architecture of Decentvm
The Architecture of the DecentVM Towards a Decentralized Virtual Machine for Many-Core Computing Annette Bieniusa Johannes Eickhold Thomas Fuhrmann University of Freiburg, Germany Technische Universitat¨ Munchen,¨ Germany [email protected] fjeick,[email protected] Abstract We have accomplished these goals by combining an off-line Fully decentralized systems avoid bottlenecks and single points transcoder with a virtual machine that executes modified bytecode. of failure. Thus, they can provide excellent scalability and very The transcoder performs the verification and statically links the robust operation. The DecentVM is a fully decentralized, distributed required classes of an application or library. Thereby, the resulting virtual machine. Its simplified instruction set allows for a small binary – a so-called blob – is significantly smaller than the original VM code footprint. Its partitioned global address space (PGAS) class files, and its execution requires a significantly less complex memory model helps to easily create a single system image (SSI) virtual machine. Both these effects result in the low memory across many processors and processor cores. Originally, the VM was footprint of DecentVM. designed for networks of embedded 8-bit processors. Meanwhile, All libraries that are required by an application are identified via it also aims at clusters of many core processors. This paper gives a cryptographic hashes. Thus, an application’s code can be deployed brief overview of the DecentVM architecture. as if it was a statically linked executable. The software vendor can thus test the executable as it runs on the customer site without Categories and Subject Descriptors C.1.4 [Processor Architec- interference from different library versions. -
Interaction Between Web Browsers and Script Engines
IT 12 058 Examensarbete 45 hp November 2012 Interaction between web browsers and script engines Xiaoyu Zhuang Institutionen för informationsteknologi Department of Information Technology Abstract Interaction between web browser and the script engine Xiaoyu Zhuang Teknisk- naturvetenskaplig fakultet UTH-enheten Web browser plays an important part of internet experience and JavaScript is the most popular programming language as a client side script to build an active and Besöksadress: advance end user experience. The script engine which executes JavaScript needs to Ångströmlaboratoriet Lägerhyddsvägen 1 interact with web browser to get access to its DOM elements and other host objects. Hus 4, Plan 0 Browser from host side needs to initialize the script engine and dispatch script source code to the engine side. Postadress: This thesis studies the interaction between the script engine and its host browser. Box 536 751 21 Uppsala The shell where the engine address to make calls towards outside is called hosting layer. This report mainly discussed what operations could appear in this layer and Telefon: designed testing cases to validate if the browser is robust and reliable regarding 018 – 471 30 03 hosting operations. Telefax: 018 – 471 30 00 Hemsida: http://www.teknat.uu.se/student Handledare: Elena Boris Ämnesgranskare: Justin Pearson Examinator: Lisa Kaati IT 12 058 Tryckt av: Reprocentralen ITC Contents 1. Introduction................................................................................................................................ -
Extending Basic Block Versioning with Typed Object Shapes
Extending Basic Block Versioning with Typed Object Shapes Maxime Chevalier-Boisvert Marc Feeley DIRO, Universite´ de Montreal,´ Quebec, Canada DIRO, Universite´ de Montreal,´ Quebec, Canada [email protected] [email protected] Categories and Subject Descriptors D.3.4 [Programming Lan- Basic Block Versioning (BBV) [7] is a Just-In-Time (JIT) com- guages]: Processors—compilers, optimization, code generation, pilation strategy which allows rapid and effective generation of run-time environments type-specialized machine code without a separate type analy- sis pass or complex speculative optimization and deoptimization Keywords Just-In-Time Compilation, Dynamic Language, Opti- strategies (Section 2.4). However, BBV, as previously introduced, mization, Object Oriented, JavaScript is inefficient in its handling of object property types. The first contribution of this paper is the extension of BBV with Abstract typed object shapes (Section 3.1), object descriptors which encode type information about object properties. Type meta-information Typical JavaScript (JS) programs feature a large number of object associated with object properties then becomes available at prop- property accesses. Hence, fast property reads and writes are cru- erty reads. This allows eliminating run-time type tests dependent on cial for good performance. Unfortunately, many (often redundant) object property accesses. The target of method calls is also known dynamic checks are implied in each property access and the seman- in most cases. tic complexity of JS makes it difficult to optimize away these tests The second contribution of this paper is a further extension through program analysis. of BBV with shape propagation (Section 3.3), the propagation We introduce two techniques to effectively eliminate a large and specialization of code based on object shapes. -
Emscripten: an LLVM to Javascript Compiler
Emscripten: An LLVM to JavaScript Compiler Alon Zakai Mozilla What? Why? Compiling to JavaScript ● The web is everywhere – PCs to iPads – No plugins, no installation required – Built on standards ● The web runs JavaScript Existing Compilers to JavaScript ● Google Web Toolkit: Java (Gmail, etc.) ● CoffeeScript ● Pyjamas: Python ● SCM2JS: Scheme ● JSIL: .NET bytecode ● (and many more) ● But C and C++ are missing! Emscripten ● Enables compiling C and C++ into JavaScript ● Written in JavaScript ● Open source http://emscripten.org https://github.com/kripken/emscripten Demos! ● Bullet ● SQLite ● Python, Ruby, Lua ● Real-world code – Large, complex codebases ● Manual ports exist – Typically partial and not up to date The Big Picture C or C++ LLVM Bitcode JavaScript Low Level Virtual Machine (LLVM) ● A compiler project (cf. GCC) ● Intermediate Representation: LLVM bitcode – Very well documented – Great tools ● Much easier to compile LLVM bitcode than compile C or C++ directly! How? Code Comparison #include <stdio.h> int main() { printf(“hello, world!\n”); return 0; } Code Comparison @.str = private unnamed_addr constant [15 x i8] c"hello, world!\0A\00", align 1 define i32 @main() { entry: %retval = alloca i32, align 4 call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([15 x i8]* @.str, i32 0, i32 0)) store i32 0, i32* %retval ret i32 %retval } Code Comparison define i32 @main() { function _main() { entry: %retval = alloca i32, var _retval; align 4 call i32 (i8*, ...)* _printf (..); @printf (..) store i32 0, i32* _retval = 0; %retval ret -
Machine Learning in the Browser
Machine Learning in the Browser The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:38811507 Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http:// nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of- use#LAA Machine Learning in the Browser a thesis presented by Tomas Reimers to The Department of Computer Science in partial fulfillment of the requirements for the degree of Bachelor of Arts in the subject of Computer Science Harvard University Cambridge, Massachusetts March 2017 Contents 1 Introduction 3 1.1 Background . .3 1.2 Motivation . .4 1.2.1 Privacy . .4 1.2.2 Unavailable Server . .4 1.2.3 Simple, Self-Contained Demos . .5 1.3 Challenges . .5 1.3.1 Performance . .5 1.3.2 Poor Generality . .7 1.3.3 Manual Implementation in JavaScript . .7 2 The TensorFlow Architecture 7 2.1 TensorFlow's API . .7 2.2 TensorFlow's Implementation . .9 2.3 Portability . .9 3 Compiling TensorFlow into JavaScript 10 3.1 Motivation to Compile . 10 3.2 Background on Emscripten . 10 3.2.1 Build Process . 12 3.2.2 Dependencies . 12 3.2.3 Bitness Assumptions . 13 3.2.4 Concurrency Model . 13 3.3 Experiences . 14 4 Results 15 4.1 Benchmarks . 15 4.2 Library Size . 16 4.3 WebAssembly . 17 5 Developer Experience 17 5.1 Universal Graph Runner . -
Dynamically Loaded Classes As Shared Libraries: an Approach to Improving Virtual Machine Scalability
Dynamically Loaded Classes as Shared Libraries: an Approach to Improving Virtual Machine Scalability Bernard Wong Grzegorz Czajkowski Laurent Daynès University of Waterloo Sun Microsystems Laboratories 200 University Avenue W. 2600 Casey Avenue Waterloo, Ontario N2L 3G1, Canada Mountain View, CA 94043, USA [email protected] {grzegorz.czajkowski, laurent.daynes}@sun.com Abstract machines sold today have sufficient amount of memory to hold a few instances of the Java virtual machine (JVMTM) Sharing selected data structures among virtual [2]. However, on servers hosting many applications, it is machines of a safe language can improve resource common to have many instances of the JVM running utilization of each participating run-time system. The simultaneously, often for lengthy periods of time. challenge is to determine what to share and how to share Aggregate memory requirement can therefore be large, it in order to decrease start-up time and lower memory and meeting it by adding more memory is not always an footprint without compromising the robustness of the option. In these settings, overall memory footprint must be system. Furthermore, the engineering effort required to kept to a minimum. For personal computer users, the main implement the system must not be prohibitive. This paper irritation is due to long start-up time, as they expect good demonstrates an approach that addresses these concerns TM responsiveness from interactive applications, which in in the context of the Java virtual machine. Our system contrast is of little importance in enterprise computing. transforms packages into shared libraries containing C/C++ programs typically do not suffer from excessive classes in a format matching the internal representation memory footprint or lengthy start-up time. -
Proceedings of the Third Virtual Machine Research and Technology Symposium
USENIX Association Proceedings of the Third Virtual Machine Research and Technology Symposium San Jose, CA, USA May 6–7, 2004 © 2004 by The USENIX Association All Rights Reserved For more information about the USENIX Association: Phone: 1 510 528 8649 FAX: 1 510 548 5738 Email: [email protected] WWW: http://www.usenix.org Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein. MCI-Java: A Modified Java Virtual Machine Approach to Multiple Code Inheritance Maria Cutumisu, Calvin Chan, Paul Lu and Duane Szafron Department of Computing Science, University of Alberta {meric, calvinc, paullu, duane}@cs.ualberta.ca Abstract Java has multiple inheritance of interfaces, but only single inheritance of code via classes. This situation results in duplicated code in Java library classes and application code. We describe a generalization to the Java language syntax and the Java Virtual Machine (JVM) to support multiple inheritance of code, called MCI-Java. Our approach places multiply-inherited code in a new language construct called an implementation, which lies between an interface and a class in the inheritance hierarchy. MCI-Java does not support multiply-inherited data, which can cause modeling and performance problems. The MCI-Java extension is implemented by making minimal changes to the Java syntax, small changes to a compiler (IBM Jikes), and modest localized changes to a JVM (SUN JDK 1.2.2). The JVM changes result in no measurable performance overhead in real applications. -
7. Control Flow First?
Copyright (C) R.A. van Engelen, FSU Department of Computer Science, 2000-2004 Ordering Program Execution: What is Done 7. Control Flow First? Overview Categories for specifying ordering in programming languages: Expressions 1. Sequencing: the execution of statements and evaluation of Evaluation order expressions is usually in the order in which they appear in a Assignments program text Structured and unstructured flow constructs 2. Selection (or alternation): a run-time condition determines the Goto's choice among two or more statements or expressions Sequencing 3. Iteration: a statement is repeated a number of times or until a Selection run-time condition is met Iteration and iterators 4. Procedural abstraction: subroutines encapsulate collections of Recursion statements and subroutine calls can be treated as single Nondeterminacy statements 5. Recursion: subroutines which call themselves directly or indirectly to solve a problem, where the problem is typically defined in terms of simpler versions of itself 6. Concurrency: two or more program fragments executed in parallel, either on separate processors or interleaved on a single processor Note: Study Chapter 6 of the textbook except Section 7. Nondeterminacy: the execution order among alternative 6.6.2. constructs is deliberately left unspecified, indicating that any alternative will lead to a correct result Expression Syntax Expression Evaluation Ordering: Precedence An expression consists of and Associativity An atomic object, e.g. number or variable The use of infix, prefix, and postfix notation leads to ambiguity An operator applied to a collection of operands (or as to what is an operand of what arguments) which are expressions Fortran example: a+b*c**d**e/f Common syntactic forms for operators: The choice among alternative evaluation orders depends on Function call notation, e.g. -
A Survey of Hardware-Based Control Flow Integrity (CFI)
A survey of Hardware-based Control Flow Integrity (CFI) RUAN DE CLERCQ∗ and INGRID VERBAUWHEDE, KU Leuven Control Flow Integrity (CFI) is a computer security technique that detects runtime attacks by monitoring a program’s branching behavior. This work presents a detailed analysis of the security policies enforced by 21 recent hardware-based CFI architectures. The goal is to evaluate the security, limitations, hardware cost, performance, and practicality of using these policies. We show that many architectures are not suitable for widespread adoption, since they have practical issues, such as relying on accurate control flow model (which is difficult to obtain) or they implement policies which provide only limited security. CCS Concepts: • Security and privacy → Hardware-based security protocols; Information flow control; • General and reference → Surveys and overviews; Additional Key Words and Phrases: control-flow integrity, control-flow hijacking, return oriented programming, shadow stack ACM Reference format: Ruan de Clercq and Ingrid Verbauwhede. YYYY. A survey of Hardware-based Control Flow Integrity (CFI). ACM Comput. Surv. V, N, Article A (January YYYY), 27 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn 1 INTRODUCTION Today, a lot of software is written in memory unsafe languages, such as C and C++, which introduces memory corruption bugs. This makes software vulnerable to attack, since attackers exploit these bugs to make the software misbehave. Modern Operating Systems (OSs) and microprocessors are equipped with security mechanisms to protect against some classes of attacks. However, these mechanisms cannot defend against all attack classes. In particular, Code Reuse Attacks (CRAs), which re-uses pre-existing software for malicious purposes, is an important threat that is difficult to protect against. -
Control-Flow Analysis of Functional Programs
Control-flow analysis of functional programs JAN MIDTGAARD Department of Computer Science, Aarhus University We present a survey of control-flow analysis of functional programs, which has been the subject of extensive investigation throughout the past 30 years. Analyses of the control flow of functional programs have been formulated in multiple settings and have led to many different approximations, starting with the seminal works of Jones, Shivers, and Sestoft. In this paper, we survey control-flow analysis of functional programs by structuring the multitude of formulations and approximations and comparing them. Categories and Subject Descriptors: D.3.2 [Programming Languages]: Language Classifica- tions—Applicative languages; F.3.1 [Logics and Meanings of Programs]: Specifying and Ver- ifying and Reasoning about Programs General Terms: Languages, Theory, Verification Additional Key Words and Phrases: Control-flow analysis, higher-order functions 1. INTRODUCTION Since the introduction of high-level languages and compilers, much work has been devoted to approximating, at compile time, which values the variables of a given program may denote at run time. The problem has been named data-flow analysis or just flow analysis. In a language without higher-order functions, the operator of a function call is apparent from the text of the program: it is a lexically visible identifier and therefore the called function is available at compile time. One can thus base an analysis for such a language on the textual structure of the program, since it determines the exact control flow of the program, e.g., as a flow chart. On the other hand, in a language with higher-order functions, the operator of a function call may not be apparent from the text of the program: it can be the result of a computation and therefore the called function may not be available until run time. -
Onclick Event-Handler
App Dev Stefano Balietti Center for European Social Science Research at Mannheim University (MZES) Alfred-Weber Institute of Economics at Heidelberg University @balietti | stefanobalietti.com | @nodegameorg | nodegame.org Building Digital Skills: 5-14 May 2021, University of Luzern Goals of the Seminar: 1. Writing and understanding asynchronous code: event- listeners, remote functions invocation. 2. Basic front-end development: HTML, JavaScript, CSS, debugging front-end code. 3. Introduction to front-end frameworks: jQuery and Bootstrap 4. Introduction to back-end development: NodeJS Express server, RESTful API, Heroku cloud. Outputs of the Seminar: 1. Web app: in NodeJS/Express. 2. Chrome extensions: architecture and examples. 3. Behavioral experiment/survey: nodeGame framework. 4. Mobile development: hybrid apps with Apache Cordova, intro to Ionic Framework, progressive apps (PWA). Your Instructor: Stefano Balietti http://stefanobalietti.com Currently • Fellow in Sociology Mannheim Center for European Social Research (MZES) • Postdoc at the Alfred Weber Institute of Economics at Heidelberg University Previously o Microsoft Research - Computational Social Science New York City o Postdoc Network Science Institute, Northeastern University o Fellow IQSS, Harvard University o PhD, Postdoc, Computational Social Science, ETH Zurich My Methodology Interface of computer science, sociology, and economics Agent- Social Network Based Analysis Models Machine Learning for Optimal Experimental Experimental Methods Design Building Platforms Patterns