Static Optimization in PHP 7

Popov, N.; Cosenza, B.; Juurlink, B.; Stogov, D.
Conference paper | Accepted manuscript (Postprint)
This version is available at https://doi.org/10.14279/depositonce-7080

Popov, N., Cosenza, B., Juurlink, B., & Stogov, D. (2017). Static optimization in PHP 7. In Proceedings of the 26th International Conference on Compiler Construction - CC 2017. ACM Press. https://doi.org/10.1145/3033019.3033026

Terms of Use
© Authors | ACM, 2017. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of the 26th International Conference on Compiler Construction - CC 2017, http://dx.doi.org/10.1145/3033019.3033026.

Nikita Popov, Biagio Cosenza, Ben Juurlink
Technische Universität Berlin, Germany
[email protected], {cosenza, b.juurlink}@tu-berlin.de

Dmitry Stogov
Zend Technologies, Russia
[email protected]

Abstract

PHP is a dynamically typed programming language commonly used for the server-side implementation of web applications. Approachability and ease of deployment have made PHP one of the most widely used scripting languages for the web, powering important web applications such as WordPress, Wikipedia, and Facebook. PHP’s highly dynamic nature, while providing useful language features, also makes it hard to optimize statically.

This paper reports on the implementation of purely static bytecode optimizations for PHP 7, the last major version of PHP. We discuss the challenge of integrating classical compiler optimizations, which have been developed in the context of statically typed languages, into a programming language that is dynamically and weakly typed, and supports a plethora of dynamic language features. Based on a careful analysis of language semantics, we adapt static single assignment (SSA) form for use in PHP. Combined with type inference, this allows type-based specialization of instructions, as well as the application of various classical SSA-enabled compiler optimizations such as constant propagation or dead code elimination.

We evaluate the impact of the proposed static optimizations on a wide collection of programs, including micro-benchmarks, libraries and web frameworks. Despite the dynamic nature of PHP, our approach achieves an average speedup of 50% on micro-benchmarks, 13% on computationally intensive libraries, as well as 1.1% (MediaWiki) and 3.5% (WordPress) on web applications.

Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors—Compilers; D.3.4 [Programming Languages]: Processors—Optimization

Keywords: PHP, static optimization, SSA form

1. Introduction

In order to keep pace with the rapidly increasing growth of the Web, web application development predominantly favors the use of scripting languages, whose increased productivity due to dynamic typing and an interactive development workflow is valued over the better performance of compiled languages.

PHP is one of the most popular [1] scripting languages used for the server-side implementation of web applications. It powers some of the largest websites such as Facebook, Wikipedia and Yahoo, but also countless small websites like personal blogs.

In order to support its more dynamic features, PHP, like many other scripting languages, has traditionally been implemented using an interpreter. While this provides a relatively simple and portable implementation, interpretation is notoriously slower than the execution of native code. For this reason, an increasingly common avenue to improving the performance of dynamic languages is the implementation of just-in-time (JIT) compilers [2], such as the HHVM compiler for PHP [3]. On the other hand, JIT compilers carry a large cost in terms of implementation complexity.

In this work, we pursue a different approach: purely static, transparent, bytecode-level optimization. By this we mean that a) runtime feedback is not used in any form, b) no modification to the virtual machine or other runtime components is required and c) optimizations occur on the bytecode of the reference PHP implementation. The latter point implies that, unlike many alternative PHP implementations, we must support the full scope of the language, including little-used and hard-to-optimize features.

This static approach is motivated by the PHP execution model, which uses multiple processes to serve short-running requests based on a common shared memory bytecode cache. As this makes runtime bytecode updates problematic, many dynamic optimization methods become inapplicable or less efficient. We pursue interpretative optimizations partly due to the success of PHP 7, whose optimized interpreter implementation performs within 20% of the HHVM JIT compiler for many typical web applications [4].

Our optimization infrastructure is based on static single assignment (SSA) form [5] and makes use of type inference, both to enable type-based instruction specialization and to support a range of classical SSA-based optimizations. Because PHP is dynamically typed and supports many dynamic language features such as scope introspection, the application of classical data-flow optimizations, which have been developed in the context of statically typed languages, is challenging. This requires a careful analysis of problematic language semantics and some adaptations to SSA form and the optimization algorithms used.

Parts of the described optimization infrastructure will be part of PHP 7.1. Our main contributions are:

1. A new approach to introducing SSA form into the PHP language, including adaptation for special assignment semantics and enhancement of type inference using π-nodes.
2. The implementation and analysis of a wide range of SSA-enabled optimizations for a dynamic language.
3. An experimental evaluation on a collection of micro-benchmarks, libraries and applications, including WordPress and MediaWiki.

The remainder of the paper is structured as follows: section 2 describes related work on dynamic language optimization. Section 3 presents relevant PHP language semantics and section 4 discusses the use of SSA form in PHP. SSA-enabled static optimizations investigated in this work are described in section 5. An experimental evaluation on micro-benchmarks, libraries and applications is presented in section 6. The paper closes with a discussion and conclusion in sections 7 and 8.

2. Related Work

SSA. Static single assignment form [5] has become the preferred intermediate representation for program analysis and optimizing code transformations, and is used by many modern optimizing compilers [6–8]. Data-flow algorithms are often simpler to implement, more precise and more performant when implemented on SSA form. Typical examples include sparse conditional constant propagation [9] and global value numbering [10]. More recently, SSA form has become of interest for compiler backends as well, because the chordality of the SSA interference graph simplifies register allocation [11].

Specific applications often require or benefit from extensions of the basic SSA paradigm. Array SSA form [12] modifies SSA to capture precise element-level data-flow information for arrays for use in parallelization. Hashed SSA form [13] extends SSA to handle aliasing, by introducing additional µ (may-use) and χ (may-define) nodes. The ABCD algorithm [14] introduces π-nodes to improve the accuracy of value range inference. In this work, we further extend this idea for use in type inference.

A focus of recent research has been on the formal verification of SSA-based optimizations [15–17], as well as SSA construction [18] and destruction [19].

Dynamic [...]

[...] representation is used, instead all operations are performed on the AST level. HPHPc does not support some of PHP’s dynamic language features and requires all code to be known in advance.

The phc compiler [30] also translates PHP to C. A large focus of the phc implementation is on accurately modeling the aliasing behavior of references. To achieve this, flow- and context-sensitive alias analysis, type inference and constant propagation are performed simultaneously and prior to construction of Hashed SSA form. In our work we will largely ignore this aspect, because accurate handling of references has become much less important after PHP 5.4 removed support for call-time pass-by-reference. Additionally, issues that will be discussed in section 3.5 effectively prevent this kind of analysis if PHP’s error handling model is fully supported.

A number of alternative PHP implementations leverage existing JIT implementations. Phalanger [31] and its successor Peachpie [32] target the .NET CLR, while Quercus [33] and JPHP [34] target the JVM. HippyVM [35] uses the RPython toolchain. While many of these projects report improvements over PHP 5, they cannot achieve the same level of performance as a special-purpose JIT compiler such as HHVM.

3. Optimization Constraints

PHP supports a number of language features that complicate static analysis. In the following, we discuss how they affect optimization and also justify why we consider certain optimization approaches to be presently impractical. Some of the mentioned issues apply to [...]
