DIPLOMARBEIT

Caching Strategies for Load Reduction on High Traffic Web Applications

ausgeführt am Institut für Computersprachen der Technischen Universität Wien

unter Anleitung von Ao.Univ.Prof. Dipl.-Ing. Dr. Franz Puntigam

durch Alexander Kirk Stolberggasse 12/12, 1050 Wien

May 9, 2005

Abstract

In this thesis we discuss the problem of web applications that have to work under the heavy load of a high number of visitors. We evaluate the application Bandnews.org as an example and tune it using various caching strategies. These include caching by a proxy server, a compiler cache, database caching using a query cache, and application-based caching using Smarty.

This work shows that a gain in speed is possible if the methods are applied carefully. We compare and combine caching strategies to reach a stage where every page is generated in reasonable time even under high load.

Kurzfassung

In dieser Diplomarbeit wird das Problem von Web-Applikationen behandelt, die unter hoher Last und mit einer großen Zahl von Benutzern arbeiten müssen. Die Applikation Bandnews.org wird als Beispiel untersucht und mittels verschiedener Caching-Strategien beschleunigt. Dies beinhaltet das Cachen mittels eines Proxy-Servers, eines Compiler-Caches, Datenbank-Caching mittels Query Cache und applikationsbasiertes Caching mittels Smarty.

Diese Arbeit zeigt, dass Geschwindigkeitssteigerungen möglich sind, wenn die Methoden umsichtig eingesetzt werden. Die Caching-Strategien werden miteinander verglichen und kombiniert, um eine Stufe zu erreichen, in der jede Seite auch unter hoher Last in vertretbarer Zeit geladen wird.

Contents

1 Introduction
  1.1 Motivation
  1.2 Method
  1.3 Expected Results
  1.4 Outline of the Thesis

2 Terms
  2.1 Caching
    2.1.1 Invalidation
    2.1.2 Privacy
  2.2 Load
    2.2.1 Using the uptime command
    2.2.2 Using the top command
    2.2.3 Load Averages

I Environment

3 Application
  3.1 Bandnews.org
    3.1.1 Technology
    3.1.2 Page Structure
    3.1.3 myBandnews
    3.1.4 BandnewsCMS

4 Tools
  4.1 Apache
    4.1.1 History
    4.1.2 Features
    4.1.3 Alternatives
  4.2 PHP
    4.2.1 History
    4.2.2 Language Basics and Structure
    4.2.3 Integration with the web server
    4.2.4 Additional Libraries
    4.2.5 Alternatives
  4.3 MySQL
    4.3.1 PEAR::DB
    4.3.2 Query Cache
    4.3.3 Alternatives
  4.4 Smarty
    4.4.1 Template Basics
    4.4.2 Alternatives
  4.5 Squid
    4.5.1 Use cases
    4.5.2 HTTP Acceleration
    4.5.3 Alternatives
  4.6 Advanced PHP Cache
    4.6.1 Concept
    4.6.2 Alternatives
  4.7 Advanced PHP Debugger
    4.7.1 Debugging
    4.7.2 Profiling
    4.7.3 Alternatives
  4.8 ApacheBench ab
    4.8.1 Alternatives

II Tuning the Application

5 Evaluation
  5.1 Goal definition
  5.2 Processing a Request
  5.3 Possible Hooking Points
    5.3.1 Client Request
    5.3.2 PHP Module
    5.3.3 Database
    5.3.4 Application
  5.4 Bandnews.org
    5.4.1 Skeleton page
    5.4.2 Index page index.php
    5.4.3 Search page search.php
    5.4.4 Links page links.php
  5.5 Testing
    5.5.1 Preparations
    5.5.2 Testing environment

6 Squid
  6.1 Considerations
    6.1.1 Caching of whole pages
    6.1.2 Programmer's view
    6.1.3 Expected Results
  6.2 Preparation
    6.2.1 Configuring Apache
    6.2.2 Configuring Squid
  6.3 Results
    6.3.1 skeleton-t.php
    6.3.2 pres-skel-t.php
    6.3.3 index.php
  6.4 Conclusions for Squid

7 APC
  7.1 Considerations
    7.1.1 Compiler Cache
    7.1.2 Code Optimization
    7.1.3 Outputting Data
    7.1.4 Programmer's View
  7.2 Preparation
    7.2.1 Output Buffering
  7.3 Results
    7.3.1 Results for output testing
  7.4 Conclusions for APC

8 MySQL
  8.1 Considerations
    8.1.1 MySQL Query Cache
    8.1.2 Persistent Connections
    8.1.3 Query Tuning
  8.2 Preparation
  8.3 Results
    8.3.1 Query Cache
    8.3.2 Persistent Connection
  8.4 Conclusions for MySQL

9 Smarty Caching
  9.1 Considerations
    9.1.1 Caching Page Parts
    9.1.2 Database Usage
  9.2 Preparation
  9.3 Results
  9.4 Conclusions for Smarty Caching

10 Conclusions
  10.1 Further Work

A File Sources
  A.1 Benchmark Script
  A.2 Patch Files

B List of Figures

C List of Tables

D List of Listings

References

Chapter 1

Introduction

1.1 Motivation

As the Internet, and with it the World Wide Web (WWW), gains more and more popularity, servers have to handle correspondingly more requests. The more people (or simply clients) request resources (in this case files) from web servers, the faster the servers have to accept and process these requests. To cope with these requirements, programmers as well as system administrators must take countermeasures.

From the very beginning of the WWW, the requirements for servers have changed not only in terms of traffic but also in the type of content they deliver to the client. Initially, static pages had to be served; today – in 2005 – content is usually taken from a database, and dynamically generated pages are transferred.

This development takes the main source of load away from the operating system, which is responsible for reading the files from the hard disk or another type of memory, and shifts it to the program that dynamically generates the page.

Computer hardware has also evolved, which makes it possible to have web pages generated the way they are today. Generally speaking, servers are capable of serving most pages in quite a reasonable amount of time. This is true as long as only a small number of visitors request pages to be generated. The larger the number of clients, the more pages have to be generated simultaneously. Multi-tasking enables servers to do so, but CPU capacity is limited.

If it were up to system administrators alone, they would simply add more hardware power (for instance by clustering servers or load balancing). Often this can be done only to a certain extent, mainly for financial but also for logistical reasons. From a programmer's view, however, algorithms can be optimized (consider an algorithm in O(n^2) on a fast computer, which can easily be overtaken by a slower machine running an O(n) algorithm), and caching techniques can be applied.

The basis for this diploma thesis will be the analysis of caching strategies for this scenario. They will be used to speed up an existing application. The combination of various methods will be tested and benchmarked to reach a stage at which the application runs at reasonable speed even under high load.

1.2 Method

We will explore the topic of this thesis using an existing web site (Bandnews.org) as an example to which the caching strategies are applied.

The site consists of an underlying structure which is common to each page. Therefore the examination is not restricted solely to standard pages; a skeleton page is taken into account as well. To compare the pages, we measure the time for delivery on a single system (i.e. on an Intel PC, see 5.5.2). Due to the nature of different computer systems, these results are only valid in a relative way. The method still produces significant results because the differences between versions are at a similar level on faster or slower systems.

We simulate high load on the page using a load generator which effectively makes the server deliver pages simultaneously.

We examine single pages using a profiler – a tool that measures not only the overall performance of page generation but also the time consumed by single function calls.

1.3 Expected Results

As a result of this work we expect a web application that delivers pages multiple times faster than the uncached version of the site (considering repetitive calls, so that caching is taken into account).

Since the methods also reveal bottlenecks within the application, faster delivery is expected for the first call of a web page as well. This, however, is only a side effect; the thesis concentrates on caching pages or parts of pages.

1.4 Outline of the Thesis

This thesis is organized as follows:

In the first part we present the application as well as the tools used. As the application, the web site Bandnews.org (see Section 3.1) was chosen. The tools used are the Apache HTTP Server (Section 4.1), PHP (Section 4.2), MySQL (Section 4.3), Smarty (Section 4.4), Squid (Section 4.5), APC (Section 4.6), APD (Section 4.7), and ab (Section 4.8).

The second part, the central part of this diploma thesis, describes and evaluates the caching strategies to be applied.

In Section 5 we test the original site and choose pages for later evaluation.

The following sections deal with each technique in detail and provide benchmarking results which are analyzed and discussed. These sections cover Squid (Section 6), APC (Section 7), MySQL (Section 8), and Smarty Caching (Section 9).

In the conclusion (Section 10) we review the results as a whole. Section 10.1 gives an outlook on how future work can further improve the performance.

The appendix includes source listings and the lists of figures, tables, and listings.

Chapter 2

Terms

In this section we explain important terms used throughout the thesis.

2.1 Caching

Caching (noun: cache, from the French word cacher – to hide) is the temporary storage of data for later retrieval. This approach requires a certain persistence of the data to be stored. The motivation for caches is a gain in speed while trying not to deliver outdated content. The gain in speed manifests itself in three points (compare [Wes01]):

• Reduced system load: the retrieval or generation of content is avoided; instead, a copy from fast memory is delivered.

• Reduced latency: the decision whether to take a copy from the cache or have the data delivered from the original source can often be made very quickly. In combination with fast media, a very short response time can be achieved.

• Less bandwidth consumption: this applies primarily to hardware caches and web client proxies: the data does not need to be retrieved over a slow connection but uses a faster one (either because of physical distance or because of higher capacity). Storing the cached copy compressed can also reduce bandwidth consumption.


Quick retrieval requires fast memory. That is why hardware caches often use small and expensive – but fast – memory.

Generally speaking, a cache should never be visible to the user; it should be a transparent means of speeding up the delivery of data.
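The idea of a transparent cache can be sketched in a few lines of PHP, the language used throughout this thesis. The helper below is purely illustrative and not part of Bandnews.org; it assumes PHP 5 (for file_put_contents), and all names are our own. The caller receives the data without ever knowing whether it was regenerated or served from the cache:

```php
<?php
// Hypothetical transparent cache wrapper: the caller only names the
// data and a generator callback; hits and misses are invisible to it.
function cached_fetch($key, $ttl, $generator, $cache_dir = '/tmp/cache')
{
    if (!is_dir($cache_dir)) {
        mkdir($cache_dir, 0777, true);
    }
    $file = $cache_dir . '/' . md5($key);

    // Fast path: a sufficiently fresh copy exists in the cache.
    if (file_exists($file) && time() - filemtime($file) < $ttl) {
        return unserialize(file_get_contents($file));
    }

    // Slow path: generate the data and store a copy for later requests.
    $data = call_user_func($generator);
    file_put_contents($file, serialize($data));
    return $data;
}
```

A page script could then call cached_fetch('topten', 300, 'build_topten') instead of querying the database directly, with no visible difference for the rest of the code.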

2.1.1 Invalidation

A cache needs to provide means of invalidating its contents, or parts of them, in order to avoid delivering outdated data. There are two main concepts for invalidation:

• Invalidation by command: a copy remains in the cache until the cache is explicitly told to dismiss it or – which is more likely – to replace it with fresh data.

• Invalidation by rule: a copy is deleted from the cache by applying a rule, such as an expiry date or a certain number of retrievals.

Both strategies have their advantages; choosing the right one depends on the circumstances. Caches with command-based invalidation need very little logic but are susceptible to delivering invalid data, for example if an invalidation command gets lost for some reason.

The second approach requires more intelligence in the cache but minimizes the delivery of old data. While it is difficult to define a lifetime for a certain cached object, it is quite easy to implement a last-modification check which compares the version available in the cache to the “real” one. This can be done every few times the resource is requested or – if the check is inexpensive – upon each retrieval.
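Such a last-modification check can be sketched in PHP as follows (a hypothetical helper, assuming both the cached copy and its source are plain files):

```php
<?php
// Hypothetical last-modification check: the cached copy is considered
// fresh as long as it is at least as new as the "real" source file.
function cache_is_fresh($cache_file, $source_file)
{
    return file_exists($cache_file)
        && file_exists($source_file)
        && filemtime($cache_file) >= filemtime($source_file);
}
```

If the check returns false, the cache regenerates the copy; otherwise the copy is served directly.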

2.1.2 Privacy

A delicate field for caching is the privacy of data. The contents to be stored can often include sensitive data which has to remain uncached. Although this issue should be avoided by using encryption, it is – especially in the WWW – quite common to transmit user-specific data through an insecure, plain-text channel.

Caches should therefore either be aware of a kind of “private flag” or support the tagging of content – storing extra information that marks data as belonging to a certain user. Often this can be implemented in combination with an authenticating (e.g. by user passwords) proxying system.

2.2 Load

The topic of this diploma thesis includes the technical term “load”. Although most IT professionals know what load is, not many would be able to define it clearly.

A common answer when asking for a definition of “load” is “the degree of occupation of the CPU”. In the Windows operating system the system load is indeed displayed (e.g. in the “Task Manager”) as a percentage between 0% and 100%.

2.2.1 Using the uptime command

In Unix, the load is usually given as three values that can be displayed, e.g., by using the uptime command.

Listing 2.1: Output of the uptime command

    alex@notebook:~$ uptime
     12:32:52 up  2:23,  3 users,  load average: 1.16, 1.13, 1.09

According to the uptime(1) man page, these are “the system load averages for the past 1, 5, and 15 minutes.” So these values do not represent the current load of the system but mean values over three periods of time.
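These three values can also be read programmatically. As a small illustration in PHP (the function name and parsing approach are our own, not a standard API), one can extract them from a line like the one in Listing 2.1:

```php
<?php
// Hypothetical helper: extract the 1-, 5- and 15-minute load averages
// from a line of uptime output.
function parse_load_averages($uptime_line)
{
    if (preg_match('/load averages?: ([\d.]+), ([\d.]+), ([\d.]+)/',
                   $uptime_line, $m)) {
        return array((float)$m[1], (float)$m[2], (float)$m[3]);
    }
    return null;   // line did not contain load averages
}
```

On Linux the same numbers are also exposed in /proc/loadavg, so a monitoring script does not need to invoke uptime at all.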

2.2.2 Using the top command

The current load of the system can be shown using the top command (see Listing 2.2). It also includes the load averages, but additionally a percentage breakdown of CPU load is displayed.

Listing 2.2: A part of the output of the top command

    alex@notebook:~$ top
    top - 12:32:52 up  2:23,  3 users,  load average: 1.16, 1.13, 1.09
    Tasks:  84 total,   1 running,  83 sleeping,   0 stopped,  0 zombie
    Cpu(s):  5.0% us,  0.3% sy,  0.0% ni, 94.7% id,  0.0% wa,  0.0% hi,  0.0% si

The current occupation of the CPU is split into seven parts to give a more precise overview. The abbreviations have the following meanings:

us is short for user and describes the amount of time the CPU spends in user mode (a kind of safe mode for user programs, see [Arc03]).

sy is short for system – the amount of time the CPU spends in kernel mode (e.g. for operating with hardware).

id is short for idle – on desktop computers CPUs spend most of their time doing nothing.

ni is short for nice, specifying the time being used for processes with lower priority (e.g. started using the command nice).

wa is short for I/O wait – the amount of time the system is waiting for an I/O device such as a hard disk.

si and hi are short for soft and hardware interrupt: the time the processor spends dealing with such signals.

2.2.3 Load Averages

Although the top command gives a quite intuitive overview of the current system load, load averages are highly important for diagnosing the “working” load of a system. The current state is not very informative when the system does not respond due to a highly CPU-intensive process; even afterwards, the load averages can still be helpful.

Interestingly, there does not seem to be a single, authoritative definition of how the load average is calculated and how it shall be interpreted. According to [Gun03] the load averages are constructed using the CPU run queue and the number of jobs currently running on the CPU. The article describes the calculation in more detail and even provides a second part going further into detail.

Part I

Environment


Chapter 3

Application

In this section we describe the application chosen for this diploma thesis, including a list of the important criteria that led to choosing it. These criteria do not apply solely to this web site, so many of the methods described will have similar effects on other pages.

3.1 Bandnews.org

The application chosen for this diploma thesis is Bandnews.org (http://www.bandnews.org/). It is a portal and search engine for the latest band-related news. News items are taken directly from the official source, i.e. the bands' websites. Pieces of information are extracted automatically and repeatedly throughout the day to present the most recent news first. The news is collected by a spider1 and then made available on a single page.

This web site is very suitable for the thesis, as the following points apply:

• It is continuously affected by changes: band sites are checked for news regularly, which results in about 3 new items per hour. It is even planned to aggregate news every five minutes. Furthermore, each news item can appear on several pages: band page, genre pages, search pages.

1A program that automatically “crawls” web sites, i.e. downloads a page and then moves on to the next one by following the links included in the first page.


Figure 3.1: Screenshot of Bandnews.org

• It retrieves its data from a database: at the time of writing – April 2005 – about 34,000 news items are stored in the database. This does not slow down retrieving data in general, but joining against this table becomes expensive. Also, full-text queries are required for searching.

• It delivers customized pages for each registered user: this makes caching extremely tricky, as a page delivered to one user cannot be reused for another.

These points make the site require “non-standard” caching techniques. A single page is rarely delivered twice, which makes a simple caching solution nearly useless.

3.1.1 Technology

The application was built in a so-called LAMP environment, an abbreviation for Linux, Apache (see Section 4.1), MySQL (see Section 4.3), and PHP (see Section 4.2).

News aggregation is done via a robot program that fetches news by downloading the news pages (specified as a “feed”) from the bands' websites (which is done with the tool curl). Because this diploma thesis concentrates on caching strategies for the main site, the retrieval process is not described in detail.

Bandnews.org is a project by Nader Cserny and the author, Alexander Kirk. The design, public relations, texts, etc. are handled by Nader; the author is responsible for programming. The page was built from scratch starting in September 2004; no existing software was integrated for the main site.

3.1.2 Page Structure

Figure 3.1 shows a screenshot of the main page (http://www.bandnews.org/). The page consists of single news items (an arbitrary number of 6 news items per page was chosen); this is the main content of the site. At the top of the page there is a language selector (German and English are provided) and a meta-navigation which stays the same on each page.

To the left there is the primary navigation: the search form, the band selector (a two-step drop-down for selecting a band), and a listing of genres (which highlights the current genre and shows sub-genres if there are any). A news language selector (which controls the language of the news items to be displayed), a list of recently added bands, and statistics can also be found on the left.

The right bar (internally called “side bar”) is part of the myBandnews navigation. Site news (e.g. artist of the week, new features) are also displayed there, as well as a Top Ten list which shows the bands clicked most often.

3.1.3 myBandnews

myBandnews (http://my.bandnews.org/) adds to the complexity of the site as it leads to different pages for each user. When logged in to myBandnews, the user can select his/her favorite bands (see Figure 3.2, page 22), of which his/her personal news page will consist.

The news is also available as an RSS (Rich Site Summary, nowadays often called Really Simple Syndication) feed, which can be used to receive news alerts quickly – there are many third-party client tools that check RSS feeds periodically for news and alert the user in such a case.

Figure 3.2: Screenshot of myBandnews while selecting personal bands

3.1.4 BandnewsCMS

Another interesting feature is the integrated CMS (Content Management System), which provides bands with an easy tool for creating, modifying, and deleting their news. These news items are stored in the Bandnews.org database and are integrated into the bands' homepages by using an iframe and the bands' own style sheets. This is especially useful for bands who have only a little budget for their web page and cannot afford their own CMS, but still want all members to be able to post news on their page.

One of the bands using this feature is “When the Music's over” (http://www.whenthemusicsover.com/ and http://bandnews.org/homepage/When+The+Music's+Over/).

Chapter 4

Tools

In this section we describe the tools which will be used in this thesis. The sequence does not reflect their later use; it was chosen for better understanding. If not stated otherwise, the tools are Open Source and licensed under the GNU General Public License (GPL).

4.1 Apache

In this thesis the Apache HTTP Server is used as the web server. According to [Net05] it is the web server software used on most hosts today.

4.1.1 History

The development of Apache started in April 1995 as an evolution of the public domain HTTP daemon developed by Rob McCool at the National Center for Supercomputing Applications, University of Illinois, Urbana-Champaign.

Originally, Apache was a collection of patches for the NCSA daemon, with not too many of those patches developed by the newly founded Apache Group itself. Soon it became evident, though, that the code base lacked extensibility, and so the server was developed from scratch with a new design. Apache 1.0 was released on December 1, 1995.

As early as April 1996, the Apache web server moved to first place in web server popularity.


The versions of Apache used today are 1.3 and 2.0. While 1.3 was an evolution of the first version, extended with various modules, version 2.0 was once again a new design intended to match the requirements of today's web servers in the World Wide Web. Among these features are a better integration of the POSIX thread system and native IPv6 support.

Apache runs on several platforms, including Unix-based systems and Windows NT. Still, when referring to an Apache web server, one commonly refers to a Unix or, even more often, a Linux system. The term LAMP (Linux Apache MySQL PHP – the environment used in this thesis) was coined to represent a very common configuration.

The Apache Group has meanwhile taken on several other projects that are very important in the field of Open Source software. Among the most important are the Jakarta Tomcat server (for Java-based web applications), Ant (a build system not only for Java projects), and Struts (a framework for Java web applications).

There is also an Apache License (which also applies to the HTTP server) that is primarily based on the BSD license. See [Hub04].

Today the market share of the Apache web server is very close to 70%.

4.1.2 Features

Basically the Apache web server is designed to serve data via the HTTP protocol (versions 1.0 and 1.1). Its functionality can be enhanced by a great variety of modules.

Apache HTTP Server 1.3 implements a so-called pre-forking model. The term forking describes the creation of child processes by a parent process which controls those sub-processes. The web server generates a certain number of child processes without immediate need for them. When several requests arrive, though, no time needs to be spent forking a new process; the existing processes can be used. If there are not enough child processes, more can be generated.
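In the Apache 1.3 configuration this behavior is controlled by a handful of directives; the following sketch uses made-up (but typical) values – the right numbers depend entirely on the available RAM and the expected traffic:

```
# Pre-forking parameters (Apache 1.3); values are illustrative only.
StartServers         5      # children created at startup
MinSpareServers      5      # fork more if fewer idle children remain
MaxSpareServers     10      # kill children if more than this are idle
MaxClients         150      # hard limit on simultaneous children
MaxRequestsPerChild 1000    # recycle a child after this many requests
```

MaxClients in particular caps memory consumption: each child is a full process, so the limit should be chosen so that all children together fit into physical RAM.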

This pre-forking model can be seen as a replacement for threading. Version 2.0 of the Apache web server implements threads, which can increase speed in many scenarios. This is because current operating systems heavily support threads as an alternative to forking; threads can be distributed to different CPUs in multiprocessor (SMP) systems.

Third-party modules are also supported, which leads to a large variety of new capabilities for the web server. For example, PHP (see 4.2) is commonly integrated as a module, allowing higher performance than CGI.

Other important modules are all sorts of authentication modules (via LDAP, MySQL, DBM, etc.), (highly configurable) logging modules, and rewriting modules (which modify the request URI before processing).

4.1.3 Alternatives

For a Linux system there are very few alternatives. The greatest rival according to [Net05] is Microsoft's IIS, which is only available for the desktop monopoly operating system Windows (NT).

The only real alternative to Apache under Linux is the Zeus Web Server (developed by Zeus Technology Ltd. and receiving some coverage in [Mid02]), which claims to be the fastest web server available. As it is not available free of charge (and is far from Open Source), it was not taken into greater consideration.

4.2 PHP

The web application is implemented in the programming language PHP, which stands for “PHP: Hypertext Preprocessor”. It is nowadays quite a commonly used language for creating dynamic web pages.

4.2.1 History

Development of PHP was started by Rasmus Lerdorf in 1995. At that time it was called “Personal Home Page Tools”, a set of Perl programs that did some tracking of accesses to his homepage. Later (1997) he re-coded it in C to provide more features (e.g. easy access to databases) and called it PHP/FI (“Personal Home Page / Forms Interpreter”).

PHP 3 (released in June 1998) was the first version similar to the PHP most web sites use today. It was highly extensible and provided a solid infrastructure for many databases and protocols. By the end of 1998, hundreds of thousands of web servers reported to have PHP installed, which was approximately 10% of the WWW's servers.

The breakthrough for PHP came with version 4 (released in 1999, using the “Zend Engine”, named after its authors Zeev Suraski and Andi Gutmans). Many more web servers were supported, as well as new features for programmers such as HTTP session support, output buffering, and security-enhanced methods for receiving user data (“magic quotes”). Several million sites report today that they use PHP 4 (about 20% of the WWW's servers).

One of the biggest drawbacks of PHP was fixed with version 5, released in 2004. It provides full-featured OOP, as earlier versions only had rudimentary support for it: e.g. inheritance was supported, but no encapsulation.

4.2.2 Language Basics and Structure

PHP is a language specialized in delivering web pages. PHP code is therefore simply embedded in (existing) HTML pages. A simple “Hello World” script could look like the one shown in Listing 4.1.

Listing 4.1: Hello World in PHP – helloworld.php

    <h1>Hello World Example</h1>
    <?php echo "Hello World!"; ?>

The code is declared as PHP by using special tags (<?php and ?>; similar to ASP's <% and %>). The text between the opening and closing tags is compiled and interpreted; the HTML code is sent untouched to the client.

PHP is a scripting language, which means there is no need for the programmer to compile a script before it can be executed. This enables quick prototyping, which matches the requirements of web applications: modifications have to be integrated quickly.

In PHP, variables (indicated by a dollar sign, e.g. $variable) do not have to be declared before they can be used. From the programmer's view they are type-free1, so the same variable can be used for calculations (e.g. as an integer) and for output (as a string) without further care.

Reflecting the web-specialized character of the language, a few variables make the processing of web pages much easier. The $_GET and $_POST variables automatically contain (in the form of an array) the values received from a URL or HTML form, depending on the HTTP method by which the data was sent to the server.

Given the URL http://localhost/test.php?hello=world, the $_GET variable would be a one-element array consisting of the key/value pair [hello] => "world".
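As a small illustration (the script and function names are hypothetical, following the example URL above), test.php could read the parameter like this:

```php
<?php
// Hypothetical test.php: read the "hello" parameter from $_GET,
// falling back to a default when it is missing, and escape the output.
function greet_from_query($get)
{
    $who = isset($get['hello']) ? $get['hello'] : 'nobody';
    return 'Hello, ' . htmlspecialchars($who);
}

echo greet_from_query($_GET);
```

The htmlspecialchars call is good practice whenever request data is echoed back, since the client fully controls the contents of $_GET.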

The $_COOKIE variable contains the values of HTTP cookies (small portions of data stored on the client side). $_SERVER contains server-set data such as the file system path of the script being executed or the IP address of the remote client. The $_SESSION variable is suitable for storing data which persists throughout multiple requests by the same client. With this variable in use, PHP takes care of generating a so-called session id (for identifying the user) and setting a cookie containing this id. If clients do not support cookies, the session id is appended to each link (URL rewriting) and a hidden field containing the id is added to each form on the delivered page, too.

1This cannot be applied to completely different data types. For example, using an array as a string results in the text “Array”.

In PHP, arrays play an important role. Every variable mentioned until now is by definition an array, i.e. a data structure that maintains key/value pairs, internally implemented as a hash table. PHP provides many functions for arrays (e.g. foreach cycles through each element), since everyday use mostly involves structured data. A database query will commonly return an array representing the data in a natural and intuitive way. Multi-dimensional arrays can be created at will (just by specifying an array as a value), making it easy to juggle data.
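To illustrate (with made-up data), a result set as it might come back from a database query is simply an array of arrays, which foreach traverses naturally:

```php
<?php
// Made-up result set: each row is itself a key/value array,
// just like the rows a database query would return.
$rows = array(
    array('band' => 'Band A', 'news_items' => 3),
    array('band' => 'Band B', 'news_items' => 5),
);

$total = 0;
foreach ($rows as $row) {
    $total += $row['news_items'];   // accumulate over all rows
}
// $total now holds the overall number of news items
```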

What makes the language quite special is the great number of functions provided. Compared to other languages, there are few internal functions; these are mainly used for variable manipulation (for example, for cropping strings). The majority of functions are provided by third-party libraries which are integrated with PHP and provide enormous functionality that can be accessed easily because of this integration.

The scope of functions ranges from database wrappers for a great many database types (the most important being MySQL, PostgreSQL, Oracle, and Berkeley DB), through a library for image manipulation (GD), compression (such as gzip or bzip2) and encryption libraries (mcrypt), to libraries for accessing remote services such as up-/downloading, SOAP calls, or XML-RPC. So a large part of PHP's task is being a framework for third-party libraries.

4.2.3 Integration with the web server

As an external product, PHP needs to be integrated with a web server, usually Apache. Two scenarios are possible:

• Using a module (so-called mod_php), the PHP compiler is included with the web server: if a file is requested that requires parsing, the module is loaded and the output of the executed script is returned.

• With CGI (see section 4.2.5) the PHP compiler is executed as an external program.

The CGI variant is available for all web servers that support a cgi-bin directory, such as IIS on the Windows platform. Generally speaking, this approach should be dismissed whenever a web server module is available for integrating PHP, as the module provides an enormous gain in speed.
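For completeness, integrating PHP as an Apache module typically requires only two configuration lines; the module file name and path below are assumptions and vary with the PHP version and installation:

```
# Load the PHP module (path and name depend on the installation)
# and hand all .php files to it.
LoadModule php4_module libexec/libphp4.so
AddType application/x-httpd-php .php
```

After a restart, Apache parses every requested .php file through the embedded interpreter instead of launching an external CGI process per request.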

4.2.4 Additional Libraries

The popularity of PHP has caused the rise of a large number of third-party tools which provide even more functionality.

On the one hand, there is a “semi-official” repository of tools called PEAR (PHP Extension and Application Repository). It covers various topics such as protocol implementations, abstraction layers for databases, and reference implementations of algorithms (e.g. cryptographic algorithms). The number of maintainers is limited, though; as most of them are experienced programmers, this ensures a high quality of the included components. Additionally, a QA (Quality Assurance) team checks for a high standard. Documentation is provided for each project, often with extensive tutorials, too.

On the other hand, many source code repositories exist (e.g. Hotscripts.com) which are usually user-contributed. This has both good and bad sides: these repositories contain thousands of pieces of source code, so there are not many “common problems” which have not been solved yet. But as everybody (whether an experienced programmer or not) can contribute anything, the quality of the code is naturally highly diverse. What is more critical: the documentation is commonly bad.

4.2.5 Alternatives

There are many projects on the market that have either developed their own programming language or modified an existing language for use with the WWW. The alternatives can be classified into two categories: scripted and compiled languages.

• ASP.NET is a solution by Microsoft which is based on the .NET framework and is therefore solely supported on Microsoft Windows systems2. In contrast to other projects it is not restricted to a single language; commonly only C# and VB.NET are used, though. For

2The Mono project supports ASP.NET on various platforms by reimplementing the proprietary libraries. Their scope will never be the same as on the Windows platform, but usually little adjustment is needed to get ASP.NET programs running with the Mono compiler.

supported languages an API is provided which includes common functions for web use. In ASP.NET only code compiled to an intermediate language (MSIL) can be executed.

• Java can also be used for web projects, either in the form of JSP (Java Server Pages), which is similar to PHP in its design, or as Java Servlets, which are “normal” Java programs that implement a certain interface. Java also has to be compiled (to an intermediate language, though) and requires a special web server, e.g. Apache Tomcat.

• Perl has been used for the generation of dynamic pages for a long time. Commonly it is integrated with the Apache web server using the module mod_perl, which embeds the script interpreter. Perl was not primarily designed for web use, but there are modules for Perl (e.g. CGI.pm) that provide useful functions for processing web data. Additionally, projects such as Mason provide a whole framework that implements templating and has many other useful features for web development.

• (Fast) CGI (Common Gateway Interface) is not a language by itself, but an interface that enables any program that uses stdin, stdout, and stderr for I/O to provide its services via a web server. Usually a directory cgi-bin contains the program or script. When a client requests a file in that directory, it is not delivered directly; instead the operating system executes the file as a separate process, and the web server delivers the output of the program. Usually only the output from stdout is returned; error messages, which are usually written to stderr, are saved to the web server’s error log. Data provided by the client is sent to the program, which allows e.g. form processing. This is done either via command line arguments or – for POST requests – via stdin. Using the CGI interface, fast languages such as C can be used for solving CPU intensive problems. On the other hand, there are no supportive functions, so issues trivial to solve e.g. in PHP require a lot of code when using CGI. FastCGI keeps the program in memory, which eliminates the program-loading time, resulting in a gain of speed.
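The CGI contract described above – headers on stdout, a blank line, then the body – can be illustrated with a minimal shell script, which could be placed in a cgi-bin directory and marked executable (the response text is, of course, arbitrary):

```shell
#!/bin/sh
# Minimal CGI-style program: the web server captures stdout and
# forwards it to the client; stderr would go to the error log.
cgi_response() {
    echo "Content-Type: text/plain"
    echo ""                       # blank line separates headers from body
    echo "Hello from CGI"
}
cgi_response
```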

4.3 MySQL

As a DBMS (Database Management System – actually RDBMS, with R meaning Relational) MySQL was chosen. Throughout the thesis it will also be referred to as “database”, which is a quite common misnomer.

MySQL was originally designed to achieve high performance and is believed to be one of the fastest DBMSs currently on the market. MySQL is licensed under the GPL, which makes it open source and therefore freely available.

An API for PHP is provided which is commonly integrated with PHP and adds to its function pool. This integration is in fact the way MySQL is most commonly used today; the rise of PHP also helped MySQL to emerge.

4.3.1 PEAR::DB

PEAR::DB is not a part of the MySQL distribution and is not solely dependent on MySQL either. It is rather an abstraction layer from the PEAR repository (see section 4.2.4) that provides a DBMS independent layer for retrieving data.

Apart from the SQL itself, which has to be understood by the DBMS used (many systems, including MySQL, use their own flavour of SQL), switching the DBMS can easily be done by just switching the DSN string (Data Source Name). Also, the methods for retrieving data (as associative hash, as “normal” array, etc.) do not differ.
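A minimal sketch of this abstraction, assuming PEAR::DB is installed; the credentials and the table are hypothetical:

```php
<?php
require_once "DB.php";            // PEAR::DB abstraction layer

// Only the DSN prefix names the driver; swapping "mysql" for "pgsql"
// would switch the DBMS without touching the retrieval code below.
$dsn = "mysql://user:secret@localhost/bandnews";

$db = DB::connect($dsn);
if (DB::isError($db)) {
    die($db->getMessage());
}

// Retrieval methods are DBMS independent, e.g. one row as associative hash:
$row = $db->getRow("SELECT name FROM bands WHERE id = 1",
                   array(), DB_FETCHMODE_ASSOC);
?>
```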

4.3.2 Query Cache

Recent versions of MySQL (since 4.0.1) provide a so-called query cache – also referred to as query folding [Qia96]. SELECT statements are stored together with their results, which allows very fast responses when the same query (the exact same string has to be used for querying) is executed a second time.

It is quite common when using a database in connection with a web server that tables do not change very frequently and the same queries are executed over and over. So a large increase in speed can be expected when activating the query cache.
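A sketch of how the query cache is enabled and inspected (directive and status variable names as documented for MySQL 4.x; the size is an example value):

```sql
-- my.cnf: reserve memory for the query cache and switch it on
-- query_cache_size = 32M
-- query_cache_type = 1

-- At runtime, configuration and effectiveness can be inspected:
SHOW VARIABLES LIKE 'query_cache%';
SHOW STATUS LIKE 'Qcache%';   -- e.g. Qcache_hits vs. Qcache_inserts
```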

4.3.3 Alternatives

There are some alternatives to MySQL that provide additional features which may make them more favourable for certain uses.

• Oracle is a well known, fast, and commercial RDBMS. Due to its performance and scalability it is widely used. With the integrated programming language PL/SQL many problems can be solved at the database level, resulting in a faster alternative to solving them in the top-level programming language.

• PostgreSQL is often referred to as a free Oracle. Indeed it provides a similar set of SQL commands but lacks PL/SQL. This DBMS should be used when DBMS speed is not a crucial point.

• In the latest versions of PHP, SQLite is featured as an alternative to MySQL. SQLite is a C library that implements an embeddable SQL database engine. When integrated with PHP, database files are written directly – it does not act as a wrapper for an external engine. For smaller projects SQLite is indeed sufficient, but it should be avoided when dealing with large amounts of data.

MySQL was chosen for its common use in open source projects and its speed. Even though license problems arose in 2004 regarding its use with PHP, it can now be recommended, as a special license for this use case has been published.

4.4 Smarty

Another tool used in this diploma thesis is the Smarty Template Engine. It is a tool – written in PHP, created by Monte Ohrt and Andrei Zmievski in 2001 – to separate program logic, i.e. the PHP code, from design, stored in so-called template files.

Figure 4.1: The MVC design pattern (Model: PHP script, View: Smarty, Controller: PHP)

Figure 4.1 shows the MVC (Model-View-Controller [KP88]) design pattern, which is often applied to web applications. Using Smarty this pattern can be implemented with a separation into these components:

• The Model specifies the part of the application that handles the busi- ness logic, i.e. the actual problem is solved here. In this scenario this part is taken by the PHP scripts written by the programmer.

• Smarty is used for the View component which is responsible for han- dling the output and its formatting.

• The processing of the input is done by the Controller, in this case PHP. It handles, for example, the transition from a parameter in the HTTP URI to a variable.

In a scenario without Smarty, View and Model are mixed. This would not only disregard the design pattern but also reduce the reusability of the source code [Par04]. The use of Smarty contrasts with the design goal PHP originally implements. In fact Smarty only acts as a layer within a PHP script – this is quite obvious as it is coded in PHP itself.

Figure 4.2: Three-tier architecture

This also represents the common three-tier architecture (see Figure 4.2). It is quite desirable (also in other areas of information engineering) to separate the data (first tier), the business logic (second tier), and the presentation (third tier). The MVC model is a corresponding design pattern. More benefits of the three-tier architecture are discussed in [Swe01].

In a company the roles of programmer and layout designer are separate. This is supported and even encouraged by Smarty, because designer and programmer can work concurrently on the same page, with the designer changing the appropriate .tpl file while the programmer makes changes to the PHP code. Therefore, the use of Smarty is highly recommended.

4.4.1 Template Basics

Template files are quite similar to “normal” PHP files in that they embed their logic into HTML. A Hello World example using Smarty in combination with PHP would look like this:

Listing 4.2: Hello World in Smarty – hello.tpl

1 <html><body>
2 <h1>Hello World Example with Smarty</h1>
3 {$hello}

Listing 4.3: Hello World in Smarty – hello.php

1 <?php
2 include("Smarty.class.php");
3 $smarty = new Smarty();
4 $hello = "Hello World!";
5 $smarty->assign("hello", $hello);
6 $smarty->display("hello.tpl");
7 ?>

In this example, the variable $hello is displayed within the template file just by putting it into curly brackets. This is the default setting for integrating logic and variables in .tpl files3. Template variables are not shared with those from PHP; they have to be explicitly assigned to Smarty (line 5 of hello.php) to be accessible in hello.tpl4. After that, the Smarty command for displaying the template file is called.

Listing 4.4: Highlighting alternating lines – alternate.tpl

1 <h1>Alternate Backgrounds</h1>
2 <table>
3 {section name=d loop=$data}
4 <tr
5 {if $smarty.section.d.first}
6 bgcolor="#CC0000"
7 {elseif $smarty.section.d.index is even}
8 bgcolor="#CCCCCC"
9 {else}
10 bgcolor="#DDDDDD"
11 {/if}
12 ><td>{$data[d]}</td></tr>
13 {/section}
14 </table>

For outputting arrays assigned from PHP, the Smarty helper functions foreach and section are available. In “sections”, arrays are traversed with keys from 0 to n. foreach acts the same way as in PHP, providing access to key and value for each entry of the array. While looping, the $smarty.section variable (resp. $smarty.foreach) is filled with values to be used for design functionality. As an example, listings 4.4 and 4.5 show how a table with alternating background colors is generated (see Figure 4.3 for a screenshot of a web browser displaying the page).

Listing 4.5: Highlighting alternating lines – alternate.php

1 <?php
2 include("Smarty.class.php");
3 $smarty = new Smarty();
4 $data = array();
5 for ($i = 0; $i < 10; $i++) {
6 $data[] = "value" . $i;
7 }
8 $smarty->assign("data", $data);
9 $smarty->display("alternate.tpl");
10 ?>

3The delimiters can also be changed, e.g. to establish a certain degree of XML (parsing) conformity. 4The name of the assignment and the variable name do not have to be identical; they merely coincide in this example.

Figure 4.3: Screenshot of the output of the alternating backgrounds example

This short example shows several more aspects of Smarty and PHP. The program logic of if/elseif/else is available in Smarty for doing simple tasks (intended for design-oriented conditionals, just like in the example above). The $smarty.section array provides common state, for instance the current index or whether it is the first or last iteration of the loop. Array values are accessed in a PHP-like form (index within square brackets) when using sections.

Finally, it is important to state that integrating logic that does not solely affect the (visual) design should be avoided.

4.4.2 Alternatives

The idea of templating in PHP is quite common, and various such projects exist.

• HTML Template IT is part of the PEAR repository, developed by Ulf Wendel. Contrary to Smarty, no programming logic (such as if) is provided. Repeated sections (for array output) are declared by specifying a beginning and ending position. The overall performance is said to be good; no caching is supported, though.

• patTemplate – created by Stephan Schmidt – relies heavily upon XML notation: templates are defined using a certain tag; for alternating table rows as in the example above there is an even/odd classification. Variables are inserted as XML tags.

Smarty was chosen for its features and its steady improvement.

4.5 Squid

A proxy server is a program that acts on behalf of a client, requesting data and returning it to the client. In computer security terms this would be referred to as a kind of man-in-the-middle. Proxies exist for various applications. In this diploma thesis Squid is used as “a full-featured Web proxy cache”, i.e. it is capable of proxying requests of the protocols HTTP and FTP.

4.5.1 Use cases

Squid is a proxy server for use on Unix/Linux systems (Windows NT is only supported via Cygwin). Usually it is installed on a server that acts as a gateway for a (local area) network. Several configurations are possible:

• Proxying HTTP or FTP requests for clients with non-routable addresses (such as 192.168.x.x) as an alternative to NAT (Network Address Translation).

• Reducing traffic for frequently visited sites by acting as a caching proxy.

• Controlling the accessibility of web pages: white or black lists, restriction on a time basis, access via user id and password.

4.5.2 HTTP Acceleration

None of these configurations really seems to match the topic of this diploma thesis and the idea of caching web applications itself. However, there is another configuration called “HTTP acceleration” which forwards requests to a web server residing on the same machine. This is often also known as reverse proxying.

The idea of using a proxy on the same server as a web server has to do with design goals of the two programs.

A web server has to provide several features for processing the files to be served (in the case of PHP, for example, the interpreter is commonly integrated with the Apache web server via a module). For each request a copy of the executable must be held in memory. Therefore, the larger the executable is, the higher the memory consumption will be for a given number of requests.

Proxy servers are designed to be very light-weight programs that primarily serve the goal of collecting requests and – this is a crucial point – then do their proxying: contact the web server.

Consider several clients accessing the web server at the same time: for each request an executable has to be loaded and held in memory until the request is completed. With HTTP acceleration, requests are collected by the small proxy program (which consumes particularly little memory), which can therefore accept many more concurrent requests than the web server itself. Only after a request has been transmitted completely is the web server contacted to collect the pages.
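A minimal squid.conf sketch for this setup, using the Squid 2.x accelerator directives (the web server is assumed to have been moved to port 8080 on the same machine):

```
# Squid listens on the public HTTP port ...
http_port 80
# ... and forwards accepted requests to the local web server
httpd_accel_host localhost
httpd_accel_port 8080
httpd_accel_with_proxy off   # pure accelerator, no ordinary proxying
```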

4.5.3 Alternatives

• pound is a proxy specially designed for reverse proxying as well as load balancing. It has been developed by Robert Segall since 2003.

• The mod_proxy Apache module can also be used for proxying requests. This is only useful, though, when the proxy resides on its own machine.

Squid was chosen because it is commonly used in production environments and ships with most Linux distributions.

4.6 Advanced PHP Cache

APC is a tool that speeds up the execution of PHP scripts by caching the compiled script in the intermediate language which is eventually executed. It was written in 2000 by George Schlossnagle, Daniel Cowgill and Rasmus Lerdorf.

Figure 4.4: PHP script execution (compile main script, execute main script; each included script is in turn compiled and executed; complete)

The idea of reducing execution time is based on the mechanism by which PHP executes a script (see Figure 4.4). This is basically done in two steps:

1. The source file is read, parsed and converted to intermediate language (“compiled”).

2. PHP, i.e. the Zend Engine virtual machine, executes the intermediate code.

These two steps have to be performed every time a script is requested – the compiled result is discarded after execution. The same goes for each file that is included during execution.

4.6.1 Concept

While this procedure is by design and fits the requirement of a scripting language that changes to a file are possible without further ado, the number of executions commonly exceeds the number of changes by far. What is more: for many scripts – especially those with many “includes” – it often takes PHP longer to convert the script into the intermediate language than to execute it.

In fact, step 1 produces the same result for most requests (except when a modification was made). APC implements the idea of caching the compilation results until a modification is made to (one of) the PHP source file(s).

APC works as a loadable module for PHP which is simply integrated by specifying it in php.ini: extension = /usr/lib/php4/apc.so

Figure 4.5: Script execution with compiler cache (for each script: if it is cached and unmodified, the compiled version is loaded from the cache; otherwise it is compiled and inserted into the cache before execution)

It starts working immediately5 when PHP, i.e. the web server, is restarted. The defaults reserve a total storage space of 30 megabytes for caching compiled scripts. Figure 4.5 shows how the cache is used; grey boxes show where the cache repository is accessed.

5This is done by subclassing the file loading routines of PHP.
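A php.ini sketch corresponding to this setup (directive names as shipped with APC; the 30 MB value mirrors the stated default, and the extension path is distribution-specific):

```ini
extension = /usr/lib/php4/apc.so
apc.enabled  = 1
apc.shm_size = 30    ; shared memory segment for compiled scripts, in MB
```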

4.6.2 Alternatives

There are quite a few compiler caches around that are also worth a try.

• Zend Accelerator is a commercial, closed-source compiler cache. It was developed by the people who mainly designed the Zend Engine, which makes it quite fast. The drawback is that it is not free.

• Turck MMCache is an open source compiler cache developed by the company Turck Software St. Petersburg. It is one of the fastest compiler caches, but development stopped in November 2003.

• ionCube Accelerator was developed by Nick Lindridge and is distributed by his company, ionCube, for free but with closed source.

The author's choice was APC, for its ongoing development and the PHP open source license.

4.7 Advanced PHP Debugger

The tool called APD is primarily a debugger that can be integrated with PHP. It mainly provides functions and tools for debugging and profiling. APD acts as a PHP module and is activated and controlled using PHP functions which the module provides.

4.7.1 Debugging

The debugging functions of this tool cover the “standard” range of commonly used debuggers. This includes the setting of break points, debugging output, printing of stacks and currently used variables, and the overriding or renaming of functions.

For this diploma thesis, debugging will not be used extensively, as a working and approved application is being tested, supposing that no bugs affect the caching procedure (and if they do, only to a limited extent).

4.7.2 Profiling

Profiling is an important tool for reaching the goal of this diploma thesis. It can be used to spot inefficiencies in source code by measuring the amount of time the processor spends in each function. While the script is being executed, a trace file is generated containing condensed information about the course of the current execution.

Afterwards this trace file can be processed with the included tool pprof to gather the information recorded. The output of the tool can be customized to the needs of the analysis through several options. For example, a call tree can be printed showing the functions called, including their dependencies. The tool is also capable of listing totals for functions, such as time and memory consumed or call counts.

4.7.3 Alternatives

• Xdebug, created by Derick Rethans, is currently on the verge of version 2, which will enhance its functionality by a great deal. The older version 1 also has its bonus points; for example, the profiling output can be directly appended to the generated page.

• DBG is a program developed by Dmitri Dmitrienko. Quite a few IDEs use it for their debugging capabilities. Under Linux a GDB-like (The GNU debugger) interface is provided with a reduced command set, so the visualisation tool DDD can be used as a GUI. Support for this application is still quite limited, though this might change in future versions. Contrary to APD, no modifications to the source code need to be made.

Choosing the right tool for this thesis was hard, as all three of the introduced tools have good and distinct features. If it were for debugging only, a combination of all tools for different cases would have been the best choice. As mainly profiling is done, APD provides the best functions; especially its tool for processing trace files sets it apart from the other tools.

4.8 ApacheBench ab

To retrieve measurable results, a load generation tool is used to measure the number of requests a server is able to process in a given time. ab is a tool that is capable of doing so. It ships together with the Apache web server. Adam Twiss of Zeus Technology Ltd started its development in 1996, which was continued in 1998 by the Apache Software Foundation.

ApacheBench is a tool that just does its task of load generation, not much more. The most important settings used are the number of requests (specified by the command line option -n) and the number of concurrent connections (-c).
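A typical invocation, e.g. 1000 requests with 10 concurrent connections against a page under test (the URL is a placeholder):

```
ab -n 1000 -c 10 http://localhost/index.php
```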

4.8.1 Alternatives

• httperf was developed by David Mosberger and Tai Jin at Hewlett-Packard Research Labs in 1998. It offers a broader range of functions than ab. What sets it apart from ab is the ability to request multiple pages in the form of a “user session”.

• Siege, created by Jeffrey Fulmer in 2000, is another load generator; it provides several modes for how multiple sites are used for stressing the web server (incrementally or randomly).

Part II

Tuning the Application


Chapter 5

Evaluation

5.1 Goal definition

Before the analysis of the web application can be started, the goal of the tuning has to be defined. This is done in order to find the points where caching mechanisms should be installed.

Primarily, the web application should be optimized regarding the load on the web server: the generation of a single web page should not be very CPU intensive. There are mainly three spots where the CPU is involved heavily:

• The web server (program) takes processor time for reading directories and files, forking other processes, and running configured extensions or modules (such as mod_rewrite for manipulating the URI before the request is processed further).

• The RDBMS needs the CPU for reading files, building and processing quite complex data structures (e.g. B+-trees), searching through indices, and manipulating data.

• The script interpreter uses time for lexing and interpreting files, executing the specified script, and handling the data structures the language provides.


Reducing the CPU load mainly works through the following two schemes:

• Improving algorithms: Often there exist other algorithms that fulfil the same requirements but differ in speed and memory consumption. A better algorithm can decrease the load because it solves the problem in a more elegant way.

• Caching: Considering the daily usage of a web server, requested pages or scripts return the same or very similar data on each request. It appears wasteful to re-execute the same code over and over (consuming CPU time) only to receive the same output anyway. So a good method for reducing load is to not even start to produce the load, but to deliver a cached copy instead.

For web servers the caching strategy seems to be the most promising approach, not only for reducing the need for CPU time, but also for speeding up the service. Still, this is not that simple and evident, because there are many points where caching can be applied.

What is more important: often it is not trivial to implement a caching module, because of the complexity of the problem or application.

5.2 Processing a Request

The sources of load as well as the recurring processes can best be understood by taking a closer look at how a script is requested by a client and returned by the server. This is shown in Figure 5.1.

Figure 5.1: Processing a Request (client ↔ web server ↔ PHP module/script ↔ RDBMS; the numbers in the figure correspond to the steps below)

1. The client establishes a connection to the web server via TCP, typically to port 801, where the web server is listening for connections.

2. The request for a document is sent to the server using the HTTP protocol. This would look typically like this

GET /index.php HTTP/1.1
Host: www.bandnews.org
Accept-Language: en
Content-encoding: utf-8

(The first two lines are a minimum request for using NameVirtualHosts – multiple web sites residing on one IP address – with HTTP/1.1)

3. As soon as the request is fully transmitted (this is only a short GET request, but POST requests, e.g. for transmitting files, can become very long), it is processed by the web server: the correct virtual host is selected, preprocessing modules (e.g. URL rewriting) are executed, the existence of the requested file is checked, and eventually the action for the request is chosen and applied.

1This is a so-called well-known port, defined by IANA.

4. If the file is of a MIME type for which a responsible module exists, that module is loaded and processes the file. Otherwise the web server simply delivers the file (skip to step 8). Here is an example of a MIME type definition in /etc/apache2/mods-available/php4.conf (the default location on Debian-based systems):

AddType application/x-httpd-php .php .phtml .php3 AddType application/x-httpd-php-source .phps

5. The web server module (a script interpreter in this case) reads, parses and executes the file.

6. A database connection can be invoked – either by establishing a new connection or by reusing a persistent one. Other third party tools (libraries, etc.) required by the script are also invoked at this point.

7. Once the script has been executed, its output is delivered as if it was the content of a file.

8. The file (or output of the script) is returned to the client, reusing the TCP connection established by the client.

9. HTTP 1.1 [FGM+99] allows the client to reuse this TCP connection for further requests (this is called “keep alive” [Mog95]; go back to step 2) if the server is configured to allow this behaviour.

5.3 Possible Hooking Points

The description of the request processing reveals the following points where caching might have a good effect (to be evaluated). These considerations are kept fairly general at first and will later, in Section 5.4, be focused on the example application, Bandnews.org.

5.3.1 Client Request

A client most commonly requests the home page (or index page) first; usually this is the page named / or /index.php. This sounds like a good opportunity to cache those pages and deliver a previously stored copy without the script even being touched. Unfortunately this is not easily possible with a script-generated page: if it randomly displays parts of the page or other highly dynamic content, the page most commonly cannot be cached as a whole (but in parts, see Section 9).

The requests of clients are very similar, but the most important thing they have in common is that they arrive over a slow connection to the server. This does not necessarily mean they add to the server load, but they make the web server program reside in memory longer than necessary: the program has to wait until the TCP connection is closed, which takes longer on a slow connection (the transmission can usually only be as fast as the slowest part of the connection).

A solution would be a cache that handles the transmission of the data. Strictly, this should rather be described as a one-time cache or buffer, as the data is dismissed after transmission; it still fits in the field of caching.

A proxy program consuming very little memory acts in favour of the web server: it listens on port 80, receives the connection, and waits for the client to submit its request (however long this takes). Then it transmits the request to the web server at full speed, using the loop-back interface when residing on the same machine, or a fast LAN connection. The result of the request is transferred back to the proxy, again at high speed. Now the proxy handles the transmission of the data to the client while the web server executable can be removed from memory or handle the next request.

5.3.2 PHP Module

Each time a script is requested by the client, PHP needs to go through this sequence: it reads the script, parses it, converts it to an executable intermediate code, and executes it. If the script requires additional (“included”) files, this procedure has to be repeated for each file. Commonly there are many includes; especially when connecting to databases, the login data resides in an included file, a wrapper library is loaded, and so on.

In the procedure necessary for executing a PHP script, the first steps are heavily recurring. The ratio between how often a re-“compile” is really needed – namely when the script has been modified – and how often it is actually performed – every time the script is executed – is very unfavourable. For smaller scripts the time for compilation might even exceed the time for execution: a script only outputting a few lines runs considerably faster if the compilation steps are skipped.

The repeated process of compiling can quite easily be left out when its result – the runnable intermediate code – is stored in a cache. To maintain the characteristics of a scripting language, the script file then only has to be checked for modifications when it is run – which is by far less expensive (in terms of time) than a re-compilation. If the script is modified, though, a little delay is added, as the script is both checked for modification and then compiled.
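The modification check can be illustrated at user level with a simplified analogy in PHP: a derived result is cached in a file and rebuilt only when the source is newer than the cache (expensive_compile() is a hypothetical placeholder for the costly step):

```php
<?php
$source = "script.php";
$cache  = "/tmp/script.cache";

if (file_exists($cache) && filemtime($cache) >= filemtime($source)) {
    // Fresh: the cheap path, essentially two stat() calls
    $result = unserialize(join("", file($cache)));
} else {
    // Stale or missing: "recompile" and store -- the expensive path
    $result = expensive_compile($source);   // hypothetical helper
    $fp = fopen($cache, "w");
    fwrite($fp, serialize($result));
    fclose($fp);
}
?>
```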

5.3.3 Database

When using databases in combination with web servers, there are also opportunities for caching. Similar queries are executed over and over, as most of the content of web sites is not personalized, which means most of the queries usually do not differ. As the database changes comparably seldom, caching the database output evidently yields a good gain in speed.

The caching of results would commonly be assigned to the application, because the database knows little about the way data is retrieved by the application. Moving the cache from application to database, however, allows more efficient cache invalidation, because table modifications can be caught by internal triggers. The cache does not need to be very “intelligent”, as queries are re-executed by a computer program: repeated queries therefore match byte-wise and are consequently easy to detect.

Another hooking point is the establishment of a connection to the database. This can be quite expensive and has to be done every time a script needs to connect to the database. The use of persistent connections can help in this scenario: the connection between application and database is not cut when the script ends but instead lives on more or less forever. The drawbacks of this connection type must be kept in mind: bugs in the application can block a persistent connection forever, causing the pool of available connections – which is limited by definition – to shrink.
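With the classic MySQL extension, a persistent connection is requested by a single function change (the credentials are placeholders):

```php
<?php
// mysql_pconnect() reuses an existing idle connection with the same
// parameters instead of opening a new one on every request.
$link = mysql_pconnect("localhost", "user", "secret");
mysql_select_db("bandnews", $link);

// ... queries ...

// Note: mysql_close() is ignored for persistent links; the connection
// returns to the pool when the script terminates.
?>
```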

5.3.4 Application

For an application, general improvements can hardly be suggested. Whether and which methods of improvement can be used depends heavily on the application and its requirements. Here, the knowledge and experience of the application programmer comes into play. Still, there are some “standard” approaches that lead to a gain in speed.

An important point with caching is the recognition of recurring patterns. The more often a pattern appears, the better are the chances for efficient caching. Good candidates for caching are data combinations arranged by the application, for example a result set built from several combined database queries. Also, intermediate results of algorithms are often worth caching: for instance, data structures used by an algorithm have to be built up first – frequently an expensive operation.

5.4 Bandnews.org

Evaluating opportunities for application level improvement requires a spe- cific look at the application itself. As stated in the introduction (see Section 1.2), a skeleton of a page as well as a few key pages will be examined.

When analyzing the tuning potential we take a two-stage approach: first the page is investigated regarding its structure and elements, and caching opportunities are identified from this overview. The second stage involves a profiler which identifies further bottlenecks that suggest a search for better solutions in the critical regions.

5.4.1 Skeleton page

A skeleton page is a page common to every other page of the application. For web sites a separation between two different types of skeletons can be made: An “ultimate” skeleton and a “normal” skeleton that depends on the ultimate one. This can also be shown using a layer model, see Section 5.2.

Figure 5.2: Script layers of an application script (layers, top to bottom: Script – Presentation Skeleton – Application Skeleton)

• The Application Skeleton is common to every single page of the application. It is usually used to include database connection routines, authentication mechanisms, the loading of modules such as Smarty, and other common functions. For the application Bandnews.org this functionality is integrated in the file inc/setup.inc.php. A simple skeleton page is therefore a PHP script that only includes this file. When benchmarking, this file will be called skeleton.php (sometimes varied with an appended -t or -n).

• The Presentation Skeleton builds on the Application Skeleton, so it provides all of its functionality. This is established simply by including inc/setup.inc.php. Calling the presentation skeleton in a web browser results in a page that includes all navigational and recurring features of a page. The added functionality is thus restricted to presentational code, commonly including a meta navigation, a primary navigation such as a search box and/or a menu, as well as a header and footer.

When looking at Listing 5.1, a separation into four documents is clearly visible.

Listing 5.1: A Presentation Skeleton – pres-skel.php

1  <?php
2  require('inc/setup.inc.php');
3
4  include('inc/header.inc.php');
5  include('inc/menu.inc.php');
6  include('inc/sidebar.inc.php');
7  include('inc/footer.inc.php');
8  ?>

If only the first file were included, the script would represent the application skeleton.

The additional files load the corresponding page parts, including database calls when needed.

5.4.2 Index page index.php

The index page belongs to a group of three page types (described below) and is a good representative of a common page of the site. In addition to the presentation skeleton, news items are displayed, which are subject to constant modification. The selection of the news items (i.e., the SELECT statement and its WHERE clause of the SQL database query) depends on the page type:

• The Index Page shows the most recent news from all bands and genres, selected by date in a descending order.

• The Genre Page displays the news for a certain genre and its sub-genres, where a news item belongs to the genres of its band. As a band can be classified into several genres, a weak entity is used for establishing this relation. Displaying this page requires a join of tables.

• A Band Page is a filtered view of the Index Page with restriction to a specific band.

The index page usually receives the most hits on a server and is therefore worth close consideration.

5.4.3 Search page search.php

A search page can be separated into two parts:

• The band search: the search query is used for finding bands which match the expression, taking into consideration not only bands that provide news, but also those that could not be integrated within Bandnews.org.

• The news search: based on the query, matching news items are displayed – a selection of news items just as described in Section 5.4.2.

Both parts can be included in the testing of the index page; this is due to the MySQL query cache, which will be included in the testing.

5.4.4 Links page links.php

The links page is somewhat unlike the other page types, but – or rather because of that – it is worth taking a closer look at it: for each letter of the alphabet, matching bands are displayed on one page, including all relevant information:

• Band name

• Country

• Genres

• Homepage address

The interesting thing about this page is that, for database design reasons, the information is split across several tables: band name and homepage address are stored in the table bn_band, the country – for translation reasons – in bn_country, and the genres in bn_band_genre and bn_genre, as a band can have more than one genre assigned (an m:n relation, modeled using bn_band_genre as a weak entity).

Many (expensive) queries are needed to build this page. It is therefore important to know how the various caching strategies can improve the speed of this page.

5.5 Testing

When testing the capacity of a web server, several things have to be considered [BD99]. Aspects like the latency of a WAN connection are not taken into consideration in this thesis; all tests are done via the loopback interface.

In our scenario we are not testing a web server delivering text files, but the output of a script instead. This significantly decreases the speed: in a basic test on the author’s system, the Apache web server is able to serve about 3,000 pages of a 2 kilobyte document per second, while a script generating 2 KB of random data only achieves a throughput of about 190 requests per second (lacking any tuning).

For each test a certain sequence of steps is maintained to produce comparable results. This is done by using a script developed for this purpose. More information and source code can be found in the appendix.

1. The configuration files are modified to reflect the changes to be tested.

2. The web server and database server are restarted to provide a fresh environment.

3. The script to be tested is requested – without measuring – one or more times (depending on the test case). This is used to load the caches. This step can be skipped if the generation of the cache is to be included in the test.

4. The process is paused for a certain amount of time, for instance 10 seconds, to ensure that no trailing requests block anything.

5. The test run is started. The document is requested between some 10 and some 1,000 times; parallel requests (concurrent requests, CCR) are also possible.

6. The log file of the test is parsed and converted to a format used for generating a chart.
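As an illustration of the last step, the requests-per-second figure can be pulled out of a log with a one-liner; the sample lines below mimic the output format of ApacheBench, the load generator used here:

```shell
# Parse a (sample) ApacheBench log and extract the requests/second value.
log=$(mktemp)
cat > "$log" <<'EOF'
Concurrency Level:      10
Requests per second:    78.56 [#/sec] (mean)
Time per request:       127.29 [ms] (mean)
EOF
# Field 4 of the matching line holds the number itself.
rps=$(awk '/Requests per second/ { print $4 }' "$log")
echo "rps=$rps"
```

A value extracted this way can then be fed directly into the chart-generation step.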

5.5.1 Preparations

The author’s script for benchmarking can automate many tests, as it automatically generates the possible test cases with each tool switched on or off. A few steps have to be taken before a tool can be used with the benchmark.

Usually, changes to one or more text files have to be made to configure a tool, and the appropriate server has to be restarted. Unfortunately it is not possible to exchange the whole configuration file, as two tools might need changes in the same file. The configuration files are therefore patched using a file representing the changes to be made.

When the benchmark script is started, it expects all configuration options to be disabled. This can be ensured by prepending a restoring script. The tools need to be turned off anyway for the process of generating a file that can be used to integrate the tool with the benchmark. Listing 5.2 shows how such a patch file is generated.

Listing 5.2: Creating a patch file for the MySQL query cache

1  cd /tmp
2  cp /etc/mysql/my.cnf .
3  vi my.cnf  # do the necessary editing
4  diff -c /etc/mysql/my.cnf my.cnf > mqc

The result is a file containing only the modified lines (plus some contextual lines, so that the file can be patched even if the line numbers do not match). What such a file looks like exactly can be found in the appendix, Listing A.4.
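How such a context diff is later applied can be tried out in a self-contained sketch; the option line `query_cache_size` is only a stand-in example, the real patch files are listed in the appendix:

```shell
# Create a pristine config, an edited copy, a context diff between the
# two, and apply that diff to a fresh pristine copy with patch(1).
cd "$(mktemp -d)"
printf 'query_cache_size = 0\n' > my.cnf
sed 's/= 0/= 16M/' my.cnf > my.cnf.edited
diff -c my.cnf my.cnf.edited > mqc || true   # diff exits 1 when files differ

printf 'query_cache_size = 0\n' > fresh.cnf  # a second pristine copy
patch fresh.cnf < mqc
grep 'query_cache_size' fresh.cnf
```

Thanks to the context lines in the diff, the patch still applies even when the surrounding file has shifted slightly.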

Figure 5.3: Typical output while benchmarking

When all necessary patch files have been generated, the testing can be started. The benchmark script is invoked with the patch files as command line arguments; a single option can be tested by specifying just one argument. All in all, 2^n (with n being the number of arguments) test cases will be run through. The scripts to be tested (k) are hardcoded and can be overridden (see Appendix A.1). As a grand total, k · 2^n benchmarks are run.
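The 2^n enumeration can be sketched as a bitmask loop; the tool names are examples standing in for the actual command line arguments:

```shell
# Enumerate every on/off combination of n = 3 tools: 2^3 = 8 test cases.
n=3
count=0
mask=0
while [ "$mask" -lt $((1 << n)) ]; do
  case="case $mask:"
  [ $((mask & 1)) -ne 0 ] && case="$case squid"
  [ $((mask & 2)) -ne 0 ] && case="$case apc"
  [ $((mask & 4)) -ne 0 ] && case="$case mqc"
  echo "$case"
  count=$((count + 1))
  mask=$((mask + 1))
done
echo "total: $count"
```

Each bit of the mask corresponds to one tool being switched on; iterating the mask from 0 to 2^n − 1 visits every combination exactly once.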

5.5.2 Testing environment

As testing environment, a Pentium 4 2.8 GHz system with 512 MB DDR RAM is used, with Ubuntu Linux Hoary 5.04 as operating system. The versions of the programs used are shown in Listing 5.3.

Listing 5.3: Program versions

1  $ uname -a
2  Linux main 2.6.10-5-686-smp #1 SMP Tue Apr 5 12:41:40 UTC 2005 i686 GNU/Linux
3
4  $ apache2 -v
5  Server version: Apache/2.0.53
6  Server built: Apr 1 2005 18:17:53
7
8  $ php -v
9  PHP 4.3.10-10ubuntu4 (cli) (built: Apr 1 2005 14:16:27)
10 Copyright (c) 1997-2004 The PHP Group
11 Zend Engine v1.3.0, Copyright (c) 1998-2004 Zend Technologies
12
13 $ mysqld -V
14 mysqld Ver 4.1.10a-Debian_2-log for pc-linux-gnu on i386 (Source distribution)
15
16 $ grep @version libs/Smarty.class.php
17 * @version 2.6.7
18
19 $ squid -v | head -1
20 Squid Cache: Version 2.5.STABLE8
21
22 $ pear info APC | grep Version
23 Version 2.0.4
24
25 $ pear info APD | grep Version
26 Version 0.9.2
27
28 $ ab -V
29 This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
30 Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
31 Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

The SMP kernel was installed because of the HyperThreading feature of the Pentium 4.

Chapter 6

Squid

The Squid proxy server used here has already been introduced in Section 4.5.

6.1 Considerations

The primary idea which led the author to integrate Squid and its Server Acceleration Mode1 was to speed up large requests by taking the task of transferring data between client and server away from the web server. According to [Wes04] there are more benefits of this mode:

• Whole pages can be delivered from Squid’s cache if they have been requested before.

• Squid acts as a kind of dedicated firewall: no direct access to the web server is possible; if it is attacked or compromised, no stored data is lost.

• Load balancing can be quite easily established by using Squid as a reverse proxy.

The latter two points are not considered further in this thesis. The first point, though, fits the topic and deserves a closer look.

1RFC 3040 [CMT01] calls the proxy server a surrogate in this mode. The term “reverse proxy” is also quite common.


6.1.1 Caching of whole pages

As mentioned earlier (see Section 5.3.1), it is often difficult to cache whole pages generated by a script. A web application, though, usually also contains static pages that can be cached. Typically (in scripted environments) these pages are wrapped through a script in order to provide the standard navigational environment (see presentation skeleton, Section 5.4.1).

To understand how to cache scripted pages as a whole, the mechanism of a caching proxy has to be examined in greater detail.

As stated earlier, one of the top priorities of a proxy server is to be transparent to the user. So it must be ensured that no stale (= old) copy of a document is delivered to the client. To achieve this, a caching proxy has to rely on the HTTP header fields sent by the server (so-called response headers).

The most important fields (also from the view of the web developer) are (compare to [Wes01]):

• Date

• Last-Modified

• Expires

• Cache-Control

• Content-Length

The Date field can be used to detect clock skews that might interfere with other date-based headers: the server sends its current time, which can be compared to the time on the proxy server. In our case this is somewhat useless, as web and proxy server reside on the same machine – if that clock is out of sync, the system has a schizophrenic problem.

Even if the clocks of the systems are not synchronized, the Date field can still be used to convert other fields specifying absolute timestamps to relative time spans. The issue of wrong timestamps is also discussed in [Mog99].

Of greater importance is the Last-Modified field, which states the time of the last modification (on the server) of the document. In combination with the Date field, on the one hand, the age of the document can be determined. On the other hand, the proxy can use this timestamp to decide whether its stored copy is stale and has to be refreshed.

In HTTP/1.0 the request field If-Modified-Since can be used by the proxy to ask the web server for a newer version of a document [BLFF96]. If the document has been changed recently, the server simply answers with a 200 OK response and sends the new version – it acts as if there had been no If-Modified-Since field in the request. Otherwise the 304 Not Modified message tells the proxy that its copy is still valid.
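The decision between 200 and 304 boils down to a timestamp comparison, sketched here with made-up dates (GNU date is assumed for parsing the header values):

```shell
# Compare the document's Last-Modified time with the client's
# If-Modified-Since header to choose between 200 and 304.
last_modified="Mon, 09 May 2005 10:00:00 GMT"
if_modified_since="Mon, 09 May 2005 09:00:00 GMT"

doc=$(date -u -d "$last_modified" +%s)
req=$(date -u -d "$if_modified_since" +%s)

if [ "$doc" -le "$req" ]; then
  answer="304 Not Modified"
else
  answer="200 OK"   # document changed after the client's copy was made
fi
echo "$answer"
```

With the dates above the document is newer than the client’s copy, so the full 200 response is sent.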

HTTP/1.1 provides entity tags (the ETag field), which can be used to establish a more complex treatment of different versions of a document (e.g. in different languages) on the server. This is discussed in great detail in [Wes01] and [FGM+99].

The Expires field provides the proxy with information about the lifetime of the document. Until this timestamp is reached, the proxy may deliver the page without revalidating it. By providing a timestamp in the past, the proxy can be told to revalidate the document upon every request.

By using the Cache-Control header, the proxy server can be given specific information about the caching options of the document. The most important values are public and private, which specify whether the document contains user-specific data; in the latter case the document may only be stored in the client’s personal cache. The value no-cache advises the proxy not to store the document under any circumstances.

For a quick stale check the field Content-Length can be used. If the document size stored in the cache does not match the specified one, the document is likely to have been changed in the meantime. Decisions based on this field are rather risky, though: for example, different character encodings (e.g. UTF-8 vs. ISO-8859-1) can lead to different lengths for documents with the same content.

Because of this, proxy servers usually only use this value to verify that they do not store a document that has not been transferred completely. A client must never receive incomplete data from the proxy.
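The encoding caveat is easy to demonstrate: the same character occupies a different number of bytes depending on the encoding (the byte sequences are written out explicitly below):

```shell
# "ä" encoded in UTF-8 (bytes 0xC3 0xA4) vs. ISO-8859-1 (byte 0xE4):
# identical content, different Content-Length.
utf8_len=$(printf '\303\244' | wc -c)
latin1_len=$(printf '\344' | wc -c)
echo "utf8=$utf8_len latin1=$latin1_len"
```

Two caches holding the same page in different encodings would thus disagree on its length even though the content is identical.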

6.1.2 Programmer’s view

Knowing the headers the proxy servers rely on, the programmer has to ensure that the web application provides the proxy with the correct headers.

As mentioned earlier there are two types of documents: static and dynamic ones. While the web server commonly takes care of the correct header fields for static data – it obtains, for instance, the last modification date from the operating system – the programmer is largely left alone with dynamic documents.

Quite an easy case is the wrapping of a static document (a mixture of a static and a dynamic document – semi-static), e.g. when an HTML document is embedded in the navigational elements of the site. Listing 6.1 shows how the appropriate fields can be set. This piece of code is adapted and enhanced from sample code in [Sch04].

Listing 6.1: Last modification check

1  <?php
2  function last_modified_headers($mod_time) {
3      $gmt_mtime = gmdate('D, d M Y H:i:s', $mod_time) . ' GMT';
4
5      if ($_SERVER['HTTP_IF_MODIFIED_SINCE'] == $gmt_mtime) {
6          header('HTTP/1.1 304 Not Modified');
7          exit;
8      } else {
9          session_cache_limiter("must-revalidate");
10         //header('Cache-Control: must-revalidate');
11         header('Last-Modified: ' . $gmt_mtime);
12     }
13 }
14 $document = 'static.html';
15 last_modified_headers(filemtime($document));
16 include('header.inc.php');
17 include($document);
18 include('footer.inc.php');
19 ?>

Note line 9: when using PHP sessions (via the function session_start()), the header field Cache-Control is rewritten by PHP. By specifying a replacement value with session_cache_limiter(), this behaviour can be controlled. One has to be careful with this option, because sessions can affect the users’ privacy (see Section 2.1.2).

A problem arises when either the header or the footer is changed: this script will still return an old Last-Modified header. Either the last modification dates of all three documents have to be taken into consideration (see Listing 6.2), or the modification time of the static page is simply updated using the command touch static.html. It depends on the application (e.g. whether data is retrieved from a database in header or footer) which method to choose.

Listing 6.2: Last modification check adapted – pres-skel-t.php

1  <?php
2  function last_modified_headers($mod_time) { /* code remains unchanged */ }
3
4  $documents = array('header.inc.php', 'static.html', 'footer.inc.php');
5  $last_modification = -1;
6  foreach ($documents as $document) {
7      if ($last_modification < filemtime($document)) {
8          $last_modification = filemtime($document);
9      }
10 }
11 last_modified_headers($last_modification);
12 foreach ($documents as $document) {
13     include($document);
14 }
15 ?>
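The same newest-file selection can be checked on the shell; the temporary files and timestamps below are made up for the demonstration:

```shell
# Pick the most recently modified of three files, mirroring the
# PHP loop that collects the newest filemtime().
dir=$(mktemp -d)
touch -d '2005-05-01 10:00' "$dir/header.inc.php"
touch -d '2005-05-03 10:00' "$dir/static.html"    # most recent
touch -d '2005-05-02 10:00' "$dir/footer.inc.php"

newest=""
for f in "$dir"/*; do
  if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
    newest=$f
  fi
done
echo "newest part: ${newest##*/}"
```

The timestamp of that newest part is what should be sent in the Last-Modified header.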

With dynamic pages it heavily depends on the content whether a time of last modification can be specified at all. Generally speaking, the last modification date is the date of the “youngest” part of the page. If this time is unavailable for a single part of the page (e.g. content included from a database), the last modification date of the whole page is unavailable. This case is not unlikely.

That is the reason why caching of whole pages is problematic for dynamically generated pages. Therefore, the solution using a proxy does not suffice for web applications.

6.1.3 Expected Results

For static documents (or “statified” dynamic documents) we can expect a high gain in speed when using the caching capabilities of the proxy.

It is difficult to estimate what effect turning off the cache has. This corresponds exactly to testing a purely dynamic script that is incapable of delivering a last modification date. A gain in speed can be expected with many concurrent requests or with slow clients.

6.2 Preparation

To prepare Squid for the server acceleration mode, the configuration file has to be modified. Squid is by default configured to be a client-side caching proxy.

This section will only give an overview of the most important options. All options that have been modified for the benchmark tests can be found in the patch file in the appendix, Listing A.2.

6.2.1 Configuring Apache

The first consideration is to have the proxy listen on the port of the web server, usually the well-known port 80. Before that, the web server must be told to listen on another port, because only one program can occupy a port at a time. The new port of the web server is arbitrary; the proxy server is configured to forward requests to this port anyway.

The necessary change needs to be made to the file /etc/apache2/ports.conf. The line Listen 81 will make Apache listen on port 81 instead of the default port 80.

6.2.2 Configuring Squid

For the Squid proxy server some more changes are needed. As we have configured the web server to listen on another port, Squid should listen on port 80 instead. This is done via the directive http_port 80 (see Listing 6.3).

Now we need to tell the proxy where the web server resides. This can be done using the httpd_accel_host and httpd_accel_port options. The values 127.0.0.1 (= localhost via the loopback interface) and 81 do the right thing.

As the developers have taken care of security in the default configuration of Squid, there is one more option to be changed: we need to allow everyone to access the proxy-accelerator (this is the usual purpose of a web server, as opposed to the audience of a proxy). The option http_access allow all does exactly that.

These options suffice for turning on the server acceleration mode. Modifying both an Apache configuration file and the Squid file requires a restart of both programs (typically done via the commands2 sudo /etc/init.d/apache2 restart and sudo /etc/init.d/squid restart). Still, there are two more options worth considering:

The acceleration switches automatically turn off the caching-proxy function. It is advisable to turn this function on again, which can be done via the extra option httpd_accel_with_proxy on. When the web server is used in name-based virtual host mode (when more than one (sub)domain points to one IP address), the HTTP/1.1 request field Host needs to be passed on by the proxy, too. This is turned off for security reasons, once again; the appropriate option is httpd_accel_uses_host_header on.

Listing 6.3: A reduced Squid configuration file – /etc/squid/squid.conf

53   http_port 80
1847 http_access allow all
2185 httpd_accel_host 127.0.0.1
2186 httpd_accel_port 81
2215 httpd_accel_with_proxy on
2235 httpd_accel_uses_host_header on

2The command sudo allows a standard user to execute a command with the rights of the super user (typically root). The scripts are executed with user privileges, but certain commands require more rights.

6.3 Results

Let us now have a look at the first results.

The testing started with optimized versions of skeleton.php and pres-skel.php (see Listings 6.1 and 6.2, providing last modification headers), but without Squid activated.

For the tests, 10,000 documents were requested by the load generator; for the comparison, the number of requests per second is taken as the measure (the total time of a test run can easily be calculated from it). This large number was chosen because static documents receive an extraordinarily high gain in speed (compare Table 6.1 with Table 6.2), so many requests are needed to obtain representative results.

Table 6.1: Benchmarking results (requests per second) without Squid

                          Concurrent requests
File (.php)      1      5      10     25     50     100    1000
skeleton-t     78.56  91.43  84.69  77.24  69.82  62.08  65.91
pres-skel-t     4.42   5.64   5.70   5.99   5.62   5.99   3.44*

* Aborted after 172 requests (because of a time limit of 60 s per request)

6.3.1 skeleton-t.php

The (optimized) ultimate skeleton skeleton-t.php is the only page that is completely independent of the database. The results for this page are therefore quite good (i.e. fast, see Figure 6.1). They mainly demonstrate the “full power” of the server, so this is roughly the upper bound: 3,868.6 requests per second with Squid turned on and 25 concurrent requests.

Table 6.2: Benchmarking results (requests/s) with Squid

                     Concurrent requests
File (.php)      1        5        10       25
skeleton-t     2,500.3  3,590.4  3,868.6  3,788.3
pres-skel-t    1,699.9  2,349.8  2,523.4  2,470.1

6.3.2 pres-skel-t.php

This skeleton is a more realistic test candidate (see Figure 6.2). Static pages (such as the About page or the contact form of the application Bandnews.org) wrapped through a script behave very similarly to pres-skel-t.php. This is only the case if the presentation skeleton stays the same upon each request – a desirable state.

Table 6.3: Benchmarking results (requests/s) with Squid (cont.)

                     Concurrent requests
File (.php)      50       100      1000
skeleton-t     3,681.1  3,385.7  2,709.5
pres-skel-t    2,005.9  1,940.1  1,912.9

As the difference in speed is so extraordinarily high (and the positive effect therefore cannot be overlooked), we concentrate on another aspect of high load: concurrent requests. In the other tests we will only look at a maximum of two different concurrency rates.

Figure 6.1: Squid benchmark for skeleton-t.php – requests per second (logarithmic scale) over 1 to 1,000 concurrent requests, with Squid disabled and enabled (values as in Tables 6.1–6.3)

Figure 6.2: Squid benchmark for pres-skel-t.php – requests per second (logarithmic scale) over 1 to 1,000 concurrent requests, with Squid disabled and enabled

Figure 6.3: Squid benchmark for index.php – requests per second at 10 and 100 concurrent requests, with Squid disabled and enabled

Table 6.4: Benchmarking results (requests per second) for index.php

          Concurrent requests
Squid      10     100
off        2.64   2.25
on         2.64   2.20

6.3.3 index.php

Figure 6.3 (Table 6.4) shows benchmark results of index.php with Squid turned on and off. There is no significant difference; in the case of 100 concurrent requests, turning Squid on even adds enough overhead to show up in the benchmark.

6.4 Conclusions for Squid

Squid can accelerate static and partially static pages massively (up to a factor of 420) when the caching functionality of the proxy is used.

For dynamic pages Squid is no solution; it can even decrease speed due to the added overhead. This is because Squid can only cache whole pages, while for dynamic pages often no date of last modification can be specified.

For even higher-traffic web applications (with regard to static pages), the (disk) I/O will become a major bottleneck. [Wes04] deals with this topic and possible solutions in great detail.

For dynamic pages a proxy server does not suffice for acceleration. Therefore, more testing is necessary in the following sections.

Chapter 7

APC

In this chapter the Advanced PHP Cache introduced in Section 4.6 will be used.

7.1 Considerations

The idea behind APC has been discussed in detail already (see sections 4.6 and 5.3.2). Nevertheless here is a short overview of what APC does:

7.1.1 Compiler Cache

As PHP is a scripting language, the script code has to be compiled to a runnable intermediate format each time the script is executed. This matches the idea of quick prototyping, as a change to the script file is applied immediately when the file is saved.

The ratio between necessary compilations and useless recompilations is very bad, though. Especially when the application is finished, recompilations without a code change exceed the necessary ones by far.

That is where compiler caches hook in: they store the result of the compilation and reuse this intermediate code for the next request. Before that, a quick check is made whether the script has been modified, of course. In that case the cached code is invalid and a recompilation is initiated.

The magnitude of this tuning even increases when you consider that the compilation process has to be initiated for every file that is included by the first script.

7.1.2 Code Optimization

A topic which has not yet been discussed but adds to the speed increase of compiled PHP scripts is code optimization. It is worth spending some effort (and therefore time) on optimizing the PHP code before storing it in the cache. The cost of doing this is minimal, considering that the optimized code will be reused a few thousand times at least.

Although there is no documentation for APC on the topic of code optimization, there are still quite a few resources. On the one hand, the author of another PHP cache, Nick Lindridge, has written an article covering code optimization [Lin02]; according to [Sch04] the optimizations done are quite similar. On the other hand, the source code is available (long live open source!) and documented well enough to give a brief overview of what it does.

These code optimizations should not be compared to what “real compilers” like GCC1 do [Jon05]. They concentrate instead on common cases that gain much from small adjustments; longer and very comprehensive analyses would not be worth the effort.

Here is a short overview of these so-called “peephole optimizations”:

• Removing unnecessary NOOPs. APC simply strips out any NOOP codes it finds. Even that is not trivial, as jumps have to be modified to match the shifted code positions.

• Gluing sequences of ADD_STRING together. It is quite a drawback of PHP (due to the inline replacement of variables, actually) that it splits a string into parts and executes an ADD_STRING command for each word.

• Converting $c++ to ++$c where possible. This is the case when the result is not used immediately (so-called void context). With $c++ an additional temporary variable is needed.

1The GNU Compiler Collection, a set of compilers that is almost always used to compile open source software.

• Stripping multiple jumps. If an if clause does not contain an else branch, a jump command points to the next instruction, which would be executed next anyway.
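The increment rewrite can be mimicked in shell arithmetic: post-increment needs a temporary to hold the old value, pre-increment does not – which is exactly why the rewrite is only safe when the value is unused (void context):

```shell
# Post-increment: the old value must be kept in a temporary.
c=0
tmp=$c; c=$((c + 1))
echo "post-increment yields $tmp, c is now $c"

# Pre-increment: no temporary needed, the new value is the result.
c=0
c=$((c + 1))
echo "pre-increment yields $c, c is now $c"
```

When nothing reads the yielded value, both forms leave c in the same state, so the cheaper pre-increment can be substituted.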

7.1.3 Outputting Data

Especially the second point in the list above shows the need for some more testing. Outputting is an important task for a script that returns data to the browser. Therefore the quickest method of transmitting data has to be determined.

In PHP there is a concept called “Output Buffering.”

Initially, output buffering was integrated into PHP because of the necessity to send HTTP headers before writing any output (compare to [Sur00]). PHP instantly sends the output of the script to the browser (it “flushes” its buffer), but headers can only be sent before any output. If you want to set a cookie (which is done in the HTTP header with the Set-Cookie field) after some text has already been printed, PHP returns an error message or the cookie is simply dismissed.

Output buffering not only enables the programmer to set headers at any stage; performance also benefits from the concept.

Instead of sending all data instantly to the browser, the data is stored in an internal buffer. Therefore, all headers can be modified until output buffering is terminated or the script is finished. At this point the headers and all data stored in the buffer are sent to the browser.

A fine feature is the optional callback function. It allows the programmer to modify the contents of the buffer before it is sent. This can be used to compress the output, e.g. with gzip (see below), or to modify it for compatibility with character encodings [Kir05].

7.1.4 Programmer’s View

The boring thing about this section is that there is nothing to do for the programmer.

7.2 Preparation

APC is activated easily. In the php.ini file just a line

extension = /usr/lib/php4/apc.so

has to be added (see the patch file in the appendix, Listing A.3).

7.2.1 Output Buffering

For testing the output behaviour, three additional scripts have been created. Common to all of them is the generation of lengthy random output from a double-quoted string (which PHP scans for contained variables), as shown in Listing 7.1.

Listing 7.1: Generate lengthy random output

1  <?php
2  for ($i = 0; $i < 5000; $i++) {
3      echo str_repeat("This is a test string", rand(1, 4));
4  }
5  ?>

Although it would be the obvious choice, these lines were deliberately not moved into a shared include file; this way the compiler cache stores them separately for each script.

no.ob-start(.php) uses no output buffering, i.e. the PHP function ob_start() is never called.

ob-start.nogz(.php) just calls ob_start() at the beginning of the script. This enables output buffering but does not set a post-processing callback function.

ob-start.gz(.php) registers an internal callback function for output buffering: ob_gzhandler. Before the data is sent, it is compressed, commonly using gzip. If the browser does not report gzip support (via the Accept-Encoding request header field), another supported compression method is chosen. For this test only gzip compression is used.
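The gzip variant thus differs from the plain script by a single extra line at the top; a sketch (ob_gzhandler requires PHP’s zlib extension, and the output is shortened here):

```php
<?php
// Register the gzip handler before any output is produced; it checks
// the Accept-Encoding request header and compresses only if allowed.
ob_start('ob_gzhandler');

echo str_repeat("This is a test string", 4);

ob_end_flush();
?>
```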

7.3 Results

The effects of switching on APC are not as extraordinary as those for Squid. They apply to all of the tested scripts.

Table 7.1: APC Benchmarking results (Requests/s)

                          APC off            APC on
  Concurrent requests     10       100       10       100
  skeleton-t.php          112.17   103.75    409.11   506.44
  pres-skel-t.php         7.27     7.26      9.01     8.91
  index.php               2.83     2.47      3.25     2.76
  links.php               4.31     3.69      4.90     4.19

skeleton-t.php (Figure 7.1) receives the highest benefit from APC. This is due to two points: There is no need to connect to the database and only little output is made. The speed gained from dismissing the database is quite obvious, but an interesting point is the cost of outputting data. We will take a closer look at this point in Section 7.3.1.

Figure 7.1: APC benchmark: skeleton-t.php

Figure 7.2: APC benchmark: pres-skel-t.php

Figure 7.3: APC benchmark: index.php

Figure 7.4: APC benchmark: links.php

Scripts that need to establish a connection to the database also gain speed. The difference in speed lies between 10 and 25 percent, which is quite a value for literally uncommenting a single line. In the rather uncommon case of no database connection (we will see later how to make scripts independent of a database), the gain reaches 500%.

The results for index.php (Figure 7.3) and links.php (Figure 7.4) show that with APC the same performance (in requests per second) can be achieved for 100 concurrent requests² as for 10 CCR without APC.

When comparing requests per second for different concurrency rates it still has to be considered that the speed from the view of a client still varies. If benchmark A and B have the same rate of requests/s, the total time is also the same regardless of the concurrency rate.

The response time “felt” by the client is therefore longer for more concurrent requests.

Considering two benchmarks with the same speed of 25 requests per second, 25 users requesting the page at the same time will have to wait one second each. If the request rate stays the same and 250 users want to access the page at the same time, every user has to wait 10 seconds. The request rate is still 25 requests per second (and therefore a “good” result).

²Throughout the thesis the abbreviation CCR will be used for concurrent requests.
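The arithmetic above can be restated in two lines (a sketch; the numbers are the ones from the example):

```php
<?php
// Mean wait per client = concurrent clients / throughput (requests/s).
function wait_seconds($concurrent, $requests_per_second) {
    return $concurrent / $requests_per_second;
}

echo wait_seconds(25, 25), "s\n";    // 25 users at 25 req/s: 1s each
echo wait_seconds(250, 25), "s\n";   // 250 users at 25 req/s: 10s each
?>
```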

7.3.1 Results for output testing

In the previous section, the output performance of PHP was claimed to be one cause of the achieved results. Therefore, some extra tests were included to check how different settings for outputting data perform.

As stated before and as can be seen in Figure 7.5, output buffering increases speed. Looking at the dark grey bars representing the results without APC, we see that for 10 concurrent requests output buffering increases speed by only 2%. With APC the gap grows to about 5% (see Table 7.2).

For 100 CCR there is not a gain but a loss. This can be explained by the higher memory usage that output buffering incurs. With many concurrent requests, a lot of data has to be stored in buffers: a mean size of 262,500 bytes sums up to 25 megabytes used only for buffered data (there is an additional overhead, of course). Together with the memory footprint of the Apache binaries this can quickly fill memory.

Table 7.2: Output benchmarking results (Requests/s)

             Files (.php)
  CCR   APC   no.ob-start   ob-start.nogz   ob-start.gz   skel-t
  10    off   38.77         40.29           28.72         112.17
  10    on    48.68         51.45           35.69         409.11
  100   off   37.59         35.32           26.86         103.75
  100   on    46.02         43.01           33.29         506.44

The slowest output comes from the gzip-compressed variant. This is due to the expensive algorithm for compressing data. Still, using this mode can be recommended: since less data needs to be transferred (text data is very suitable for compression [BK93]), bandwidth becomes less of a performance factor. It is also a cost factor: usually you have an account-based traffic limit; using compression you can serve more visitors at the same price.

The results for 100 CCR do not show a better performance for output buffering than with 10 CCR. Instead, there is only a (very small) gain with APC on and gzip compression turned off.

Figure 7.5: APC benchmark: Skeletons with(out) ob_start() (10 CCR)

Figure 7.6: APC benchmark: Skeletons with(out) ob_start() (100 CCR)

For documents that do not return any contents (skeleton-t.php) the rate is evidently even higher since no output has to be stored or sent. Only the headers are sent in this case. This reduces the used network bandwidth even more. The headers are not compressed under any circumstances.

Although the gain in speed is not enormous, the use of output buffering (with compression enabled) is strongly recommended. For large projects bandwidth also plays an important role, and even a small loss of speed combined with a reduction of traffic can save a lot of money.

7.4 Conclusions for APC

APC proves to speed up every PHP script that is requested more than once. This shows how much time is spent on compilation when executing a PHP script.

Turning off APC or not installing a compiler cache is simply a waste of CPU time; installing and activating APC should always be considered.

Chapter 8

MySQL

8.1 Considerations

An important aspect of most web applications is the database. Since the content is dynamic, it is commonly stored in a database, so the speed of the database is nearly as important as the speed of the script. Put differently, a tuned database also speeds up the scripts.

There are two concepts that will be tested in this thesis: the MySQL Query Cache and persistent connections. The third concept, query tuning, should also be taken into consideration, but is too extensive a topic for this thesis.

8.1.1 MySQL Query Cache

Version 4.0.1 of MySQL – the database used in this thesis, see Section 4.3 – supports a caching mechanism that allows quicker retrieval of common queries.

MySQL is commonly used as the database of a web application. Characteristic for this scenario are few changes to the stored data and many identical queries. The MySQL query cache reserves a given amount of storage for saving queries together with their results. A script typically issues exactly the same query (the SQL command has to match byte-by-byte) several times, and MySQL can use the cache to return the results instantly.

This concept works as long as nothing is changed in the database that affects the query. The most common “dangerous” commands are INSERT/REPLACE, UPDATE, and DELETE. When tables are modified, any relevant entries in the query cache are flushed; no stale data is ever returned.

When using the MySQL query cache, we may want certain queries not to be cached at all. There is a special command for this case, which will be discussed in Section 8.2.

8.1.2 Persistent Connections

A script that wants to access the database has to connect to the database daemon first. This is called “establishing a link”. Usually this is done via a TCP connection and an authentication mechanism.

The cost of connecting to a database and establishing a link depends heavily on the environment. The most important factors for connection speed are the speed (and/or latency) of the network interface (which can also be the very fast loopback interface if the DBMS resides on the same machine) and the load on the database machine. Depending on the configuration, a certain connection overhead will slow the script down.

The concept of persistent connections is somewhat similar to preforking of Apache (see Section 4.1): a set of connections is ready to be reused without having to go through the whole connection phase. The higher the overhead for a connection is, the higher the gain from persistent connections will be.

The drawbacks of persistent connections are caused by their persistence. If for some reason a link is ruined (e.g. by a connection loss or a faulty script), it cannot be reused any more. There exist concepts to detect broken connections and re-establish them, but they require extra overhead. A greater problem are table locks that have been left in place by mistake. A programmer can avoid this by using a so-called “shutdown function” that clears all locks when the script finishes.
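A sketch of such a shutdown function for the PHP 4 MySQL API (the function name is illustrative; the function_exists() guard merely keeps the sketch self-contained on installations without the mysql extension):

```php
<?php
// Clear any table locks the script may have left behind, so that a
// persistent link handed to the next request starts in a clean state.
function release_locks() {
    if (function_exists('mysql_query')) {
        mysql_query('UNLOCK TABLES');  // releases all locks held by this link
    }
}

// PHP calls this automatically when the script finishes (or dies).
register_shutdown_function('release_locks');
?>
```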

8.1.3 Query Tuning

Apart from caching and speeding up the environment, one also has to make the queries themselves behave well: take care that they are not (too) wasteful.

Often indices and good database layouts can improve the speed even more than caching techniques. In combination these techniques result in the high- est speed, of course.

Tuning queries is very application dependent, but [ZB04] gives a good introduction and leads to good starting points for optimizing the queries. One can commonly start with the slowest queries of the application; MySQL can log them automatically if a threshold of x seconds is specified.
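A sketch of the corresponding my.cnf fragment (option names as documented for MySQL 4.0; the 2-second threshold and the log path are arbitrary examples):

```
# /etc/mysql/my.cnf -- log every statement that takes longer than
# long_query_time seconds to the given file
log-slow-queries = /var/log/mysql/mysql-slow.log
long_query_time  = 2
```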

8.2 Preparation

The changes needed for this test are made in the configuration files of MySQL (for the query cache) and PHP (for generally enabling the persistent connection feature); additionally, it has to be ensured that the connection code of the scripts actually takes advantage of this feature.

To turn on the MySQL Query Cache (abbreviated MQC), simply the size of the cache has to be set to a non-zero value in the MySQL configuration file /etc/mysql/my.cnf (see also appendix, Listing A.4):

Listing 8.1: Activate MQC – /etc/mysql/my.cnf

59 query_cache_size = 26214400

The size is specified in bytes; here a cache of 25 megabytes is created.

Persistent connections are activated by enabling them in /etc/php4/apache2/php.ini. The number of possible links and persistent connections is usually set to unlimited (i.e. -1).

Listing 8.2: Activate persistent connections – /etc/php4/apache2/php.ini

598 mysql.allow_persistent = On

601 mysql.max_persistent = -1

604 mysql.max_links = -1

Additionally, we have to ensure that the scripts really use persistent connections. When using plain PHP, we need to call the function mysql_pconnect instead of mysql_connect to connect to the database.

Wrapper APIs (like PEAR::DB, see Section 4.3.1) need individual care. In PEAR::DB, already the construct

$db = DB::connect($dsn, true);

works. Listing 8.3 shows a more verbose and extensible solution.

Listing 8.3: Use persistent connections with PEAR::DB – db/db.php

require_once('DB.php');

$dsn = array(
    'phptype'  => 'mysql',
    'username' => 'bandnews_org',
    'password' => 'xyz',
    'hostspec' => 'localhost',
    'database' => 'bandnews_org',
);

$options = array(
    'persistent' => true,
);

$db = DB::connect($dsn, $options);

When using other wrapper APIs, the appropriate steps (usually well documented) need to be taken too, of course.

8.3 Results

For these tests, APC was turned on. This allows the MySQL query cache to show its full potential and makes the following results the most useful ones so far.

8.3.1 Query Cache

As can be seen in Figures 8.2–8.4 (pages 90–91) and Table 8.1, the important scripts now really gain speed and move into interesting regions regarding the possible requests per second. index.php moves up by 471% for 10 CCR and even by 784% for 100 concurrent requests. links.php also gets faster by three-digit percentages, an increase between 104% and 178%. The underlying pres-skel-t.php receives a similar gain.

Table 8.1: MQC Benchmarking results (Requests per second)

                          MQC off            MQC on
  Concurrent requests     10       100       10       100
  skeleton-t.php          429.61   338.48    404.08   308.76
  pres-skel-t.php         9.38     9.27      56.31    53.12
  index.php               3.29     2.07      18.80    18.30
  links.php               5.05     3.66      10.33    10.19

The only exception is skeleton-t.php (Figure 8.1, page 90). There is no gain but a slight loss, although the numbers can be taken as equal due to measurement inaccuracy: only 1,000 requests were tested, and the original request rate already started around 400 requests/s.

If a test took for any reason (e.g. an I/O event) 0.1 seconds longer than the original one (2.5s vs. 2.6s), the measured request rate would already drop from 400 to 385 requests per second. Another point is that, due to the query cache of 25 megabytes, there is less memory available.

Table 8.2 (page 92) shows, for the requests per second (rps), the corresponding (mean) number of seconds it takes to generate a page (gt). This follows the simple formula rps = 1/gt.
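The relation can be checked directly against the measured values (a sketch; the two rates are the index.php results at 10 CCR from Table 8.1):

```php
<?php
// gt = 1 / rps: mean page generation time from the measured request rate.
function generation_time($requests_per_second) {
    return 1 / $requests_per_second;
}

printf("%.3fs\n", generation_time(3.29));   // MQC off: 0.304s
printf("%.3fs\n", generation_time(18.80));  // MQC on:  0.053s
?>
```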

Figure 8.1: MQC benchmark: skeleton-t.php

Figure 8.2: MQC benchmark: pres-skel-t.php

Figure 8.3: MQC benchmark: index.php

Figure 8.4: MQC benchmark: links.php

Table 8.2: Comparison: Requests per second – Generation time

                      MQC off             MQC on
  File                Rps      G. time    Rps      G. time
  skeleton-t.php      429.61   0.0023s    404.61   0.0025s
  pres-skel-t.php     9.38     0.107s     56.31    0.0176s
  index.php           3.29     0.304s     18.80    0.053s
  links.php           5.05     0.198s     10.33    0.097s

8.3.2 Persistent Connection

This test was done with an activated MySQL query cache. skeleton-t.php was not tested since it does not use the database.

The results (Table 8.3) for this test do not show any significant change in speed. Only index.php (Figure 8.6) profits a little. This result is quite evident, though: none of the pro-persistent-connection arguments really fits our scenario. The database server resides on the same machine, there is no network latency, and the system load is low. Nevertheless, as the use of persistent connections does not slow anything down significantly, it is arbitrary whether to use them or not. The author feels more comfortable with reusing stuff that ain’t broke¹.

Table 8.3: Persistent connection benchmarking results (R/s)

                          Temp. conn         Persist. conn
  Concurrent requests     10       100       10       100
  pres-skel-t.php         54.81    51.46     53.80    51.19
  index.php               18.68    14.97     19.41    18.17
  links.php               10.32    10.08     10.21    10.00

¹Referring to the common saying (not only) among computer scientists: If it ain’t broke, don’t fix it.

Figure 8.5: Persistent connection benchmark: pres-skel-t.php

Figure 8.6: Persistent connection benchmark: index.php

Figure 8.7: Persistent connection benchmark: links.php

8.4 Conclusions for MySQL

Web applications such as Bandnews.org profit enormously from the MySQL query cache. Especially for sequential testing the results are amazing. According to [AB04] the overhead is minimal even for frequently changed tables. Turning this feature on should always be considered.

The power of persistent connections did not quite show in the benchmarks. This is primarily because script and database run on the same machine, so expensive factors for link establishment such as network latency do not come into play. Still, the use of persistent connections can be recommended: thanks to detection techniques for damaged connections and practically unlimited connection numbers, the minimal advantages of spontaneous connections do not weigh much.

Chapter 9

Smarty Caching

In this chapter the PHP scripts will be tuned using the Smarty caching feature (introduced in Section 4.4).

9.1 Considerations

This testing method is different from those described before: based on the knowledge of the tools used so far, we will modify the scripts themselves to achieve the best results.

The tool Smarty also provides compiling and caching functionality. These abilities will be used here.

9.1.1 Caching Page Parts

As already discussed in Section 6.1.2, it is difficult for a dynamic script to report its last modification date since the data is received from a database. Therefore, no caching with Squid is possible.

As we defined earlier the last modification date is the date of the “youngest” part of the page. Furthermore, if for one part of the page no such date can be determined, the last modification date for the whole page is unavailable. This takes proxy servers, such as Squid, out of play.

If we descend a level and move the caching into the script (for non-static pages), we still cannot do anything useful about the part whose last modification date is unavailable. We can, however, cache all other parts that do provide such a date.

Moving the caching into the PHP script makes it more vulnerable to bugs that deliver outdated (stale) information. Special care therefore has to be taken with the implementation.

When speaking about caching in this field we usually mean a combination of compilation and caching. This is also the approach that Smarty takes.

The template files, which follow their own syntax, are transformed into a PHP script on first loading. From that point on only the compiled template file is accessed, as long as the template file is not modified. This can already be considered a kind of caching.

When the caching feature of Smarty is enabled, loops and sections are additionally eliminated: the output of the script (corresponding to the template file) is stored and delivered as-is. This causes another speed increase. Removing the programming constructs requires further care, which is discussed in the next section.

9.1.2 Database Usage

As demonstrated in Section 7.3, scripts that do not even touch the database (such as skeleton-t.php) run considerably faster. The idea of caching only those parts of the page that allow the detection of a last modification date does not go far enough for that.

The concept of reducing (or even eliminating) database access can be compared to the MySQL query cache. The idea behind it is as follows: if the database did not change, the whole application (at least the parts that rely on database selects, commonly unpersonalized pages) can be stored on disk. A database change usually affects only certain parts of the application.

If the application is notified about the modification of a certain part of the page, it can clear the corresponding caches and have the other parts remain in cache.

When done carefully, many dynamic pages can be made semi-static. Especially the main page (index.php) is worth the effort, as it is typically the most frequently accessed page of a web application.

The parallel to the MySQL query cache is evident (the data from SQL queries could equally be stored in and retrieved from the query cache). If the query cache is turned on, the additional effort of modifying existing scripts may seem useless.

Whether this is true depends on the complexity of post-processing the data. On the index page of Bandnews.org, for example, a news item is put together from many tables (combining band name, genres, and news data) and requires additional processing (beautifying URIs, search keyword highlighting). If there were no post-processing, the MySQL query cache would indeed suffice.

From its first generation on, the news item stays about the same¹. If it is stored at that point, the database is not needed at all to display it and is therefore not even invoked.

For notifying the application there exist several concepts. For example:

• A file can be stored on the hard disk and carry the modification information in its file modification time.

• More favourable is the direct deletion of the corresponding cache parts. If the caching functionality of Smarty is used carefully, the cache for a single file can be spread over several directories, for example depending on the section where it is used or on the given parameters.

It is important to mention that this method needs additional effort and consideration for the administrative part of the site. All parts of the application where we expect changes need to be aware of the caching aspect and act accordingly. This can be achieved by centralizing the modification code (duplicated code should be eliminated anyway), e.g. by glueing together the database call and the cache clearance in one function.
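Such a glue function might look as follows. This is only a sketch: the function and template names are illustrative, not taken from the Bandnews.org code, and $db/$smarty are assumed to be the PEAR::DB and Smarty objects set up earlier.

```php
<?php
// Map a band id to the cache ID group that stores its news items.
function newsitem_cache_id($band_id) {
    return 'newsitem|' . (int)$band_id;
}

// Glue database modification and cache clearance together so that
// no caller can modify the data and forget the invalidation.
function update_band_news($db, $smarty, $band_id, $sql) {
    $db->query($sql);                      // modify the database ...
    $smarty->clear_cache('newsitem.tpl',   // ... and drop exactly the
        newsitem_cache_id($band_id));      // cached parts it affects
}
?>
```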

¹The feature of dynamic time display (x hours and y minutes ago) still needs some PHP processing. This small function can be inserted into the cached document.

9.2 Preparation

To easily introduce the use of the Smarty cache, the author proposes a function for including files. The function is called load and introduces the following convention (with $file as the file to be included, without extension):

• The file inc/$file.inc.php will be loaded (and must therefore exist).

• The template file templates/$file.tpl is displayed.

Stale documents can be determined either by using the last modification date of the included file or by the second possibility (see previous section): if the cache file does not exist (i.e. it has been deleted), the cached copy is rebuilt.

Listing 9.1: The load function in inc/setup.inc.php

function load($page, $cacheid = '', $load_php = true) {
    global $smarty;

    if ($load_php && !$smarty->is_cached($page . '.tpl', $cacheid, $_SESSION['language'])) {
        include(ROOT_PATH . 'inc/' . $page . '.inc.php');
    }

    if ($page == 'newsitem') {
        include_once(ROOT_PATH . 'inc/genfunc.inc.php');
        $source = $smarty->fetch($page . '.tpl', $cacheid, $_SESSION['language']);
        $source = fixtime($source);
        echo $source;
    } else {
        $smarty->display($page . '.tpl', $cacheid, $_SESSION['language']);
    }
}

The load function (see Listing 9.1) is quite universal but still adapted to Bandnews.org. For example, there is a special branch for news items that dynamically replaces absolute time (e.g. “March 3, 2005, 3:20 p.m.”) with relative time (e.g. “1 hour 3 minutes ago”).

It also provides support for a caching ID: a template is often displayed with varying parameters in different contexts, which can be handled with caching IDs. Internally, such an ID represents a directory in the cache; even subdirectories can be specified using the pipe character (|) as separator. If $cacheid is specified carefully, (only) related cached files are placed in the same directory and can easily be deleted to invalidate the cache.
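A sketch of how such grouped cache IDs can be used (the template name and IDs are illustrative; the Smarty calls are commented out because they need a configured Smarty object):

```php
<?php
// One cached copy per band and language; each '|'-separated part
// becomes one directory level inside Smarty's cache directory.
$cacheid = 'band|123|en';

// Roughly the relative path of the cached copy:
$cache_path = str_replace('|', '/', $cacheid);   // "band/123/en"

// $smarty->display('newsitem.tpl', $cacheid);        // fill/serve the cache
// $smarty->clear_cache('newsitem.tpl', 'band|123');  // invalidate all
//                                                    // languages of band 123
?>
```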

An important point is that the associated include file is only loaded if no cached copy is available. This behaviour can eliminate database calls or expensive execution of other PHP code. If the load function (or the corresponding check via the $smarty->is_cached method) were not used, the PHP code would still be executed even though Smarty does not even touch the generated contents.

Listing 9.2 shows a modified version of the presentation skeleton. The last-modification code (see page 67, Listing 6.2) from pres-skel-t.php was not reused, as it is taken for granted that no last modification date could be determined anyway.

Listing 9.2: The adapted presentation skeleton – pres-skel-n.php

<?php
require('inc/setup.inc.php');

load('header');
load('menu');
load('sidebar');

load('footer');
?>

9.3 Results

The benchmarks show another large improvement in speed and let the pages be generated really fast. MySQL query cache and APC were also activated for this test.

The skeleton (skeleton-t.php, see Figure 9.1) is once again the only script to lose speed. Since the request rates are very high, this can hardly be felt, but it is still important for our analysis: the activation of the Smarty cache seems to add some overhead, and there is the previously observed speed loss due to memory occupied by the MySQL query cache.

All other scripts show high gains (compare with Table 9.1, page 103):

• pres-skel-n.php (Figure 9.2) receives gains between 151% and 178%: header, menu, sidebar and footer are no longer dependent on the database. Instead just the file is loaded and instantly returned to the browser.

• For index.php (Figure 9.3, page 102) each of the previous points applies. There remain dynamic fields such as relative dates for news items, but they are not very expensive. Additionally, 6 news items and a band header for each item are displayed. The news items are cached separately in order to share them between the common pages (band page, index page, and search page). Usually no connection to the database is established unless it is explicitly needed: on the search page, or when a user is logged in. In the latter case each band header receives a plus or minus sign for adding the band to or deleting it from the user’s band list, and a custom area displayed in the sidebar uses the database. Overall, the majority of users are not logged in and therefore do not use the database at all; this enables the highest speed boost for them. Gains lie between 207% (10 CCR) and 236% (100 CCR).

• links.php profits most from the eliminated database queries (even though they had already been sped up by the MySQL query cache).

Figure 9.1: Smarty Caching benchmark: skeleton-t.php

Figure 9.2: Smarty Caching benchmark: pres-skel-n.php

Figure 9.3: Smarty Caching benchmark: index.php

Figure 9.4: Smarty Caching benchmark: links.php

When looking at Figure 9.4, enormous gains of 1227% (10 CCR) resp. 1070% (100 CCR) can be enjoyed.

The results now profit mainly from APC. The MySQL query cache is only used for filling the cache and for dynamic fields, if the script needs any at all.

Table 9.1: Smarty Caching Benchmarking results (Requests per second)

                          SC off             SC on
  Concurrent requests     10       100       10       100
  skeleton-t.php          386.24   263.13    328.72   228.06
  pres-skel-n.php         57.20    54.76     158.88   137.22
  index.php               20.16    17.15     61.80    57.65
  links.php               10.06    10.04     133.48   117.44

9.4 Conclusions for Smarty Caching

This chapter shows the power a well-written PHP script has. When the cache is used carefully and is correctly cleared whenever a manipulation happens, another enormous speed-up is possible. This is especially true for the pages that could not really be tuned by external means, index.php and links.php. An important point of this form of caching is that only a part of the previously used caching methods remains active. For example, the MySQL query cache is only used for generating and filling the cache (this applies to the tested pages).

Chapter 10

Conclusions

In this thesis we tested several caching techniques concentrating on different aspects of the generation and distribution of dynamically generated pages belonging to a web application.

The results show that a combination of the demonstrated techniques leads to a useful result. All in all, speed gains of up to 2,240% (index.php without caching vs. with Smarty Caching, see Table 10.1 and Figure 10.1, page 107) are possible if the application is tuned carefully. This applies to pages that had already been written with some care for speed: the original page rendered in about 0.5 seconds and now does the same in less than 0.1s.

While the external caching methods (Squid, APC, and MySQL query cache) do not need very much effort by the programmer, Smarty Caching involves an application design with this option in mind or requires a redesign.

Table 10.1: Overall Benchmarking results (Requests/s)

  File (.php)   No Tuning   Squid     APC      MySQL    Smarty
  skeleton      84.69       3868.60   409.11   404.08   –
  pres-skel     5.70        2523.40   9.01     56.31    158.88
  index         2.64        2.64      3.25     18.80    61.80


10.1 Further Work

The work for this thesis was carried out on a single desktop PC only, yet its results already show what potential caching has. It is not very favourable (but very common) to have the three main programs (proxy, web, and database server) reside on the same machine.

Further speed-up can be achieved by moving each service to a dedicated machine. The proxy server provides (as already mentioned) means for load balancing, so the web server can be designed redundantly. The database server can be split into several servers that can be clustered – or at least split into master and slave databases, with the slaves handling search operations.

A network of computers with the needed software already requires quite a budget. In contrast, everything used in the work on this thesis is software-based and built on freely available open source software.

Figure 10.1: Overall Benchmarking results

Appendix A

File Sources

A.1 Benchmark Script

Listing A.1 shows the source code of the benchmarking script which was developed for this thesis. The script takes patch files (see A.2, page 112) as parameters.

Listing A.1: Benchmark – benchmark.sh

#!/bin/sh

# configuration

if [ -z "$EXTERNAL_CONFIG" ]; then
    PRETEST_NUMCONNS=1
    PRETEST_SLEEP=5
    NUMCONNS=1000
    CONCURR=100
    FILES="skeleton.php pres-skel.php index.php links.php"
fi
COUNT=0

# check for correct parameter count
check_params () {
    if [ -z "$1" ]; then
        echo "please specify at least one patch script"
        exit 1
    fi

    until [ -z "$1" ]; do
        if [ ! -e $1 ]; then
            echo "patch file $1 could not be found."
            exit 1
        fi
        shift
    done
}

# all config files are prepared, run the test
run_benchmark () {
    LOGFILE=$1.log
    CHARTFILE=../$1.chart
    # empty files
    rm -f $LOGFILE $CHARTFILE

    echo ""
    echo $1

    # restart programs to have fair results (need to shut down all of them first)
    sudo /etc/init.d/apache2 stop
    sudo /etc/init.d/mysql stop
    sudo /etc/init.d/squid stop
    sudo /etc/init.d/apache2 start
    sudo /etc/init.d/mysql start
    sudo /etc/init.d/squid start
    sudo rm -rf /var/www/bandnews/cache/*

    echo starting tests..

    # create a chart file as input for gnuplot
    echo $1 > $CHARTFILE
    echo File Requests_per_second >> $CHARTFILE

    # test each of these files
    for f in $FILES; do
        echo -n testing $f...
        if [ $PRETEST_NUMCONNS -gt 0 ]; then
            ab -n $PRETEST_NUMCONNS -c $CONCURR http://localhost/$f > /dev/null
            sleep $PRETEST_SLEEP
        fi

        ab -t 60 -n $NUMCONNS -c $CONCURR -H 'Accept-Encoding: gzip' http://localhost/$f >> $LOGFILE.$f
        echo finished.
        REQSEQ=`grep "Requests per" $LOGFILE.$f | awk '{ print $4 }'`

        cat $LOGFILE.$f >> $LOGFILE
        rm -f $LOGFILE.$f

        echo $f $REQSEQ >> $CHARTFILE

    done
}

run () {
    let COUNT=$COUNT+1
    RUN=1
    if [ "$RUN_ONLY" ] && [ $COUNT -ne $RUN_ONLY ]; then
        RUN=0
    fi
    if [ $RUN -eq 1 ]; then
        run_benchmark $1
    fi
}

# the config files are prepared here and start the benchmark when done
test_run () {
    # the identifier is used to name the test run
    local identifier=$1
    shift
    local cur=$1
    shift

    if [ -n "$1" ]; then
        test_run ${identifier}_${cur}0 "$@"
    else
        run ${identifier}_${cur}0
    fi

    sudo patch -p0 -i $cur
    if [ -n "$1" ]; then
        test_run ${identifier}_${cur}1 "$@"
    else
        run ${identifier}_${cur}1
    fi

    # undo the patch
    sudo patch -R -p0 -i $cur
}

check_params "$@"
# include the test parameters in the file name
test_run "${PRETEST_NUMCONNS}-${PRETEST_SLEEP}-${NUMCONNS}-${CONCURR}" "$@"
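The recursive test_run function benchmarks every on/off combination of the supplied patches (2^n runs for n patch files). The following sketch reproduces just that enumeration: run() is replaced by echo, the patch calls are stubbed out as comments, and the name enumerate is ours.

```shell
#!/bin/sh
# Sketch of test_run's combination enumeration, without patching
# or benchmarking anything.
run () { echo "$1"; }

enumerate () {
    local identifier=$1
    shift
    local cur=$1
    shift

    if [ -n "$1" ]; then
        enumerate ${identifier}_${cur}0 "$@"
    else
        run ${identifier}_${cur}0
    fi
    # real script: sudo patch -p0 -i $cur
    if [ -n "$1" ]; then
        enumerate ${identifier}_${cur}1 "$@"
    else
        run ${identifier}_${cur}1
    fi
    # real script: sudo patch -R -p0 -i $cur
}

enumerate base squid apc
```

Invoked with the patches squid and apc, the real script would accordingly benchmark four configurations, named base_squid0_apc0 through base_squid1_apc1.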

A.2 Patch Files

Listing A.2: Squid patch file – squid

*** /etc/squid/squid.conf	2005-04-12 09:13:43.965791648 +0200
--- squid.conf	2005-04-12 09:14:07.775172072 +0200
***************
*** 50,56 ****
  # visible on the internal address.
  #
  # Default:
! # http_port 3128
  
  # TAG: https_port
  # Note: This option is only available if Squid is rebuilt with the
--- 50,56 ----
  # visible on the internal address.
  #
  # Default:
! http_port 80
  
  # TAG: https_port
  # Note: This option is only available if Squid is rebuilt with the
***************
*** 1844,1850 ****
  # of your access lists to avoid potential confusion.
  #
  # Default:
! # http_access deny all
  #
  #Recommended minimum configuration:
  #
--- 1844,1850 ----
  # of your access lists to avoid potential confusion.
  #
  # Default:
! http_access allow all
  #
  #Recommended minimum configuration:
  #
***************
*** 2182,2188 ****
  # the 'httpd_accel_with_proxy' option.
  #
  # Default:
! # httpd_accel_port 80
  
  # TAG: httpd_accel_single_host on|off
  # If you are running Squid as an accelerator and have a single backend
--- 2182,2189 ----
  # the 'httpd_accel_with_proxy' option.
  #
  # Default:
! httpd_accel_host 127.0.0.1
! httpd_accel_port 81
  
  # TAG: httpd_accel_single_host on|off
  # If you are running Squid as an accelerator and have a single backend
***************
*** 2211,2217 ****
  # setting)
  #
  # Default:
! # httpd_accel_with_proxy off
  
  # TAG: httpd_accel_uses_host_header on|off
  # HTTP/1.1 requests include a Host: header which is basically the
--- 2212,2218 ----
  # setting)
  #
  # Default:
! httpd_accel_with_proxy on
  
  # TAG: httpd_accel_uses_host_header on|off
  # HTTP/1.1 requests include a Host: header which is basically the
***************
*** 2231,2237 ****
  # require the Host: header will not be properly cached.
  #
  # Default:
! # httpd_accel_uses_host_header off
  
  # TAG: httpd_accel_no_pmtu_disc on|off
  # In many setups of transparently intercepting proxies Path-MTU
--- 2232,2238 ----
  # require the Host: header will not be properly cached.
  #
  # Default:
! httpd_accel_uses_host_header on
  
  # TAG: httpd_accel_no_pmtu_disc on|off
  # In many setups of transparently intercepting proxies Path-MTU
*** /etc/apache2/ports.conf	2005-04-12 09:13:43.963791952 +0200
--- ports.conf	2005-04-12 09:13:24.624731936 +0200
***************
*** 1 ****
! Listen 80
--- 1 ----
! Listen 81

Listing A.3: APC patch file – apc

*** /etc/php4/apache2/php.ini	2005-04-07 16:49:07.454799952 +0200
--- php.ini	2005-04-07 16:50:32.645848944 +0200
***************
*** 1077,1080 ****
  ; End:
  extension=curl.so
  extension=mysql.so
! ;extension=apc.so
--- 1077,1080 ----
  ; End:
  extension=curl.so
  extension=mysql.so
! extension=apc.so

Listing A.4: MySQL Query Cache patch file – mqc

*** /etc/mysql/my.cnf	2005-04-07 17:16:04.555963208 +0200
--- my.cnf	2005-04-07 17:27:01.430103160 +0200
***************
*** 56,62 ****
  # Query Cache Configuration
  #
  query_cache_limit = 1048576
! query_cache_size = 0
  query_cache_type = 1
  #
  # Here you can see queries with especially long duration
--- 56,62 ----
  # Query Cache Configuration
  #
  query_cache_limit = 1048576
! query_cache_size = 26214400
  query_cache_type = 1
  #
  # Here you can see queries with especially long duration

Listing A.5: Persistent connection patch file – persist

*** /etc/php4/apache2/php.ini	2005-04-24 13:15:06.384164796 +0200
--- php.ini	2005-04-24 13:14:15.499717762 +0200
***************
*** 595,601 ****
  
  [MySQL]
  ; Allow or prevent persistent links.
! mysql.allow_persistent = Off
  
  ; Maximum number of persistent links. -1 means no limit.
  mysql.max_persistent = -1
--- 595,601 ----
  
  [MySQL]
  ; Allow or prevent persistent links.
! mysql.allow_persistent = On
  
  ; Maximum number of persistent links. -1 means no limit.
  mysql.max_persistent = -1
*** /var/www/bandnews/db/db.php	2005-04-26 09:48:44.140531952 +0200
--- db.php	2005-04-26 09:48:52.170311240 +0200
***************
*** 12,18 ****
  
  $options = array(
  'debug' => 0,
! 'persistent' => false,
  );
  
  $db = DB::connect($dsn, $options);
--- 12,18 ----
  
  $options = array(
  'debug' => 0,
! 'persistent' => true,
  );
  
  $db = DB::connect($dsn, $options);

Listing A.6: Smarty Caching patch file – persist

*** /var/www/bandnews/inc/smarty.inc.php	2005-04-26 12:54:03.921068360 +0200
--- smarty.inc.php	2005-04-26 12:53:47.958495040 +0200
***************
*** 10,16 ****
  $this->config_dir = ROOT_PATH . "/config/";
  $this->register_block('dynamic', 'smarty_block_dynamic', false);
  $this->register_modifier('convert_to_class', 'smarty_modifier_convert_to_class', false);
! $this->caching = false;
  $this->cache_lifetime = -1;
  $this->use_sub_dirs = true;
  $this->security = false;
--- 10,16 ----
  $this->config_dir = ROOT_PATH . "/config/";
  $this->register_block('dynamic', 'smarty_block_dynamic', false);
  $this->register_modifier('convert_to_class', 'smarty_modifier_convert_to_class', false);
! $this->caching = true;
  $this->cache_lifetime = -1;
  $this->use_sub_dirs = true;
  $this->security = false;

Appendix B

List of Figures

3.1 Screenshot of Bandnews.org ...... 20

3.2 Screenshot of myBandnews while selecting personal bands .. 22

4.1 The MVC design pattern ...... 33

4.2 Three-tier architecture ...... 34

4.3 Screenshot of the output of the alternating backgrounds example ...... 36

4.4 PHP script execution ...... 40

4.5 Script execution with compiler cache ...... 41

5.1 Processing a Request ...... 51

5.2 Script layers of an application script ...... 56

5.3 Typical output while benchmarking ...... 61

6.1 Squid benchmark: skeleton-t.php ...... 72

6.2 Squid benchmark: pres-skel-t.php ...... 73

6.3 Squid benchmark: index.php ...... 73

7.1 APC benchmark: skeleton-t.php ...... 79

7.2 APC benchmark: pres-skel-t.php ...... 80

7.3 APC benchmark: index.php ...... 80

7.4 APC benchmark: links.php ...... 81

7.5 APC benchmark: Skeletons with(out) ob_start() (10 CCR) 83


7.6 APC benchmark: Skeletons with(out) ob_start() (100 CCR) 83

8.1 MQC benchmark: skeleton-t.php ...... 90

8.2 MQC benchmark: pres-skel-t.php ...... 90

8.3 MQC benchmark: index.php ...... 91

8.4 MQC benchmark: links.php ...... 91

8.5 Persistent connection benchmark: pres-skel-t.php ...... 93

8.6 Persistent connection benchmark: index.php ...... 93

8.7 Persistent connection benchmark: links.php ...... 94

9.1 Smarty Caching benchmark: skeleton-t.php ...... 101

9.2 Smarty Caching benchmark: pres-skel-n.php ...... 101

9.3 Smarty Caching benchmark: index.php ...... 102

9.4 Smarty Caching benchmark: links.php ...... 102

10.1 Overall Benchmarking results ...... 107

Appendix C

List of Tables

6.1 Benchmarking results (Requests per second): Without Squid 71

6.2 Benchmarking results (Requests/s): With Squid ...... 71

6.3 Benchmarking results (Requests/s): With Squid (cont.) ... 72

6.4 Benchmarking results (Requests per second): index.php ... 74

7.1 APC Benchmarking results (Requests/s) ...... 79

7.2 Output benchmarking results (Requests/s) ...... 82

8.1 MQC Benchmarking results (Requests per second) ...... 89

8.2 Comparison: Requests per second – Generation time ..... 92

8.3 Persistent connection benchmarking results (R/s) ...... 92

9.1 Smarty Caching Benchmarking results (Requests per second) 103

10.1 Overall Benchmarking results (Requests/s) ...... 105

Appendix D

List of Listings

2.1 Output of the uptime command ...... 15

2.2 A part of the output of the top command ...... 15

4.1 Hello World in PHP – helloworld.php ...... 27

4.2 Hello World in Smarty – hello.tpl ...... 34

4.3 Hello World in Smarty – hello.php ...... 34

4.4 Highlighting alternating lines – alternate.tpl ...... 35

4.5 Highlighting alternating lines – alternate.php ...... 35

5.1 A Presentation Skeleton – pres-skel.php ...... 57

5.2 Creating a patch file for the MySQL query cache ...... 60

5.3 Program versions ...... 61

6.1 Last modification check ...... 66

6.2 Last modification check adapted – pres-skel-t.php ..... 67

6.3 A reduced Squid configuration file – /etc/squid/squid.conf 70

7.1 Generate lengthy random output ...... 78

8.1 Activate MQC – /etc/mysql/my.cnf ...... 87

8.2 Activate persistent connections – /etc/php4/apache2/php.ini 87

8.3 Use persistent connections with PEAR::DB – db/db.php ... 88

9.1 The load function in inc/setup.inc.php ...... 98

9.2 The adapted presentation skeleton – pres-skel-n.php .... 99

A.1 Benchmark – benchmark.sh ...... 109


A.2 Squid patch file – squid ...... 112

A.3 APC patch file – apc ...... 115

A.4 MySQL Query Cache patch file – mqc ...... 115

A.5 Persistent connection patch file – persist ...... 116

A.6 Smarty Caching patch file – persist ...... 117