Masaryk University, Faculty of Informatics

SQL query cache for the MySQL system

Master's Thesis

Brno, April 2002
Martin Klemsa

Hereby I declare that this thesis is my own genuine copyrighted work, which I elaborated by myself. All sources and literature that I used or consulted are properly cited and their complete references are given.

Acknowledgements

I would like to thank Jan Pazdziora, the advisor of my thesis, for his help throughout the work. I am very grateful for his advice and valuable discussions. I would also like to thank Michael “Monty” Widenius, the MySQL chief developer, for a never-ending stream of suggestions. Further thanks go to my family and my friends for their support during the time I was working on this thesis.

Abstract

MySQL is a free open source database server. For the purpose of increasing its performance, this paper presents an SQL query cache enhancement for the server. This new feature aims at increasing the server response speed, reducing its load and saving its resources by storing the results of evaluated selection queries and retrieving them in case of their repeated occurrence. Reviews of previous work done on related subjects are given. A system overview of the developed SQL query cache is presented together with a discussion of the problems encountered during its design and implementation. A comparison with the newly released version of MySQL which contains a built-in query cache is also presented. Benchmarks demonstrating the performance increase of the modified server with repeated selection queries and a discussion of further enhancement and refinement possibilities conclude this paper.

Keywords: database, caching, hashing, MySQL, SQL, query cache

Contents

1 Introduction
  1.1 Data retrieval
  1.2 Structured Query Language
  1.3 The MySQL database server
    1.3.1 SQL query cache enhancement
  1.4 The included CD-ROM
  1.5 Contents overview

2 Caching
  2.1 CPU caching
  2.2 World Wide Web caching
  2.3 Database web server query caching
    2.3.1 Active query caching
  2.4 Database query caching
  2.5 Database query cache requirements
    2.5.1 Memory requirements
  2.6 Differences between various types of caching

3 Hashing
  3.1 Collisions
    3.1.1 Open addressing
    3.1.2 Separate chaining
    3.1.3 Growing hashing tables
  3.2 The internal MySQL hashing table
    3.2.1 Fowler/Noll/Vo hash

4 System Overview
  4.1 The SQL query cache design
  4.2 The MySQL source code
  4.3 Server – client communication
    4.3.1 Sending fields
    4.3.2 Sending results
  4.4 Invalidation
    4.4.1 Naive approach
    4.4.2 Gradual refinement
  4.5 Preventing parsing
    4.5.1 Table list entries duplicities
  4.6 Testing queries
    4.6.1 Sending modified data
    4.6.2 The mar_sql_not_cacheable flag
    4.6.3 Temporary tables
    4.6.4 Special functions
    4.6.5 Procedures and UDF functions
    4.6.6 Variables
    4.6.7 MERGE table types
  4.7 Special MySQL options
  4.8 Caching empty results
  4.9 Caching more queries
  4.10 Generating hash key of stored items
    4.10.1 Active database name
    4.10.2 MySQL environment variables
  4.11 Memory limits
    4.11.1 sql_cache_memory_limit
    4.11.2 sql_cached_query_memory_limit
    4.11.3 Determining cache size
  4.12 Cache replacement algorithm
  4.13 Fine-tuning the cache
    4.13.1 Ensuring thread safety
    4.13.2 Debugging compliance within MySQL
    4.13.3 Cache disabling situations
  4.14 The SQL query cache user interface
    4.14.1 Compile time options
    4.14.2 Command line options
    4.14.3 Client commands

5 Comparison with MySQL 4.0.1-alpha
  5.1 Differences
    5.1.1 Temporary tables
  5.2 Advantages of the built-in query cache
  5.3 Disadvantages of the built-in query cache
  5.4 Functionality errors
  5.5 Comparison conclusion

6 Benchmarks
  6.1 bench_count_distinct
  6.2 test_alter_table
  6.3 test_ATIS
  6.4 test_big_tables
  6.5 test_connect
  6.6 test_create
  6.7 test_different_select
  6.8 test_repeated_select
  6.9 test_select
  6.10 test_wisconsin_100
  6.11 Benchmarks conclusion

7 Discussion and Conclusion

A Installation

B Classes used
  B.1 Class THD
  B.2 Class mar_sql_cache
  B.3 Class mar_sql_cache_item
  B.4 Class mar_sql_cache_packet
  B.5 Class db_plus_table
  B.6 Class db_table_info

Chapter 1

Introduction

1.1 Data retrieval

Storing and retrieving data has been one of the computer's duties since the beginning of computing. Databases – storage facilities for data – and computer languages for working with them have been evolving ever since. As time goes on, more and more speed is required of data retrieval processes. There are fast rotating hard drives, high speed memory units and lightning quick processors. But there is also more data to work with, more users to serve, and larger retrievals demanded. The data is pulled from the database by submitting a query – an expression in a computer language that tells the database system to find some data and send it to the client. Evaluation of these queries upon large databases can be very demanding as far as server resources are concerned. Ways are therefore sought to reduce data retrieval latency, minimize the server load and thus enhance the performance of these systems. One of the methods (applicable to probably every job a man or a computer ever has to do) is preventing unnecessary repetition of work that has already been done. This can be achieved (at least in the case of a database server) by storing the retrieved data in some place from which it can be regained much faster than by pulling it from the database again. A possible solution is to store the data into a predefined area in the computer's memory where it can wait for repeated usage. Such an area is called the cache and the technique is called caching.

1.2 Structured Query Language

Perhaps the most popular language for retrieving data from a database today is SQL – the Structured Query Language – or one of its modifications. Queries in SQL have a more or less intuitive syntax, e.g. SELECT * FROM t1 WHERE t1.name1 = "Monty"; returns all entries from table t1 with attribute name1 equal to Monty. It is also possible to nest queries, e.g. SELECT * FROM t1 WHERE t1.name1 IN (/*nested query begins here*/ SELECT name2 FROM t2); which returns all entries from table t1 with attribute name1 in the set of all name2 attributes from table t2. Of course, there are many possibilities and a variety of queries that can be expressed in SQL, but their full width is not important for this thesis. The queries that are necessary are all mentioned and briefly explained where it matters.

1.3 The MySQL database server

Many database servers are used world-wide and these systems, if commercial, are very costly. In order to make database usage more available to common users who are not under the wings of an organization that would finance the use of a commercial database system, non-commercial systems came into being. Another reason for their development could be the increasingly competitive environment in this field and uncovering the possible weaknesses of the commercial products. MySQL is one of these free open source systems. The project was begun in the 80s by one Finn, Michael “Monty” Widenius, and two Swedes, David Axmark and Allan Larsson. Its increasing popularity can be illustrated by naming some of the companies that use it – Yahoo! Finance, MP3.com, Motorola, NASA, Silicon Graphics, and Texas Instruments [1]. When work on this thesis began, the current version of MySQL was 3.23.33. While the work was in progress, version 4.0.0-alpha was released. Neither of these versions had an implemented query cache.

1.3.1 SQL query cache enhancement

Michael Widenius, the MySQL chief developer, considered adding a query cache to the server a matter of highest importance, and contacted (among others) Jan Pazdziora (the advisor of this thesis) with a call for proposals and opinions. Thus, the SQL query cache became available as a master's thesis topic at the Faculty of Informatics and was picked by the author. The main requirements on the SQL query cache (as the created query cache is called throughout this work) included correct functionality on as many queries as possible and a server performance gain where it is intuitively expected – in the case of repeated query submission from the clients. Another requirement was proper behavior in a multithreaded environment, as MySQL is written as a multithreaded program.

1.4 The included CD-ROM

The CD contains all the code that came into being while solving this thesis. Anyone can either use it as a patch (from the patches/ directory, where all versions of the SQL query cache can be found; the latest version is 2.7), or use a copy of the sql/ directory (for MySQL 4.0.0-alpha) or a copy of the sql-3.23.33/ directory (for the older MySQL 3.23.33). For installation details see appendix A. The files on the included CD are referred to in this paper as CD/[path]/*, where [path] is the path to the particular file on the CD, and * its name.

1.5 Contents overview

The paper is organized as follows: Chapter 2 deals with caching and reviews previous work done, chapter 3 describes the basics of hashing and the advantages of its usage for looking up cached queries, chapter 4 presents the created MySQL server enhancement called the SQL query cache, and chapter 5 shows the comparison between the SQL query cache and the built-in query cache of the new version of the MySQL server – 4.0.1-alpha. Chapter 6 documents, with the help of benchmark results, the performance of the modified system and chapter 7 brings a summary of the work that has been done, together with a discussion and suggestions for further development.

Chapter 2

Caching

There are many ways in which caches are designed and constructed and many fields they are deployed in. The widest possible range of them should be considered before designing a new one. This chapter presents some of them. All of the following cache schemes use different concepts and different solutions, they exist at both the software and the hardware level, and many interesting things can be learned from them.

2.1 CPU caching

Today's processors are so fast that the computer's memory simply cannot ensure a dataflow consistent enough for them to be effectively utilized. A memory unit fast enough is very expensive, larger, produces more heat and also consumes more power, as it needs more transistors to hold each bit of information than a common memory unit [2]. The CPU cache principle is that it keeps a copy of the contents of a memory address from the main memory unit in the fast cache memory unit, and when that address needs to be accessed, supplies the cached data. This wouldn't be very effective if all the memory addresses were accessed regularly, as the cache memory has a much lower capacity than the main memory. Luckily, there is a term in computer science known as the locality of reference:

Programs tend to reuse data and instructions they have used recently. A widely held rule of thumb is that a program spends about 90 % of its execution time within only about 10 % of the code [3].

This allows relatively small amounts of a very fast memory utilized as a cache to be very effective in keeping the processor filled with the data and instructions it needs, thus significantly improving the overall computer performance. Nowadays, two-level caching is typical for personal computers. The first level is the fastest and the smallest (typically tens of KB) and the processor looks for the requested data there first. When it doesn't find it there (a cache miss occurs), the second level (slower and larger – typically hundreds of KB) is searched, and in case of yet another miss, the main memory has to be accessed to supply the data, which dramatically increases latency. CPU caching also allows caching write operations, which is not common in the types of caches discussed later. There are two policies for handling them – write-back and write-through. The former waits for a cache flush for changes to be written to the main memory and is mostly used in single-processor systems, the latter writes both to the cache and to the memory on every write operation and is mainly deployed in multi-processor systems.

2.2 World Wide Web caching

World Wide Web (WWW) caching is mainly aimed at reducing server response latency and bandwidth demands and is beneficial both for Internet users with a high speed connection and for users with a slower connection (e.g. using a modem). The instances of the WWW cache (such as Squid, which has its website at http://www.squid-cache.org) are called proxy cache servers and provide functions for storing and retrieving the data from the most used URLs1 that are typically more distant from the end user than the cache server itself. As in the case of CPU caching, copies of the data are stored, only they aren't the contents of a memory address, but documents – the contents of a WWW address. When a request comes to the proxy cache server, it looks into its index in the hope that the requested data is cached. If so, it is sent to the client. Otherwise, the cache server itself contacts the server with the requested data, pulls the data into the cache and sends it to the client. In order for this strategy to be efficient, a hierarchy of cache servers has been built. That is, close to the end users there are many cache servers. Groups of these are ‘supervised’ by higher level cache servers and again groups of these supervising servers are packed under even higher level cache servers. When a request comes from the user, it goes through all levels of the cache and only the highest level cache servers contact the actual server holding the requested data, if it was not found in any of the caches. Through this, the WWW caches reduce demands on the wide-area network bandwidth (the data is not transferred from distant servers so often), and they also reduce latency (the time spent waiting for the data to be actually delivered to the client) [4]. The tree structure discussed above is not the only strategy for handling data replication (storing copies of the same document in more proxy cache servers), which can take place either sibling to sibling or parent to child. All such methods must cope with the ever increasing traffic load and number of requests. One of them, presented in [4], deals with a backbone cache mesh of 10 nodes on the same level (siblings) and discusses traffic load distribution across these servers. In the end it is stated that, due to an unevenness in domain layout over the world, the best results are gained when half of the servers handle requests from the ‘.com’ domain and the other half everything else.

1Uniform Resource Locator – a unique address within the World Wide Web

2.3 Database web server query caching

A web site constructed with the help of a back-end database system and providing a form-based interface allowing users to submit queries is called a database web server. These systems are becoming more and more popular and more and more network traffic is composed of this query – result kind of communication. The most common web caching method – using proxy cache servers – can't cope with this kind of traffic simply because the proxies only cache static files [5]. However, there is an enhancement called the Active Cache Scheme [6] which allows modified proxy servers (Active Proxies) to cache even dynamic content to some extent. The method is based on servers sending pieces of Java code (cache applets) to the proxy, associated with each document. When the proxy is asked for an object it holds in the cache, it executes the appropriate applet, which can generate the dynamic parts of the requested document.

2.3.1 Active query caching

Because this concept still doesn't allow caching queries, a further modification called Active Query Caching that should cope with this problem is proposed in [5]. In addition to cache applets it introduces query applets. The basic function of these applets is to store query results at the proxy and retrieve them whenever an identical query is submitted to the server. This is called passive query caching. However, in order to increase the performance gain and cache utilization, the query applets have two more functions implemented – query containment checking (testing if one query is a subset of another) and simple selection query evaluation. These take advantage of the fact that web-based queries constructed automatically from data the user entered into forms are usually selection queries with simple predicates executed upon a single table. The columns of this table are the forms' attributes, which significantly simplifies the query containment checking and the query evaluation [5]. The containment checking is based on transforming all conditions into predicates in the Conjunctive Normal Form (CNF). This can be done just because the selection queries are simple, e.g. SELECT list FROM table WHERE condition. The problem of one query containing another is then reduced to recognizing whether one CNF condition is more restrictive than another. Each applet, corresponding to a single URL containing a form, answers all queries sent from this form. Thus, as there are surely more URLs with forms, the proxy server holds many query applets, each of them managing its own query cache. Three schemes are proposed for the cache replacement (removing old cache items when new ones need to be inserted) – LFU (Least Frequently Used), LRU (Least Recently Used) and benefit-based. The benefit-based method combines the other two to gain the best possible cache utilization and is defined in [5] as a weighted sum of the reference frequency and the recency. The presented solution is very interesting and also efficient in the field it was created for. However, for a general database query cache some features (e.g. the query containment checking enhancement) can't be used as they are, because as a prerequisite they require simplified selection queries, which do not fully cover the selective power of SQL and therefore must not restrict the SQL query cache for MySQL.
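As an illustration of the benefit-based idea, the following sketch scores cached entries by a weighted sum of reference frequency and recency; the entry with the lowest score would be evicted first. The field names and the weights are hypothetical choices made for the example, not values taken from [5].

#include <cstdint>

// Hypothetical per-entry statistics kept with each stored query result.
struct EntryStats {
    std::uint64_t hit_count;       // how many times the entry was reused (frequency)
    std::uint64_t last_access_sec; // timestamp of the last reuse (recency)
};

// Benefit-based replacement score: a weighted sum of frequency and recency.
// Older, rarely used entries get low scores and become eviction candidates.
double benefit_score(const EntryStats &e, std::uint64_t now_sec,
                     double w_freq = 1.0, double w_recency = 0.1) {
    double age = static_cast<double>(now_sec - e.last_access_sec);
    return w_freq * static_cast<double>(e.hit_count) - w_recency * age;
}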

2.4 Database query caching

The Active Query Caching presented in section 2.3.1 is an example of OnLine Transaction Processing (OLTP), which is one of the ways database servers are deployed. The queries are relatively simple and do not work with too much data at a time. Decision Support Systems (DSS), on the other hand, often encounter queries with a high amount of aggregation and complexity which need to access a substantial part of the data stored in the database. Thus, the response time for these queries is significantly higher than for the OLTP queries [7]. In the end, though, these queries return relatively small results containing aggregate data such as averages, sums and counts. Added to the fact that in many cases even DSS queries are submitted by users interactively, the need to cache the results in order to reduce the response time seems substantial. A caching manager for these systems is presented in [7]. It takes advantage of the fact that most queries submitted to DSS systems can be transformed into a canonical form generic enough to fulfill the needs of most applications and to allow fast containment (subsumption) testing (also written about in [5]). It then looks up queries using both an exact query match and the developed algorithms for query subsumption recognition. The condition for this active caching to work is that queries are in the canonical form – they have a common structure and contain only the keywords SELECT, FROM, WHERE, AND, GROUP BY and HAVING (see figure 2.1).

SELECT select_list, agg_list
FROM t1, t2, ..., tn
WHERE join_condition AND select_condition
GROUP BY group_by_list
HAVING group_condition;

Figure 2.1: Canonical form of a query

Any query not in the canonical form is cached passively – no containment checking is performed. When a query is found to be contained within another query that is already in the cache, this relationship is represented in a query attachment graph. Thus, when the need to check the subsumption arises again, the graph is searched for already found relationships and in case of a hit, time is saved. The query attachment graph is also essential for the algorithm that handles inserting new items into the cache and removing the old ones. This cache replacement algorithm is very elaborate. It decides whether the new query result should be inserted into the cache and, if so, which one of the old queries should be removed in case the cache is full. Its decisions, made on the basis of the query reference rate, the execution cost, the size of the result and the maintenance cost of the result, are aimed at minimizing the response time [7]. From the presented solutions it can be seen that database caching is indeed a wide problem with a number of sub-problems it consists of, e.g. the algorithms for subsumption checking or cache admission and replacement. Due to the generality of queries that can be sent to the MySQL server, the desire to allow as many varied queries into the cache as possible and also time and man-power limitations, it was decided to keep the MySQL query cache simple, using only the exact query match to look up stored queries. Because MySQL version 3.23.33, with which the work was begun, does not support nested selection queries (subselects), which would highly encourage the deployment of query containment checking, this solution was considered sufficient.

2.5 Database query cache requirements

There are two requirements concerning the query cache for the MySQL database server that must be discussed in advance – user transparency and 100% accuracy. That means a user must never notice a difference (except for the speed of data retrieval and the presence of some controlling commands) between the server with the cache in use and the server without it – in other words, query results returned must never be influenced by the presence of the cache. The 100% accuracy is not as common a prerequisite as it may seem. For example, in case of the most commonly used WWW cache, Squid, it is not required – the cache may send an old version of a document that has been modified to the client and still be considered to work well. In this case, old doesn't mean out of date. So as to avoid returning out of date data to clients, objects must expire. Squid therefore allows setting refresh times for objects, ensuring out of date (meaning with the refresh time expired) data is not returned to clients [8]. However, as an example of a WWW cache that claims to guarantee returning really up to date URLs (using HTTP 1.0's GET if-modified-since operator), NetCache, presented in [9], should be mentioned. Even the database query cache could return results based on out of date data; whether that is acceptable depends mainly on the users' needs and the deployment of the database server. In order for the database server with the query cache to stay as versatile as possible, it is advisable that when a table gets updated, all subsequent operations use the newest data – thus, the query cache ensures the 100% accuracy.

Some other requirements, not necessarily compulsory but also important, are asked of the database query cache:

• Increase of the server performance with repeated queries.

• No significant drop in the server performance otherwise.

• Transparency to the server performance while turned off.

By transparency here it is meant that the server mustn't be noticeably slower with the cache turned off than without it altogether.

2.5.1 Memory requirements

Choosing how much memory to allow for each stored result and for the whole cache to allocate is an optimization problem with two basic constraints. First, for the hit rate (the ratio of successful cache lookups to total cache lookups) to guarantee better performance of the server with the cache than of the server without it, the size of the cache must be large enough. Second, the size of the cache must be reasonable, to avoid its parts being swapped out of memory, in which case the server could end up performing time-consuming hard disk operations. More precisely, suppose the cost of computing and returning the query result with no caching is denoted by cost_normal(Q), the cost of returning the result on a cache hit by cost_hit(Q) and the cost of computing, returning and inserting the result into the cache after a cache miss by cost_miss(Q). Then the following condition on the hit rate ratio τ(Q) must hold for caching to pay off [10]:

τ(Q) · cost_hit(Q) + (1 − τ(Q)) · cost_miss(Q) < cost_normal(Q)

Because cost_hit(Q) and cost_miss(Q) cannot be known in advance, the default memory size for the SQL query cache was set according to empirical benchmark results.
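To make the constraint concrete, the inequality can be rearranged into a minimum hit rate τ_min = (cost_miss − cost_normal) / (cost_miss − cost_hit). The sketch below computes it for purely hypothetical per-query costs; the numbers are illustrative, not measured values.

#include <iostream>

// Minimum hit rate for which caching pays off, obtained by rearranging
// tau*cost_hit + (1 - tau)*cost_miss < cost_normal.
double min_hit_rate(double cost_hit, double cost_miss, double cost_normal) {
    return (cost_miss - cost_normal) / (cost_miss - cost_hit);
}

int main() {
    // Hypothetical per-query costs in milliseconds (not measured).
    double cost_hit = 1.0, cost_miss = 105.0, cost_normal = 100.0;
    std::cout << "minimum hit rate: "
              << min_hit_rate(cost_hit, cost_miss, cost_normal) << std::endl;
    // Prints roughly 0.048, i.e. about 5 % of lookups must hit
    // for the cache to be worthwhile under these assumptions.
    return 0;
}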

2.6 Differences between various types of caching

Four different types of caches have been presented. Many features are shared by them, many are not. Perhaps the main difference between database caching and the other types of caching is the basic principle by which the cache helps to improve the system performance. In the case of the CPU and WWW cache, a copy of the data referenced by a unique address (no matter whether it is the memory address for CPU caching or the URL for WWW caching) is stored in a place from which it is faster accessible either by the processor or by a client. From this point of view, World Wide Web caching is more similar to CPU caching than to database caching. In the case of the database cache, results of selection queries (which otherwise have to be evaluated every time they are needed) are stored, thus speeding up the server response in case of repeated client requests. The clients benefit from reduced server response latency. One specific difference between the CPU cache and the other types is that not only read operations, but also write operations are cached.

Chapter 3

Hashing

When there is a need to store and retrieve data represented as numbers, it can be done e.g. using binary trees, sorted lists etc. However, with text data of different lengths (which queries have), usage of these structures loses its advantages and a better method has to be found. When storing results of queries, there is not much of a choice as far as the main part of the key is concerned – it must be the text of the query itself. With more queries in the cache, this leads to the problem of searching among what are basically text keys1 quickly. This can be achieved by hashing. Briefly, hashing is a method using a special function (called the hashing function) to assign a position in an array of finite length to each of the keys – strings of characters2. The array, together with the function, is called a hashing table. More precisely, suppose there is a hashing table consisting of an array formally described as A[0 ... m − 1] and a hashing function h(x): U → [0, m), where U is the set of all possible keys, possibly infinite, and m is the size of the array. Then each key x is represented in the table by the position A[h(x)], and it can be looked up quickly.

3.1 Collisions

Because |U| is probably greater than m (for queries only 8 characters long, composed only of the 26 letters of the English alphabet, |U| = 26^8 ≈ 2.088270646 · 10^11; a table of such size can be worked with in theory, but practically, if a single position took up only one byte, this would mean a table about 209 GB large), cases where more keys are hashed into the same slot occur. These so called collisions have to be coped with. There are many methods to do that; in this paper, only some of them are discussed.

1Some more non-text information has to be added into the key – this will be discussed later.
2Hashing works well for numbers too – only the hashing function is constructed differently.

3.1.1 Open addressing

This is perhaps the most straightforward and simplest method, which takes advantage of the fact that in a hashing table that is only partially filled, there probably is a free spot somewhere ‘near’ the occupied position A[h(x)] into which the new item needs to be put. Collisions are resolved with the help of a collision function c(x): U → [0, m). When position h(x) is found already occupied, new positions at the distance c(x) (modulo m) from the original position are searched. A good c(x) should be easily computed and should ensure that all positions in the table are searched in the worst case. The latter condition can be met e.g. by setting c(x) and m to be coprime numbers. Usually a linear step is chosen – with c(x) ≡ 1. Open addressing certifies effectiveness for inserting and searching operations; however, for removing an item, an effective algorithm is not known [11]. The effectiveness spoken about is crucially influenced by the ratio of inserted table entries to the size of the table – the table fulfilment factor α. In case of α ≤ 0.8, three steps are expectedly sufficient for finding any item in the table, no matter its size. For α → 1 the effectiveness quickly drops, because colliding items have to be stored farther from the original position A[h(x)] [11].
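The following is a minimal sketch of open addressing with a linear step (c(x) ≡ 1), kept deliberately simple: a fixed table size, string keys, an arbitrary hash function and no deletion (which, as noted above, open addressing handles poorly).

#include <array>
#include <functional>
#include <optional>
#include <string>

// Toy open-addressing table with linear probing (step 1).
class ProbingTable {
    static constexpr std::size_t m = 101;             // table size, coprime with the step
    std::array<std::optional<std::string>, m> slots;  // empty optional = free slot

    static std::size_t h(const std::string &key) {
        return std::hash<std::string>{}(key) % m;     // any hash function works here
    }

public:
    // Returns false if the table is full.
    bool insert(const std::string &key) {
        std::size_t pos = h(key);
        for (std::size_t i = 0; i < m; ++i) {
            std::size_t p = (pos + i) % m;            // linear probing: c(x) = 1
            if (!slots[p] || *slots[p] == key) { slots[p] = key; return true; }
        }
        return false;
    }

    bool contains(const std::string &key) const {
        std::size_t pos = h(key);
        for (std::size_t i = 0; i < m; ++i) {
            std::size_t p = (pos + i) % m;
            if (!slots[p]) return false;              // hit a free slot: key is absent
            if (*slots[p] == key) return true;
        }
        return false;
    }
};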

3.1.2 Separate chaining

This method is good for hashing tables which are expected to store more entries than the capacity of the array allows. Its principle is that all items belonging to a single position in the table (with the same h(x)) are stored in a linear list (chain), the head of which is in position A[h(x)]. Searching this structure is very easy – to find item x the algorithm simply goes through the list at A[h(x)] linearly. This, however, in the worst case of all items being hashed to the same position, dramatically reduces effectiveness down to O(n). The removal of an item is easier than in the case of open addressing, with the same advantages and disadvantages as searching [11]. The expected number of attempts is roughly 1 + α/2 for a successful search (hit) and e^(−α) + α for an unsuccessful search (miss) [12]. Please note that when using the separate chaining method, the case α > 1 can occur, as more entries can be assigned to any single position in the table.

3.1.3 Growing hashing tables

In cases when the size of the hashing table (m) can't be decided prior to using it, a mechanism to enlarge it might be needed. One of the methods to cope with this problem is to use a growing hashing table. When the fulfilment factor α of such a table exceeds some value set in advance (e.g. 90 %), the hashing table is transformed into a larger one of size d · m for a suitable d > 1. During this operation, called rehashing, a new hashing function with a respectively widened range is used. By a detailed analysis it is possible to specify a moment at which it is better to rehash the table than to continue operations upon the current one, because the expected operations would grow increasingly ineffective. The cost of the rehashing is obviously linear – new positions have to be assigned to all keys [11].
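To tie the last two subsections together, here is a minimal sketch of a separately chained table that rehashes itself into a table twice as large once the fulfilment factor α exceeds a fixed threshold; the initial size, the threshold and the growth factor are arbitrary choices made for the illustration.

#include <forward_list>
#include <functional>
#include <string>
#include <vector>

// Toy separately chained hash table that grows by rehashing.
class ChainedTable {
    std::vector<std::forward_list<std::string>> buckets;
    std::size_t count = 0;

    std::size_t index(const std::string &key, std::size_t m) const {
        return std::hash<std::string>{}(key) % m;
    }

    void rehash() {
        std::vector<std::forward_list<std::string>> bigger(buckets.size() * 2);
        for (auto &chain : buckets)
            for (auto &key : chain)
                bigger[index(key, bigger.size())].push_front(key);
        buckets.swap(bigger);   // all keys get new positions: linear cost
    }

public:
    ChainedTable() : buckets(8) {}

    void insert(const std::string &key) {
        if (contains(key)) return;
        // Rehash when the fulfilment factor alpha = count / m would exceed 0.9.
        if (static_cast<double>(count + 1) / buckets.size() > 0.9) rehash();
        buckets[index(key, buckets.size())].push_front(key);
        ++count;
    }

    bool contains(const std::string &key) const {
        for (const auto &k : buckets[index(key, buckets.size())])
            if (k == key) return true;
        return false;
    }
};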

3.2 The internal MySQL hashing table

In the MySQL core code, there is a universal hashing table implemented. Instead of the static array presented so far, it uses a dynamic array to allocate memory for the items stored in it. For collision resolution, a modified separate chaining method is used – the modification mainly being a consequence of using the dynamic array instead of the static one.

3.2.1 Fowler/Noll/Vo hash The hashing function h(x) is very interesting due to its mathematical background. The basis of the hashing algorithm was taken from an idea sent by email to the IEEE Posix P1003.2 mailing list from Phong Vo and Glenn Fowler. Landon Curt Noll later improved the algorithm. The magic is in the interesting relationship between the special prime 16777619 (224 + 403) and 232 (all possible four-byte integers) and 28 (all possible ASCII char- acters). Therefore it works well on both numbers and strings. For a more detailed description please see the MySQL source code (particularly the file mysys/hash.c) or consult figure 3.1.

uint calc_hashnr(const byte *key, uint len)
{
  const byte *end = key + len;
  uint hash;
  for (hash = 0; key < end; key++)
  {
    hash *= 16777619;
    hash ^= (uint) *(uchar *)key;
  }
  return (hash);
}

Figure 3.1: The internal MySQL hashing function
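For illustration, the sketch below shows how a query string could be mapped to a bucket index with this function. The typedefs and the table size are assumptions made only to keep the example self-contained; they are not MySQL's actual declarations.

#include <cstdio>
#include <cstring>

typedef unsigned int uint;   // assumed stand-ins for MySQL's typedefs
typedef unsigned char uchar;
typedef char byte;

// Copy of the function from figure 3.1.
uint calc_hashnr(const byte *key, uint len)
{
  const byte *end = key + len;
  uint hash;
  for (hash = 0; key < end; key++)
  {
    hash *= 16777619;
    hash ^= (uint) *(uchar *)key;
  }
  return hash;
}

int main()
{
  const char *query = "SELECT * FROM t1 WHERE t1.name1 = \"Monty\"";
  uint buckets = 256;                               // illustrative table size
  uint h = calc_hashnr(query, (uint) strlen(query));
  printf("hash = %u, bucket = %u\n", h, h % buckets);
  return 0;
}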

Chapter 4

System Overview

Enough information has been given to show that caching can bring significant enhancement across a large scope of deployment. This chapter discusses the development of the query cache for the MySQL database server from the very beginning, covering the basic cache design, to the very end, presenting the SQL query cache user interface. It was decided that the work would first be aimed at implementing a safe query cache with the capacity of one query – the advantage of this approach is that all the concentration is first on the flawless, correct storage and retrieval of one query result, and only when this works, the cache may be enhanced to store more queries and the concentration may shift to the searching and cache replacement algorithms. This is also how this chapter is divided – from the beginning, the implementation of the cache for one query is covered, and only from section 4.9 on is caching more queries discussed.

4.1 The SQL query cache design

When designing a query cache for a database server, one must first know what exactly happens when a query is delivered to the server from the client. In the case of MySQL, the query string is first trimmed of any leading and trailing white spaces it may have, as well as of the trailing semicolon. Then it is written to the log, parsed, and if the parsing found no discrepancies in syntax, executed1. The results are sent to the client and the query processing is finished (see figure 4.1).

1In MySQL, execution preparation takes place during parsing, so the dividing line between parsing and execution is not totally clear.

Figure 4.1: Query processing with no query cache

Obviously, parsing and execution are the two procedures where time can be saved by preventing their unnecessary repetition. There are three ways to design the basic functionality of the cache; they differ in what is stored in the cache with each query string.

First, it may be only the result of the parsing, in which case the query is always executed. In this case the data stored in the cache consists of information about the used tables, functions, expressions, conditions, etc. Such data can be, for example, packed in some sort of a tree structure, from which it is possible to retrieve it quickly and use it for the execution stage.

Second, only the result of the execution may be stored. This case is further divided into two subcases – either the cache lookup is placed before the parsing (pre-parsing lookup) or after it (post-parsing lookup). The former eliminates multiple parsing of cached queries and thus saves time, but increases memory demands, as more information (e.g. for the sake of the access rights checking) needs to be stored with each query. The latter retains the need to parse all queries no matter whether they are in the cache or not, and saves memory. Because the cache was designed to be as beneficial as possible as far as performance gain was concerned, the pre-parsing lookup approach was chosen for the SQL query cache (see figure 4.2).

Figure 4.2: Query processing with one-level pre-parsing query cache

The third and last possibility is to cache both the parsing and the execution results (a two-level cache – see figure 4.3). The advantage of separate parser result caching is that the same queries, even submitted by different users (which can in the end give different results), would have to be parsed only once, because the parser results are the same for syntactically identical queries. However, there is one condition – the parsing stage would have to be clearly separated from the execution stage, and this is not true in MySQL.

It was decided that the cache for the MySQL server should be one-level with pre-parsing lookup – the final result data is stored with each query (see figure 4.2). The two-level caching, after some research had been made, was considered very hard to do, not only because parsing and execution aren't separated clearly enough, but also due to the fact that parsing results are stored in two structures – THD, for storing the information about one connection handled by one thread, and LEX, for storing most of the parsing analysis results. The contents of these structures are hard to extract and store otherwise, and have in some cases different persistence within the lifetime of the query. Moreover, the cross-pointers THD->LEX and LEX->THD are frequently used.

An attempt has been made to make things clearer and to separate the two structures. The parsing result should only be in the LEX structure, and THD should only contain information about the thread, i.e. one connection to the server, independent of the currently processed query. In MySQL version 3.23.33, it was not very hard to remove the cross-link and the attempt was successful2. However, no further research was made as to how to actually move all the relevant query dependent data from THD to LEX and the query independent data from LEX to THD. This later proved to be a lucky solution, because version 4.0.0-alpha came again with heavy usage of the cross-pointers LEX->THD and THD->LEX and made the changes previously done impossible to retain.

2see CD/patches/mysql-3.23.33-sql-lex-patch

4.2 The MySQL source code

The whole code of MySQL 3.23.33, on which the work started, as well as the code of MySQL 4.0.0-alpha, on which the work was finished, is huge and written in C++. It was decided that the query cache should be part of the server program itself – mysqld – rather than a stand-alone module. The mysqld's source code is found in the mysql-*/sql/ directory, where mysql-* is the main directory of the appropriate distribution.

Figure 4.3: Query processing with two-level query cache

For development, RHIDE v1.7, a free Integrated Development Environment cooperating with gcc, was chosen. The first step was to compile mysqld in RHIDE. The libraries and defines needed were extracted from the Makefile for mysqld and inserted into the RHIDE settings. After this, compilation was successful and tracing could begin. Since there was no prior knowledge as to where the relevant parts of the code were to be found, except that the main parsing function mysql_parse() is in sql_parse.cc, the file containing the main() function was taken (mysqld.cc) and the tracing started there. At that time it was found that RHIDE can't handle breakpoints in multithreaded programs, one of which mysqld is, so simple printf()'s were used to tell the position within the running code. After some initial problems, the function do_command() in sql_parse.cc was located, then mysql_parse(), mysql_execute_command() and finally mysql_select() in sql_select.cc, which were the most high-level functions to focus on at the beginning. For a complete reference of function calls within the MySQL selection query processing please consult figure 4.4. The names of the files where the functions can be found are given alongside them in that figure. Please note that the call graph corresponds to version 4.0.0-alpha only, as some of the relevant parts of the code have changed since 3.23.33. Generally, from this point on, every remark about MySQL applies to version 4.0.0-alpha, unless stated otherwise.

pthread_handler_decl() (sql_parse.cc)
  do_command()
    dispatch_command()
      cache present: mar_sql_cache::lookup() (mar_sql_cache.*)
        hit: send data to client
        miss: continue with mysql_parse()
      cache not present: mysql_parse()
        yyparse() (sql_yacc.yy)
          create_func_*() (item_create.cc)
        mysql_execute_command()
          handle_select() (sql_select.cc)
            mysql_select()
              select_send::send_fields() (sql_class.cc)
                send_fields() (sql_base.cc)
              at least one row: do_select()
                end_send()
                  raw data: Field::send() (field.cc)
                  modified data: Item::send() (item.cc)
              no rows: return_zero_rows()

Figure 4.4: The graph of function calls during a selection query processing

4.3 Server – client communication

MySQL is a multi-user database server and the clients connect to it locally or via the network. It is important to learn more about this communication. All data is sent to clients in packets. Caching the packets, and not the data these packets consist of, would be very restrictive with regard to sharing the cache among different users, since the function that constructs the packets is passed a number of flags which have a direct influence on how the packets are constructed and which may vary among the clients using the MySQL server. If a cache storing packets were used, these clients would be unable to share cached queries among themselves. So it was decided that packets would not be cached directly. Instead, when cached data is sent to a client, the packets are constructed from the stored data by functions similar to those that construct them at the time of the original processing, which necessarily slows the cache down, because the packets have to be constructed every time query results are about to be sent (see figure 4.5). This was done with the purpose of reaching the maximum cache utilization, in a trade-off against the highest possible performance.

(Left: users sharing data in the query cache. Right: users with their own data in the cache.)

Figure 4.5: Caching data vs. caching packets for users accepting different packet lengths or having different environmental settings (e.g. character sets)
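To illustrate the chosen design (the left-hand side of figure 4.5), the sketch below stores result rows as plain values and formats them into client-specific packets only at send time. The packet format and the ClientSettings fields are invented for the example and do not correspond to the real MySQL protocol.

#include <string>
#include <vector>

// One cached result: rows of column values, independent of any client.
using Row = std::vector<std::string>;
using CachedResult = std::vector<Row>;

// Hypothetical per-client settings that influence packet construction.
struct ClientSettings {
    std::size_t max_packet_size;   // client-specific limit
    char column_separator;         // stands in for charset/format differences
};

// Build packets for one client from the shared cached data.
// Each packet is just a string here; a new packet is started when
// the size limit would be exceeded.
std::vector<std::string> build_packets(const CachedResult &result,
                                       const ClientSettings &client) {
    std::vector<std::string> packets(1);
    for (const Row &row : result) {
        std::string encoded;
        for (const std::string &value : row) {
            encoded += value;
            encoded += client.column_separator;
        }
        encoded += '\n';                                  // end-of-row marker
        if (packets.back().size() + encoded.size() > client.max_packet_size)
            packets.emplace_back();                       // start a new packet
        packets.back() += encoded;
    }
    return packets;
}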

4.3.1 Sending fields

Having established by what means the data is sent, getting familiar with what MySQL sends to the client as a reply to a query was the next step. The data in a table is stored in rows and columns. In a single row, the columns are called fields. When sending the result of a select, the number of fields (columns) and the field names, together with the name of the table they come from, are sent first. So the first part of each cached item (class mar_sql_cache_item) is the field storage. This consists of two separate entries – the number of fields, stored within mar_sql_cache_packet, and the list of field names (see appendices B.3 and B.4). Whenever the function that sends this data to a client, send_fields() in sql_base.cc (see figure 4.4), is called, the data sent is also stored into a new temporary cache item, which is later inserted into the cache (THD::mar_sql_new_item – see appendix B.1).

4.3.2 Sending results

After the number and the names of the fields are sent, the MySQL server starts sending the actual results row by row (Field::send() – see figure 4.4). This is similar to sending field names, except that the data is put into one packet for each row, so MySQL sends the data over the network in complete rows. This is reflected in the data storage in the cache, where a special symbol is used for the end of a row. It was also found that if a column value is NULL, a special symbol is stored into the packet. This too is reflected in the data storage, with another special symbol and a special storing function mar_sql_cache::store_null().
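A minimal sketch of such a storage encoding is shown below: values are stored with a type tag and a length prefix, with reserved tags for NULL values and for the end of a row. The concrete tags and layout are assumptions for illustration only, not the actual mar_sql_cache_packet format.

#include <cstdint>
#include <string>
#include <vector>

// Toy encoding of cached result data: tagged, length-prefixed entries.
class ResultStore {
    std::vector<std::uint8_t> buf;
    static constexpr std::uint8_t kValue    = 0x01; // followed by length + bytes
    static constexpr std::uint8_t kNull     = 0x02; // marker for a NULL column value
    static constexpr std::uint8_t kEndOfRow = 0x03; // marker for the end of a row

public:
    void store_value(const std::string &value) {
        buf.push_back(kValue);
        // 4-byte little-endian length prefix, then the raw bytes.
        std::uint32_t len = static_cast<std::uint32_t>(value.size());
        for (int i = 0; i < 4; ++i)
            buf.push_back(static_cast<std::uint8_t>((len >> (8 * i)) & 0xFF));
        buf.insert(buf.end(), value.begin(), value.end());
    }

    void store_null() { buf.push_back(kNull); }      // analogous role to store_null()
    void end_of_row() { buf.push_back(kEndOfRow); }  // row boundary marker
    std::size_t size() const { return buf.size(); }
};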

4.4 Invalidation

Perhaps the greatest problem of all with the caching of queries comes with invalidation. That is, making sure that a query which once gave result A, but after some other commands modified the base data ought to give result B, does not give result A again due to its presence in the cache. In other words, this is exactly what keeping the SQL query cache 100% accurate and transparent for the users (clients) means.

4.4.1 Naive approach

The first concept was simple – a list of ‘dirty’ commands modifying the data was added into mysql_execute_command() (sql_parse.cc) and whenever one of these commands was received, the whole cache was invalidated. This wouldn't do, because a cache that is cleared every time anything in the database is updated wouldn't achieve any good utilization, except maybe when used with a totally static database. So a finer invalidation was required – only those queries that used the modified data should be invalidated.

4.4.2 Gradual refinement

For finer invalidation purposes, each query has to carry information about which databases and tables it used for the data retrieval. This is stored in a simple list in a convenient format database_name'\0'table_name'\0' (see appendix B.5), which saves memory space and is advantageous for more reasons – first, in an environment where all memory allocations take a relatively long time, it saves time when one allocation routine call may be used instead of two. And, together with the information about the stored database and table name lengths, it allows using these as C language strings by simply returning the pointer to the first character of the appropriate string. When invalidation according to tables (only items using the given tables are invalidated) takes place, its speed is essential. That's why a new structure was introduced in class mar_sql_cache – HASH dbs_tables (see appendix B.2). It keeps records containing the joined names of the database and the table together with pointers to the entries (cached queries) in HASH items which use them. Thus when a need to invalidate all items using a particular table arises, the record for this table is either found in the dbs_tables hash, in which case all items in its list are removed from the cache, or not found in the dbs_tables hash, in which case no cached items use this table and none have to be invalidated.

Of course, queries can, and in lots of cases do, use more tables, such as SELECT * FROM t1, t2; where t1 and t2 are tables. After executing such a query, two records would be entered into the dbs_tables hash, each pointing to the storage structure for this query. When, say, a DROP TABLE t1; command would then need to be executed, t1 would be found in the dbs_tables hash and the query it points to (SELECT * FROM t1, t2;) would be removed from the cache. But its pointer would still remain in the dbs_tables item for table t2. That's why, when invalidating a cached item, the program also goes through its list of used tables, searches for them in the dbs_tables hash, and, when needed, removes the pointer to this item from the dbs_tables item's list.

A still finer invalidation is possible and was considered – invalidation according to columns in tables. Imagine a table with four columns, two of which were used for the evaluation of a selection query, which was inserted into the cache. When the other two columns get updated, there is actually no need to invalidate the query from the cache. However, this solution would require storing column information for each query in the cache and it was estimated that the storage capacity and processing overhead wouldn't be worth the performance gain, especially when the supposed programming work was taken into account. One example where this approach would be highly desirable is the case of a large one-table database.

Not only finer, but also more general invalidation was needed, so a new function invalidate_by_db() was added to the cache code, in order for commands like DROP DATABASE (which do not use tables and had up to this point been removing all items from the cache) to invalidate only items which really need to be invalidated. This function takes a database name as a parameter and removes only items using this database from the cache.

Another problem with invalidation is when to do it. The intuitive approach would be to handle the invalidation at the beginning of the execution stage – to simply insert a list of commands that cause the invalidation and, if such a command comes, invalidate the cache before it is executed. However, in a multithreaded multi-user environment, this is also the most naive approach. When a selection query is to be executed before the base data is really changed, it is still desirable to retrieve its results from the cache if possible. Only after the data is changed does the cache need to be invalidated so that the 100% accuracy condition is met. This is the way the invalidation is done in the SQL query cache presented in this thesis.
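A minimal sketch of this table-based invalidation is given below. It uses standard containers instead of the internal MySQL HASH and invents the CacheItem layout, so it only mirrors the idea, not the actual mar_sql_cache code.

#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

struct CacheItem {
    std::string query_text;               // the cached SELECT and (elsewhere) its result
    std::vector<std::string> used_tables; // keys of the form "db\0table"
};

class QueryCacheIndex {
    // "db\0table" -> set of items whose results depend on that table.
    std::unordered_map<std::string, std::unordered_set<CacheItem *>> dbs_tables;

public:
    static std::string table_key(const std::string &db, const std::string &table) {
        return db + '\0' + table;          // joined name, as in appendix B.5
    }

    void register_item(CacheItem *item) {
        for (const std::string &key : item->used_tables)
            dbs_tables[key].insert(item);  // duplicate keys are ignored by the set
    }

    // Invalidate every cached item that used the given table.
    void invalidate_table(const std::string &db, const std::string &table) {
        auto it = dbs_tables.find(table_key(db, table));
        if (it == dbs_tables.end()) return;            // nothing cached uses this table
        std::unordered_set<CacheItem *> victims = it->second;
        for (CacheItem *item : victims)
            for (const std::string &key : item->used_tables)
                dbs_tables[key].erase(item);           // drop cross-references too
        // Here the items themselves would also be removed from the main item hash.
    }
};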

4.5 Preventing parsing

Parsing the queries is another thing that takes a relatively large amount of time that is worth saving. So ways and means were looked for to prevent parsing of queries that are stored in the cache. The main problem lay in the fact that parsing the query produces the TABLE_LIST structure, where all tables used by that query are recorded, which is needed for the access rights checking. This is quite different from the structure that is used in the cache for storing tables, and it shouldn't be stored as is, because that would mean more memory allocation calls. So it was decided that the caching would be split into two modes.

• First, ‘caching per user’ – the username and IP address were added into the cache key, allowing only the user who sent the original query to retrieve its copy from the cache. This eliminates the parsing and the access rights check. The modified key was named the extended key. The ‘caching per user’ mode can be turned on by the command-line option --sql-cache-ext-key. By default it is off.

• Second, the cache stayed as it was, the cached queries were shared among users, and the parsing and the access rights check were always done.

In order to avoid parsing even in the second case, it was necessary to insert the access rights checking into the cache lookup code. Since the checking functions require a TABLE_LIST * argument for passing the names of the used tables, the TABLE_LIST structure was added into mar_sql_cache_item (see appendix B.3). Filling this structure is the last thing done with the cache item, after it has actually been inserted into the cache, because of the elimination of duplicities in the original table list, which is discussed later. If the access rights check is positive, the results of the query are sent and the lookup function returns TRUE, to tell the do_command() function which calls it not to call mysql_parse() and rather to finish processing the query (see figure 4.4). If, on the other hand, the access rights check is negative, the relevant information is sent to the client, and, maybe surprisingly, again the TRUE value is returned. This is because there is no need to run the parser and then check the access rights again – it would produce the same negative result.

4.5.1 Table list entries duplicities

In case of large selection queries using unions and joins, the table list produced by the parser may contain some table names twice or more times. The cause of this behavior is unknown; anyway, for the purposes of storing the query into the cache structures, this redundancy is inconvenient. When an entry into the dbs_tables hash is made, the duplicates are recognized and discarded (which on the other hand slows down the procedure of storing the query into the cache), so that each table name remains exactly once everywhere it is needed (mar_sql_cache::dbs_tables_insert_item()).

4.6 Testing queries

At this point the cache worked fine for a set of ‘easy’ selection queries, such as SELECT list FROM t1 WHERE attribute > N; etc. But database servers must cope with much more complex queries, where the real benefit of the cache should be seen, as their processing takes much more time and resources. So testing of more difficult queries began.

4.6.1 Sending modified data

It was found that if the sent data is not exactly the data found in the table but is somehow modified – e.g. by the SUM() and COUNT() functions (typical for Decision Support Systems [7]), etc. – it is not sent to the client by the Field::send() function as usual, but instead by the Item::send() function (see figure 4.4). So the code of Item::send() had to be changed accordingly to Field::send(), which solved the problem with caching these queries.

4.6.2 The mar_sql_not_cacheable flag

Sometimes there is a need to prevent inserting the submitted query into the cache. In order to be able to ensure this, the mar_sql_not_cacheable flag was added to the class THD. It enables marking queries whose caching is not desirable (see appendix B.1).
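Schematically, the mechanism can be pictured as below; everything except the flag name mar_sql_not_cacheable is invented for the illustration, and the per-query reset timing is an assumption rather than a description of the real THD class.

// Sketch of the per-connection "do not cache" flag (not the real THD class).
struct ConnectionState {
    bool mar_sql_not_cacheable = false;  // raised while the current query is processed
};

// A parser action (e.g. for NOW(), RAND() or a '@' variable) would do:
void mark_not_cacheable(ConnectionState &thd) {
    thd.mar_sql_not_cacheable = true;
}

// After the query has been executed, the cache insertion is skipped
// whenever the flag was raised, and the flag is cleared for the next query.
bool may_store_result(ConnectionState &thd) {
    bool cacheable = !thd.mar_sql_not_cacheable;
    thd.mar_sql_not_cacheable = false;
    return cacheable;
}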

4.6.3 Temporary tables

Temporary tables are non-persistent tables which users may create during their connection to the server. They are stored in memory and they may share names with persistent tables. They are only visible within the connection that they were created in.

The SQL query cache doesn't allow caching queries working with temporary tables. The discussion of the problems which led to this decision is presented here. The creation command itself – CREATE TEMPORARY TABLE (...) – must be paid special attention. This is because temporary tables may use the same names as persistent tables and more connections can hold temporary tables with the same names. Caching queries using these tables would therefore mean e.g. inserting a connection ID into the query hash key. Since there is no knowledge of whether a temporary table is used in a selection query before it is parsed, a cache lookup before parsing would be unusable for such stored items. Inserting the connection ID into the key of every query would mean ‘caching per user’, which is desirable to avoid. Therefore it was decided to disable caching queries using temporary tables and to keep the interference between such queries and the cache as little as possible. Creation of a temporary table doesn't cause invalidation, but it forbids the pre-parse cache lookup for the creator connection. The reason for this is that there may be queries in the cache using a persistent table of the same name as the newly created temporary table, and repeated submission of such queries would cause false cache hits, for the cache doesn't recognize the presence of the temporary table. Only after parsing is done are the queries that do not use any temporary tables allowed for cache lookup.3 Dropping the temporary table returns the ability of pre-parse lookup to the connection. ALTER TABLE t1 RENAME t2; is another query that must be taken care of individually. In this particular case the table list produced by the parser only contains t1. The need to invalidate all cache entries that also use t2 is not obvious. In practically every case this query is submitted, t2 does not have to be invalidated, for if it exists, which is the prerequisite for any queries using it to be in the cache, the command will fail. There is, however, one special case in which t2 would have to be invalidated. Consider this sequence of commands:

> CREATE TABLE t1 (...);
> CREATE TABLE t2 (...);
> SELECT list1 FROM t1 ...;
> SELECT list2 FROM t2 ...;
> CREATE TEMPORARY TABLE t1 (...);
> ALTER TABLE t1 RENAME t2;

Here, even though t2 exists, the alteration command succeeds, because the new t2 is temporary. Now, queries using t2 would have to be removed from the cache if the cache allowed caching of temporary tables. The string t2 is found in lex->name, so the invalidation would have to be done using both lex->name and the table list produced by the parser. Please note that the stand-alone RENAME TABLE command is forbidden to be used with temporary tables in MySQL.

3Other users may still retrieve the cached queries which use the original persistent table – even in the case that the temporary table uses its name – without any penalty.

4.6.4 Special functions

There are special built-in functions in MySQL, such as NOW() and RAND(), that give different results every time they are called. It is desirable not to cache queries containing such functions. After parsing is done, however, there is no way to find out which of these functions were used in the query. So it was necessary to modify the parser – sql_yacc.yy. Whenever e.g. the NOW() or RAND() function is recognized, the mar_sql_not_cacheable flag is set to 1 for the current thread. More functions that must not be cached were found and taken care of either in the parser, or in the functions in item_create.cc, which are called during parsing anyway (see figure 4.4). They are shown in table 4.1 with a brief reason for having been banned from the cache. Another special query (actually a function) that had to be taken care of is SELECT * FROM t1 WHERE auto_increment_col IS NULL; (where auto_increment_col stands for an AUTO_INCREMENT column). This is equivalent to SELECT LAST_INSERT_ID();, which uses another function that must not be cached. In sql_select.cc a piece of code was found that transforms the former select into the latter. The line that sets the mar_sql_not_cacheable flag to 1 was easy to add and the problem was solved.

4.6.5 Procedures and UDF functions

Procedures within MySQL are functions that users can compile into the code and that manipulate the results before they are sent. Queries containing them were excluded from the cache. The main reason is that the return value of these procedures is not defined at the time the cache is started (simply because they may not exist yet), so there is no way to know whether a procedure returns the same value for the same input every time, which is crucial for cache usage. For example, a procedure may contain some sort of RAND() function (already present in MySQL), returning random numbers, and this surely must not be cached.

UDFs (User Definable Functions) are functions that can be dynamically loaded into mysqld using the dlopen() function. They are excluded from the cache for the same reason as procedures.

4.6.6 Variables

Caching queries containing variables is questionable – either cache them and invalidate the cache with every change of the variables they use, or disable caching of all queries which use variables. The latter was chosen, because after parsing is finished there is no way to tell which variables a query used, so keeping a record of them would be difficult. Also, different users can use different variables with the same names, and caching queries containing them would have to be user-specific – users would be able to retrieve only their own queries. Disabling the caching was solved with another parser modification – whenever a symbol beginning with ‘@’ – a variable – is encountered, the mar_sql_not_cacheable flag is set to 1.

Name | Functionality | Ban reason
LOAD_FILE(file_name) | Reads the file on the server and returns its contents as a string. | The file contents may change.
DATABASE() | Returns the current database name. | The active database may change.
USER(), SYSTEM_USER(), SESSION_USER() | Returns the current MySQL user name. | Users may have different names.
CONNECTION_ID() | Returns the connection ID (‘thread id’). | Every connection has its own unique ID.
GET_LOCK(str, timeout) | Tries to obtain a lock with a name given by the string ‘str’, with a timeout of ‘timeout’ seconds. It blocks requests by other clients for locks with the same name. | Function must always be executed.
RELEASE_LOCK(str) | Releases the lock named by the string ‘str’ that was obtained with GET_LOCK(). | Function must always be executed.
BENCHMARK(cnt, exp) | Executes ‘exp’ repeatedly ‘cnt’ times. It may be used to time how fast MySQL processes the expression. | Function demands real processing to be meaningful.
MASTER_POS_WAIT(log_name, log_pos) | Blocks until the slave reaches the specified position in the master log during replication. | Function must always be executed.
FOUND_ROWS() | Returns the number of rows returned by the previous query. | Each user can have a different ‘previous query’.

Table 4.1: Miscellaneous functions of the MySQL database server

4.6.7 MERGE table types

Another non-trivial problem arises with tables created as unions, e.g. CREATE TABLE t3 (...) TYPE = MERGE UNION(t1, t2) INSERT_METHOD = FIRST. When t1 changes, t3 changes as well, and the same holds for t2. And when t3 changes, both t1 and t2 may change. Invalidation is difficult because, except during the execution of the CREATE command, it is hard to find out the relationship between the tables t1, t2 and t3; the SQL query cache isn't able to trace it. For this reason, a rather strict policy was adopted in order to cope with the unions:

• At the time of creation, cache items that use t1, t2 or t3 are invalidated.

• Tables t1, t2 and t3 are inserted into a special structure in mar_sql_cache – no_cache_tables (see appendix B.2). This structure holds the list of tables that prevent any query from being cached if the query touches one of them. A table t is removed from this structure only on the DROP TABLE t; command.

This allows mysqld with the SQL query cache to pass the merge test in the mysql-test/t/ directory. Still, one problem remains unsolved – the case when t3 is created before the server is started up. This can happen if the client creates t3 and the server then has to be restarted. After the restart no_cache_tables is empty and all queries upon t1, t2 and t3 will be cached, although they shouldn't be.
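The check itself is simple. The following is a hedged sketch of how the table list of a parsed query could be tested against no_cache_tables; the helper name, the TABLE_LIST member names and the hash_search() call are assumptions based on the MySQL 4.0 sources, not the literal implementation.

/* Sketch only: returns false when the query touches any table recorded in
   no_cache_tables, i.e. any table involved in a MERGE union. */
bool mar_sql_cache::query_is_cacheable(TABLE_LIST *tables)
{
  for (TABLE_LIST *t= tables; t; t= t->next)
  {
    if (hash_search(&no_cache_tables,
                    (const byte*) t->real_name,
                    (uint) strlen(t->real_name)))
      return false;                /* table is union-related, do not cache */
  }
  return true;
}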

4.7 Special MySQL options

The MySQL database server supports some options inside queries which are not a part of the SQL language; it is therefore necessary to enclose them in comment brackets, in order for MySQL queries to remain interchangeable with other SQL servers. To these options another one was added – /*! NO_SQL_CACHE */ – which enables the client (user) to tell the server explicitly not to cache the query in whose text it appears. It does so simply by setting the mar_sql_not_cacheable flag to 1.

There is no command such as /*! YES_SQL_CACHE */ that would explicitly demand caching of the query in which it appeared. However, adding it would be very simple – the mar_sql_not_cacheable flag would have to be duplicated and renamed to e.g. mar_sql_must_cache, and the parser would have to be modified to recognize the YES_SQL_CACHE command and set the mar_sql_must_cache flag to 1 whenever it appeared. The cache code would only check the flag and, if it was set, force the query to be cached.

4.8 Caching empty results

Another question of whether or not to cache concerns queries returning zero rows as their result. This may have several reasons: a table used in the selection query may be empty, or a WHERE clause used in the query may disqualify all rows. In both cases it was decided that it is desirable to cache the empty results, because the parsing and computing of these queries may be as expensive as in the case of non-empty results. A small complication was that the empty results are detected before the do_select() function is called, within which all the data is stored into the cache in case some rows are returned. It was found that the return_zero_rows() function is used instead of do_select() in this case (see figure 4.4). Its call was therefore enclosed between sql_cache->start_new_query() and sql_cache->store_query_to_cache(), which initialize and finalize the storage of a new query, and thus the caching of empty results was solved.

4.9 Caching more queries

At this point, caching of one query was complete. A cache containing more queries was of course desirable, and the search through these cached queries had to be as fast as possible. Since the key that distinguishes the stored items is mainly based on the original query string (the text of the query that the user typed into the client application), a hash structure was chosen as the most suitable for this task. The MySQL source code contains a HASH class, so it was reused and no modifications were necessary. The key in the hash was at this point set to the query string. A HASH member named items was added to class mar_sql_cache (see appendix B.2) and hash-related functions such as insert and search were used in the code. As the MySQL internal hashing table uses a dynamic array, it doesn't take up much memory at the point of initialization. The integer variable hash.records can be used to check how many items there are in the hash.
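To make the usage concrete, here is a hedged sketch of the insert and lookup paths built on the internal hashing table. The hash_insert() and hash_search() helpers and their exact signatures are assumptions based on the mysys library; the statistics members follow appendix B.2.

/* Sketch only: insert a completed item and look a query up by its key. */
bool mar_sql_cache::insert_item(mar_sql_cache_item *item)
{
  if (hash_insert(&items, (const byte*) item)) /* key taken via the hash callback */
    return false;                              /* out of memory – give up storing */
  size+= item->size;                           /* keep the running total up to date */
  return true;
}

mar_sql_cache_item *mar_sql_cache::lookup(const char *key, uint key_length)
{
  lookups++;
  mar_sql_cache_item *item=
    (mar_sql_cache_item*) hash_search(&items, (const byte*) key, key_length);
  if (item)
    hits++;                                    /* statistics shown by SHOW SQL_CACHE; */
  return item;
}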

4.10 Generating hash key of stored items

The main part of the hash key is the query string. However, some other important information is also required in the key.

4.10.1 Active database name

When two users work with different databases and have them selected as active, they no longer need to use the database prefix when referring to tables from those databases. So SELECT * FROM t1; may mean table t1 in different databases when sent from these different users. After user A sends this query and it is inserted into the cache, users B and C may get false hits when they send the same query, just because they may use different active databases. So the name of the active database (if any) was added to the key of each item. When no active database is selected, users must specify it explicitly for each query (it is then part of the query string), so there is no need to add anything to the key.

Another reason why the name of the active database had to be added to the key is that users can change their active database during their session, e.g. by the USE database; command.

4.10.2 MySQL environment variables

SQL_SELECT_LIMIT is a MySQL variable which sets the maximum number of rows returned to the client as a result of each query. Every user may set it to a number of their choice. With the SQL query cache in use, setting this limit to different numbers by two or more clients would lead to wrong results: user A with limit n sends ‘SELECT * from test.t1;’, n rows are returned and stored into the cache. Then user B with limit m, m ≠ n, sends the same query and obtains n rows from the cache, which is wrong. SQL_SELECT_LIMIT was for this reason added to the key of each query. With the LIMIT clause inside the query (which overrides the global setting) no such problem arises, as it is part of the query string and thus of the hash key anyway.

The SQL_MAX_JOIN_SIZE variable tells the MySQL server how many rows may at most be examined while processing one selection query. If more rows than the limit would need to be examined, an error message is given. Again, this value can differ among clients, so it was added to the key.

The SQL_BIG_SELECTS variable controls the use of SQL_MAX_JOIN_SIZE. If set to 1, the limit is ignored; the default setting is 0 and the limit on row examination applies. As this variable is also user-specific, it was added to the key. No other variables in MySQL 4.0.0-alpha influence the returned results.
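The following standalone sketch illustrates one possible layout of such a key; the function, the separator bytes and the parameter names are purely illustrative assumptions, not the layout used by the actual implementation.

#include <stdio.h>
#include <string.h>

/* Sketch: build a cache key from the query text, the active database name and
   the per-user variables described in sections 4.10.1 and 4.10.2.
   'buff' must be large enough to hold the whole key. */
static unsigned build_cache_key(char *buff, const char *query,
                                const char *active_db,
                                unsigned long select_limit,
                                unsigned long max_join_size,
                                int big_selects)
{
  char *p= buff;
  size_t query_len= strlen(query);
  memcpy(p, query, query_len);  p+= query_len;  *p++= '\0';
  if (active_db)                          /* empty when no database is active */
  {
    size_t db_len= strlen(active_db);
    memcpy(p, active_db, db_len);  p+= db_len;
  }
  *p++= '\0';
  p+= sprintf(p, "%lu|%lu|%d", select_limit, max_join_size, big_selects);
  return (unsigned) (p - buff);
}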

4.11 Memory limits

One of the most important things when adding a feature to a database server is to avoid compromising its stability. In case of a query cache, which stores items into RAM (Random Access Memory), it could happen that sooner or later the cache takes up all available memory and the server crashes. To prevent this from happening, memory limits had to be implemented.

4.11.1 sql_cache_memory_limit

This limit is used to check that the total size of the query cache does not exceed a capacity set in advance. At first, each object the cache used had a function returning the total size of its contents, but it was soon discovered that calling these functions every time the total cache size was needed was inconvenient and slow. So another method was chosen – the size of each query is counted during its creation. The size of the whole cache is then the size of all items plus the size of the supplementary structures (see section 4.11.3).

The memory limit default was set to 32 MB, which, according to the benchmark results, seemed optimal. Nevertheless, the limit can be changed by a command line option as well as by a client command (see section 4.14), so the default setting isn't really important. The SQL query cache memory limit is checked upon insertion of each new query. Should the total size exceed the limit, least recently used items are removed from the cache until the new query fits. However, this single limit was found to be insufficient, since nothing is known about the size of a single query – in case of a result of many thousands of rows, a single query could fill the memory and take the server down.
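A minimal sketch of this insertion-time check follows; remove_item() here stands for whatever routine unlinks and frees one cached query, and the member names are taken from appendix B.2 – the exact code is an assumption.

/* Sketch only: evict least recently used items until the new query fits. */
void mar_sql_cache::make_room_for(int new_item_size)
{
  while (first && size + new_item_size > memory_limit)
    remove_item(first);            /* 'first' always points to the LRU item */
}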

4.11.2 sql_cached_query_memory_limit

So a limit for each query was introduced, and before the insertion attempt a check is made whether the query is suitable (its size does not exceed the query size limit) or not. The problem with this solution lay again in very large query results: during its creation, a query might still exceed the amount of free available memory (not to mention the limits) and cause the server to misbehave, and there was no way to find out. So yet another safety feature had to be added – the size of the query is tested as it is being created. When this size exceeds the query memory limit, the data stored so far is freed and further storing of the query into the memory is disabled. This checking noticeably slows down the query storage (particularly when the half-stored result has to be freed from the memory), but it is absolutely necessary; otherwise the limit for a single query wouldn't work correctly.

The cached query memory limit default was first set to 512 KB, but after running the benchmarks it was raised to 1 MB, which was much faster. The default can of course be changed both by a command line option and by a client command (see section 4.14).

Since both memory limits can be modified while the server is running, the question arises of which users may change them. The answer is simple: in the SQL query cache every user can do it. However, making the appropriate client commands available only to the administrator would be easy, as within the code it is not hard to find out who the administrator is and act accordingly.

4.11.3 Determining cache size

Several items contribute to the total cache size: the structure mar_sql_cache, the hash table of cached queries, the queries themselves, the hash of databases and tables with the items using them, and the hash of non-cacheable tables with its items. Please note that the LRU double-linked list, which is discussed later, is not a stand-alone structure and as such has no size of its own. This is caused by its construction – the pointers are all contained within the structures of the cached queries, save the first pointer holding the head of the list, which lies in the mar_sql_cache structure.

Because the total cache size value is used quite frequently within the code, and in order to retain a reasonable speed, it was decided that most of the above items should be omitted, both for their little significance and because keeping a record of their current size would be rather demanding. Therefore the total cache size is counted as the sum of all cached queries' sizes plus the number of records in all hashes multiplied by the size of the pointers to the respective items. In practice, only the joined size of the queries is stored all the time: when a new query is entered, its size is added to the total, and when an old query is removed, its size is subtracted from it. The whole used size is only returned by the mar_sql_cache::get_size() function, which adds up the items above. This is not computationally expensive, since the number of records is stored in a variable within each hash and the pointer sizes are resolved at compile time.
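Expressed as code, get_size() reduces to something like the following sketch; it uses the members listed in appendix B.2 and the records counter of the internal HASH, but the exact arithmetic is an approximation.

/* Sketch only: total size = running sum of query sizes
   + one pointer per record in each of the three hashes. */
int mar_sql_cache::get_size()
{
  return size +
    (int) ((items.records + dbs_tables.records + no_cache_tables.records)
           * sizeof(void*));
}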

4.12 Cache replacement algorithm

When the cache is full and a new query arrives, one of the older queries must be discarded from the cache so that the new query can be inserted. In order to get good cache utilization, LRU (Least Recently Used) removal was chosen: the item which was used least recently is disposed of. The first idea was to use increasing timestamps to keep a record of when items were accessed, and an AVL tree (a height-balanced tree) to quickly find the item with the lowest timestamp (thus the least recently used one). This idea was implemented and worked fine, solving the LRU problem in O(log n).

Eventually a better solution was recommended by the MySQL chief developer – a simple cyclic double-linked list of items, which is not a stand-alone structure like the AVL tree, but a part of the item structures themselves. Class mar_sql_cache only holds a pointer to the first item in the list and each cache item keeps pointers to its neighbors (see appendices B.2 and B.3). When a query is inserted into the cache, it is linked into the first->prev position (virtually the end of the list). When a query is accessed in the cache, it is relinked there. When LRU removal is needed, the query at the beginning of the list (pointed to by the first pointer) is clearly the least recently used one. As far as invalidation is concerned, once an item needs to be removed, its neighbors are simply relinked to each other, which is the main advantage of the double-linked list. All these operations finish in O(1), which is the desired performance.
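The list manipulation described above boils down to a few pointer assignments. The following standalone sketch (with a simplified item type and hypothetical helper names) shows the O(1) operations: linking an item to the virtual end of the cyclic list and unlinking it again.

struct cache_item { cache_item *prev, *next; };

/* Link an item to first->prev, i.e. the virtual end of the cyclic list;
   on a cache hit the item is unlinked and linked here again, making it
   the most recently used one. */
void link_last(cache_item *&first, cache_item *item)
{
  if (!first)                              /* empty list */
  {
    first= item->next= item->prev= item;
    return;
  }
  item->next= first;
  item->prev= first->prev;
  first->prev->next= item;
  first->prev= item;
}

/* Unlink an item, e.g. during invalidation or LRU removal. */
void unlink_item(cache_item *&first, cache_item *item)
{
  if (item->next == item)                  /* the only element */
  {
    first= 0;
    return;
  }
  item->prev->next= item->next;
  item->next->prev= item->prev;
  if (first == item)                       /* removing the LRU head */
    first= item->next;
}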

4.13 Fine-tuning the cache

4.13.1 Ensuring thread safety

Since mysqld is multithreaded – in particular, each client is handled by a different thread – the cache had to be made thread safe. This safety is achieved by a standard pthread_mutex_t cache_operation member (see appendix B.2), where pthread_mutex_t is a mutex defined in the pthread library, which handles multithreading in the operating systems that allow it (those that mysqld runs on). Every cache operation on any of the structures is then enclosed between mutex lock and unlock calls to ensure mutual exclusion.
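The pattern is the usual POSIX one, sketched below with a hypothetical invalidation routine; only the mutex name follows appendix B.2, the rest is illustrative.

#include <pthread.h>

pthread_mutex_t cache_operation= PTHREAD_MUTEX_INITIALIZER;

/* Sketch only: every operation on the cache structures is bracketed
   by lock/unlock on the single cache mutex. */
void cache_invalidate_table(const char *table_name)
{
  pthread_mutex_lock(&cache_operation);
  /* ... remove all cached items that use table_name ... */
  pthread_mutex_unlock(&cache_operation);
}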

4.13.2 Debugging compliance within MySQL

MySQL uses a unified debugging mechanism – each function that is important enough starts with DBUG_ENTER(...) and ends with DBUG_RETURN(...) or, when no return value is given, DBUG_VOID_RETURN. All information that may be important for debugging is output by DBUG_PRINT(...). The SQL query cache code was written to comply with this. Some of the fatal error messages, though, are written to the standard error output (stderr).
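For illustration, a function following this convention looks roughly like the sketch below; it assumes the MySQL source tree context (where the DBUG macros are available) and the function itself is hypothetical.

/* Sketch only: the DBUG convention applied to a cache routine. */
static int cache_store_query(THD *thd)
{
  DBUG_ENTER("cache_store_query");
  if (thd->mar_sql_not_cacheable)
  {
    DBUG_PRINT("info", ("query was marked as not cacheable"));
    DBUG_RETURN(0);
  }
  /* ... store the query result into the cache ... */
  DBUG_RETURN(1);
}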

4.13.3 Cache disabling situations

In case of memory allocation failures, the cache may still work fine as far as sending the already cached queries is concerned. The checking code was therefore written so as not to disable the cache wherever possible. The cache is disabled due to a memory allocation failure only when the failure occurs inside a constructor, because constructors can't have return values that could indicate problems. The cache is also disabled when any of the hash_init() functions fails, for the simple reason that without being able to insert items into the hashing tables, the cache can't function. The SQL query cache code never calls the exit() function to terminate mysqld.

4.14 The SQL query cache user interface

4.14.1 Compile time options

The cache is compiled into mysqld when the -DUSE_MAR_SQL_CACHE compile-time option is given to the compiler. In sql/Makefile, which controls the compilation of mysqld, this is set by default.

4.14.2 Command line options

The SQL query cache introduces four new command-line options for the MySQL server. They are:

• --sql-cache-size=N where N is the initial size of the hashing table that stores cached queries. It has no effect on the number of queries that can be stored in the cache – this depends only on the memory limit. If N is lower than 1, caching is disabled. If N is greater than 16384, a warning is given on the standard error output that the hash size is enormous. This option mainly serves for disabling the cache at server start-up.

• --sql-cache-ext-key enables ‘caching per user’, i.e. access rights need not be checked for each query. The ‘caching per user’, which is turned off by default, allows users to retrieve only their own queries from the cache. This option requires adding some user specific details (username and IP) into the hash key of each query and thus increases memory demands. It also degrades cache utilization when more users are connected to the server. On the other hand, it increases the performance for single-user systems (e.g. when every user is logged in as ‘http’).

• --sql-cache-memory=N sets the maximum memory capacity that can be taken by the SQL query cache to N kilobytes. If a number lower than 1 is given, the size is set to 0 and the cache is disabled. When an insert attempt occurs after the cache size reaches N, the least recently used items are removed from the cache until there is enough space for the new query, which is then inserted.

• --sql-cached-query-memory=N sets the maximum size of the memory used by one stored query to N bytes. If N exceeds the total cache size, it is set to one half of the total cache size.

4.14.3 Client commands

The clients that connect to mysqld with the SQL query cache turned on can use these new commands for controlling the cache:

• SHOW SQL_CACHE; for displaying information about the SQL cache (see figure 4.6).

• RESET SQL_CACHE; for invalidating all items in the cache, thus setting it to its initial state.

When mysqld is not compiled with cache support, the previous two commands are recognized as SHOW STATUS; and act accordingly. This is because the MySQL parser's grammar had to be changed in order to recognize the new commands, and the file sql_yacc.yy is compiled not by gcc but by bison, which doesn't use the -DUSE_MAR_SQL_CACHE option from the Makefile. It is therefore compiled the same way whether the cache is used or not.

mysql> show sql_cache;
+------------------------------------------------------------+
| SQL query cache contents                                    |
+------------------------------------------------------------+
| select * from t1, t2, t3, t4                                |
| select * from t2, t3, t4                                    |
| select * from t4                                            |
| select id_k, nazev_k from t2 where pocet_f = 0              |
| select * from t1, t2                                        |
|                                                             |
| Cached queries: 5                                           |
| Cache lookups: 45, hits: 19 (42.22%)                        |
| Allocated memory: 6675 bytes (0.0 MB) (0.02% full)          |
| SQL cache memory limit: 33554432 bytes (32.0 MB)            |
| SQL cached query memory limit: 1048576 bytes (1024.0 KB)    |
|                                                             |
| HASH items: 5 items (hash size 1024)                        |
| HASH dbs_tables: 4 items (hash size 128)                    |
| HASH no_cache_tables: 0 items (hash size 64)                |
+------------------------------------------------------------+
15 rows in set (0.00 sec)

Figure 4.6: SHOW SQL CACHE; command output

• SET SQL_CACHE_MEMORY = N; changes the main cache memory limit to N kilobytes. If N is lower than 1, the limit is set to 0. In case N is lower than the current limit, least recently used items are removed from the cache to fit within the new limit. If the limit for a single query is higher than the new limit, it is set to one half of it.

• SET SQL_CACHE_MEMORY = DEFAULT; sets the main memory limit to 32 MB.

• SET SQL_CACHED_QUERY_MEMORY = N; changes the memory limit for one query to N bytes. If N is lower than 512, it is set to 0. If it is higher than the main memory limit, it is set to one half of it. Queries already in the cache are not affected by changing the limit.

• SET SQL_CACHED_QUERY_MEMORY = DEFAULT; sets the query limit to 1 MB, or to one half of the main memory limit if that is lower than 1 MB.

Chapter 5

Comparison with MySQL 4.0.1-alpha

MySQL version 4.0.1-alpha, which was officially released just after the main part of the SQL query cache implementation had been finished, contains a built-in query cache. This chapter deals with its advantages and disadvantages in comparison with the SQL query cache that was implemented as a part of this thesis. The 4.0.1-alpha built-in query cache was being developed alongside the SQL query cache. The MySQL chief developer considered it better than the SQL query cache (which also had a chance of being included in the official distribution), mostly because it uses its own memory management (see section 5.1).

5.1 Differences

The main difference between the two versions of the cache is the memory management – the built-in query cache allocates a large block of memory when mysqld is started and all cache operations take place inside this pre-allocated block. This reduces the number of memory allocation operations, which take a lot of time in a multithreaded environment. On the other hand, there is the processing overhead of the memory management functions, e.g. splitting and joining blocks and defragmentation of the whole cache area. Also, because all the memory for the cache is taken at once, an empty cache occupies as much space as a full one.

The principle of how the query results are cached is also different – the built-in query cache stores packets, whereas the SQL query cache stores the data and constructs the packets every time they are needed (this topic is discussed in section 4.3 and in figure 4.5). This makes the built-in cache faster (consult the benchmarks in chapter 6), but not as versatile.

5.1.1 Temporary tables

A difference in behavior between the SQL query cache presented in this paper and the built-in query cache contained in MySQL 4.0.1-alpha can also be seen when working with temporary tables. Queries using temporary tables are excluded from the cache in both versions. When a temporary table is created with the same name as a persistent table, the cache is not invalidated and all users (except the one that created the temporary table) can still take advantage of the cached queries upon the original persistent table. This behavior is the same in both versions.

Invalidation problems

In the SQL query cache presented in this thesis, any operation other than selection upon the temporary table causes invalidation of the queries upon the persistent table with the same name, with the exception of the DROP TABLE ...; command. In the MySQL 4.0.1-alpha built-in query cache this is reversed – only when the temporary table is dropped is the cache invalidated and the queries working with the original persistent table removed from it. Both behaviors are partially wrong, as the ideal solution would be to isolate the temporary tables from cache operations completely – operations upon them should not cause invalidation at all. Inserting new items into the cache and looking them up also differs; this is discussed in section 5.3.

5.2 Advantages of the built-in query cache

These features of the MySQL 4.0.1-alpha built-in query cache were found superior to those of the SQL query cache.

• The main advantage is the speed. Due to not using memory allocation calls during the query storage and retrieval and because the packets don’t have to be constructed before sending the cached data to the client, the built-in query cache is faster than the SQL query cache (see benchmarks in chapter 6).

• In the MySQL 4.0.1-alpha built-in query cache, the problem with caching the MERGE tables is solved. In the SQL query cache there are problems which were discussed in section 4.6.7.

5.3 Disadvantages of the built-in query cache

These features of the MySQL 4.0.1-alpha built-in query cache were found inconve- nient in comparison with the SQL query cache.

• Queries not using tables are not cached. This means that for example a selection query that consists of a stand-alone mathematical expression (e.g. SELECT (3 + 1) * 2;) has to be parsed and computed every time it is received.

• Because of the CLIENT_LONG_FLAG, which influences the length of the packets sent to the client, some clients cannot retrieve cached queries that were originally sent from clients with different flag settings, and vice versa.

• Clients have to be set to same character sets in order to be able to share the cached queries among them.

• There is no command that would conveniently show the cache status to the user. All information about the cache is included in the SHOW STATUS; command, but it is rather terse in comparison with the SQL query cache's SHOW SQL_CACHE; command (compare figures 5.1 and 4.6).

...
| Qcache_queries_in_cache | 4        |
| Qcache_inserts          | 4        |
| Qcache_hits             | 3        |
| Qcache_not_cached       | 0        |
| Qcache_free_memory      | 33540396 |
| Qcache_free_blocks      | 1        |
| Qcache_total_blocks     | 13       |
...

Figure 5.1: MySQL 4.0.1-alpha’s SHOW STATUS; command output

• The clients using temporary tables can’t insert queries not working with them into the cache. These clients also can’t retrieve cached queries which do not use temporary tables from the cache.

• There is no way to compile mysqld without cache support.

5.4 Functionality errors

In the MySQL 4.0.1-alpha built-in query cache, several cases of wrong functionality were found. These are related to the usage of the MySQL environment variables (discussed in section 4.10.2). Suppose there are two unequal numbers n and m, and a table with n rows called table1. Execute the following lists of commands separately:

> SELECT * FROM table1;
> SET SQL_SELECT_LIMIT = m;
> SELECT * FROM table1;

The first select returns n rows. Then the number of rows that can be returned by a single selection query is limited to m. The final select returns n rows if the cache is turned on, which is obviously wrong. With the built-in query cache turned off, it returns the correct number of rows – m.

> SELECT * FROM table1;
> SET SQL_MAX_JOIN_SIZE = m;
> SELECT * FROM table1;

Another case – this time SQL_MAX_JOIN_SIZE set to m causes the final select to fail, because more rows than the limit would have to be examined – with the built-in query cache turned off. With the cache turned on, it still returns n rows.

> SET SQL_MAX_JOIN_SIZE = m;
> SET SQL_BIG_SELECTS = 1;
> SELECT * FROM table1;
> SET SQL_BIG_SELECTS = 0;
> SELECT * FROM table1;

This is just a variation of the above problem. This time, with the built-in query cache turned on, the value of the SQL_BIG_SELECTS variable, which controls the functionality of the SQL_MAX_JOIN_SIZE variable, is not honored, which again ends in wrong results.

5.5 Comparison conclusion

Examples of differences between two MySQL cache implementations have been given. Each one has its own advantages as well as disadvantages. One of the goals of this chapter was to show that MySQL query caching can be approached from different points of view and that a lot of time is needed to uncover all the possible problems and weaknesses. Of course, MySQL 4.0.1-alpha is the first version with the built-in query cache and cannot be perfect right away. The presented inconveniences are also meant as suggestions to the MySQL developers for further enhancement and refinement of the server.

Chapter 6

Benchmarks

In this chapter, benchmarks (tests measuring speed) are presented to show the advantages and disadvantages of the server with the query cache compared with the server without it, and also to compare the performance of the SQL query cache with the MySQL 4.0.1-alpha built-in query cache. The benchmarks are sorted alphabetically. All benchmarks were performed on an AMD Duron 795 MHz (106x7.5) system with 512 MB RAM, a WD400AB 40 GB hard disk drive and the ext2 file system, under RedHat Linux 7.1 with kernel 2.4.12.

Both caches were examined from the memory limit point of view – that is, the amount of memory the cache may allocate at most. For all capacities below 1024 KB, the limit for a single query (a feature of the SQL query cache only) is set to one half of the total memory limit; e.g. for a 256 KB cache, one query may take up at most 128 KB of space. From 1024 KB on, the limit for one query is set to 1024 KB. This was found to give the best performance.

The goal was to prove an enhanced performance where expected – in benchmarks performing lots of selection queries, possibly repeatedly – and, on the other hand, to prove only a small loss of performance in benchmarks performing other operations, i.e. inserts, updates and deletes, and non-repeated selection queries. The mysqld was restarted for every single test to make sure that no data whatsoever remains, not only in the query cache but also in any of the other MySQL internal caches.

In all figures in this chapter there are two charts. The chart on the left shows the results of testing the MySQL 4.0.0-alpha server with the SQL query cache patch applied and the cache turned on. The chart on the right brings the results of testing the MySQL 4.0.1-alpha server with the built-in query cache (see chapter 5) turned on. Please note that in the case of the built-in query cache, when the cache size is set to a number lower than 16 KB, the cache is disabled. That is why the tests for 4 and 8 KB weren't actually performed and the columns for 4 and 8 KB are only copies of the column for 0 KB. The only exception is the table creation and drop test (section 6.6).

The column for zero size in all charts denotes that the cache was disabled (had zero capacity) for that test. For detailed benchmark results, please see CD/sql-bench/cache_results/.

6.1 bench count distinct

This benchmark executes a set of selection queries using DISTINCT. Quite a big performance gain was expected here, and figure 6.1 confirms it.

Figure 6.1: bench count distinct results

6.2 test alter table

This benchmark doesn’t execute any selection queries and is mostly constructed from ALTER TABLE, CREATE INDEX and DROP INDEX commands, so the cache isn’t utilized at all. However, the key for cache lookup is generated for all queries, and they are all looked-for in the cache, so this benchmark is here to show the performance loss for key generation and cache lookup. As seen in figure 6.2, this is very low, below 1 %. The MySQL 4.0.1-alpha was noticably faster in this benchmark. This isn’t caused by the presence of the query cache (no selection queries were performed), but rather by some other internal change since version 4.0.0-alpha.

6.3 test ATIS

The ATIS benchmark (Air Travel Information Service) simulates the task of retrieving airline schedules, fares and related information from the database. Apart from testing database speed, it is used as a test-bed application for research in spoken language understanding [13]. In figure 6.3 it can be seen that the increasing cache capacity gradually increases the database's performance, but only after reaching a certain threshold, 8 KB here. This is caused by the size of the results to be cached being greater than 4 KB, which is the limit for one query in this case. When a result that is being stored to the memory is found to exceed this limit, all that has been stored up to then must be freed from the memory, which is rather time-consuming (also see section 6.10).

Figure 6.3: test ATIS results

6.4 test big tables

This is a test of simple selection queries upon tables with extremely many fields (1000). A quick glance at figure 6.4 (MySQL 4.0.0-alpha chart) shows a little unusual performance loss at small memory capacities. This is caused by large results and is the same as in Wisconsin 100x benchmark (see section 6.10 for explanation).

6.5 test connect

This benchmark tests the speed of connecting the client to the server and disconnecting afterwards. A simple selection query is performed while the connection lasts – hence the slight performance gain with the cache turned on (figure 6.5).

Figure 6.4: test big tables results

Figure 6.5: test connect results

6.6 test create

Benchmark testing the speed of table creation and drop. In figure 6.6 it might be unclear why there is a significant performance gain when no (or very few) selection queries are executed. Here is the most probable explanation. In MySQL, every table is represented as a set of files on the hard disk. First, the test with cache memory set to 0 KB was run – a test much slower than all other tests. A significant amount of data was probably put into the system’s file cache and following tests used this cached data. That’s probably where the speed is gained. The SQL query cache can’t have any effect on this performance gain as it happens in table creation (see CD/sql-bench/results/4.0.*/test-create *). The columns for 4 and 8 KB are empty in the MySQL 4.0.1-alpha chart, because it couldn’t be determined how the file cache would influence the second test, just as in the case of MySQL 4.0.0-alpha. So the copies of the first column were not used. However, it is highly probable that the test, were it performed, would show the same performance gain, no matter the cache size, as in the case of MySQL 4.0.0-alpha.

Figure 6.6: test create results

6.7 test different select

This benchmark executes many different selection queries one after another and was designed to report the performance loss in the worst case – 0% hit ratio of the cache. As seen in figure 6.7, the penalty is below 5 %. Please note that the performance loss is heavily dependent on the query result size. The performed selection queries return ten thousand rows, each with two fields. The MySQL 4.0.1-alpha is slower (as in the test alter table benchmark) due to internal changes.

Figure 6.7: test different select results

6.8 test repeated select

Benchmark created for estimating the maximum performance gain possible – re- peating a single query many times. From the charts in figure 6.8 it is apparent that the performance reaches up to 800 % of the original even with the lowest possible capacity settings.

Figure 6.8: test repeated select results

6.9 test select

A great performance increase was expected in this benchmark, which repeatedly runs, in blocks, a lot of different selects. As seen in figure 6.9, the maximum performance reached up to 600 % of the original non-cached server. This was reached with only 1 MB of memory; with a cache capacity of 16 KB the benchmark was already almost twice as fast as with no cache.

Figure 6.9: test select results

6.10 test wisconsin 100

The Wisconsin Benchmark described in [14] came into being as the result of an effort to evaluate the performance of the DIRECT database machine and compare it with other database systems. It was designed to test the major components of such systems and to be well understood as far as its principles were concerned. It employs an artificially created database with three relations, one with one thousand tuples (rows), the other two with ten thousand tuples. The tables are filled with synthetically created data in order to maintain its uniform distribution.

44 The benchmark’s query suite was designed to measure the performance of all ba- sic operations such as projections with different percentages of duplicate attributes, single and multiple joins and append, delete and modify operations. The criticism of this benchmark is mainly based on the fact that it only simulates a single-user workload and has too simple a database. Multi-user tests have been constructed on the base of the Wisconsin Benchmark, but they never reached the original benchmark’s popularity. With MySQL 4.0.0-alpha most of the total benchmark’s time is taken by creation of the tables and inserting data before the actual Wisconsin benchmark begins. That’s why its modified version, Wisconsin benchmark x100, was used instead. The modification is quite simple – instead of repeating the test 10 times as in the original version, it is repeated 1000 times. Thus the actual Wisconsin benchmark time plays the most significant role in the total benchmark time and the advantage of the cache can be clearly seen, on the contrary of the original version. The charts for this benchmark (in figure 6.10) are very interesting. In the first one (for MySQL 4.0.0-alpha) a performance drop can be seen as the cache memory limit increases, then suddenly at 1024 KB the performance stabilizes and gains pretty values levelled at about 130 % of the original performance. This is caused by the fact that Wisconsin benchmark’s selection queries produce very large results and with the cache memory limit set too low, these results have to be, not having been able to be cached completely, removed from the memory again which is a costly operation in terms of time. Certainly, the longer the result keeps being stored into the memory only to be freed afterwards, the more time loss occuring. This is why before reaching the threshold of 1024 KB, the performance for 512 KB is the lowest – only about 60 % of original. In case of MySQL 4.0.1-alpha, no such performance loss occurs due to its different memory management (see section 5.1).

Figure 6.10: test wisconsin 100 results

6.11 Benchmarks conclusion

The developed SQL query cache has been proven to have a positive effect on the performance of the MySQL database server. Although the best times were achieved for capacities from 1024 KB up, which could suggest setting the default capacity to 1024 KB, the default cache size was set to 32 MB, mainly because in a real workload more different queries are expected to be submitted to the server than during benchmarking, and also because the performance showed no degradation as the limit increased. MySQL 4.0.1-alpha was found to be faster in the ATIS benchmark, test big tables, test connect and the Wisconsin x100 benchmark, which confirms the statements about its superior speed from chapter 5, especially with large results.

Chapter 7

Discussion and Conclusion

The SQL query cache for the MySQL 4.0.0-alpha database server has been presented. Results of selection queries with the restrictions defined above are stored in the memory and, in case of a repeated query occurrence, retrieved from the cache, increasing the server response speed and saving its precious resources. A working invalidation scheme is employed to remove queries when the tables they operate upon are updated or dropped. For cache replacement when the cache is full, an LRU algorithm with very good time and space efficiency is used. In case of a system failure during cache code execution (e.g. lack of memory), the cache simply shuts itself down and doesn't jeopardize the stability of the server. Benchmark tests are shown to prove the performance enhancement and reduction where they were expected. A comparison with the new MySQL server version, 4.0.1-alpha, containing a built-in query cache has also been given.

It is understood that the proposed SQL query cache isn't at all the ultimate solution of the caching problem of the database server. Handling of the tables created as unions is not satisfying; it works fine only as long as the server is not restarted. After a restart, the information that was stored in no_cache_tables is lost and the server may then return incorrect results. Temporary tables, the caching of which is disabled, should not interfere with the cache at all; however, some commands working with temporary tables cause unwanted invalidation of the queries using persistent tables with the same names.

More cache replacement algorithms than LRU can be used, e.g. LFU (least frequently used), algorithms that consider the time or space cost of saving the query into the cache (which is sometimes high indeed), the maintenance cost of the stored cached item (in terms of which items depend on the most frequently updated tables), the execution cost of the original queries, the probability of query occurrence, or a combination of some of the above (as presented e.g. in [5]). The caching method of the SQL query cache is passive – the queries retrievable from the cache are exactly those that had been stored in it before. No effort whatsoever is made to test query subsumption [7] or otherwise actively enhance the cache performance.

The storage of the query parsing result also remains unsolved. Because many functions are called directly during parsing and the results are stored in many different places, the MySQL parser would have to be rewritten in order to enable it. The cache lookup, based on an exact key match, could also be improved, because currently selection queries that differ only in letter case (SELECT vs. select) are recognized as different ones. White spaces inside the query string also matter (SELECT * FROM t1,t2; vs. SELECT * FROM t1 , t2;). But this could be conveniently solved at the parser (or screener) level – the keywords in the query string could be converted to lower or upper case and the white spaces reduced to one unified symbol before the string is worked with.

Another weakness and performance bottleneck arises from the essence of the caching method – each piece of data is stored separately in the memory, demanding a lot of allocation and deallocation calls at creation and destruction, which cuts the performance down, especially in the case of large results and in a multithreaded environment. This is the price for the cache's versatility, code legibility and easy maintenance. However, it is believed that, the SQL query cache having been designed this way (as it is a part of a huge project indeed), there is a fair chance that other people who will perhaps revise and rewrite it later will find it convenient and pleasant to work with.

Bibliography

[1] The MySQL web site. http://www.mysql.com/, 1995–2002 1.3

[2] Mazzucco, P. The Fundamentals Of Cache. http://www.slcentral.com/articles/00/10/cache/index.php, October 17th, 2000. 2.1

[3] Hennessy, J. L., Patterson, D. A. Computer Architecture: A Quantitative Approach. Morgan Kaufmann Publishers, Inc., 1996. 2.1

[4] Grimm, C., Pralle, H., Vöckler, J.-S. Load and Traffic Balancing in Large Scale Cache Meshes. http://www.cache.dfn.de/DFN-Cache/Veroeffentlichungen/TNC98/index.html. University of Hannover, Institute for Computer Networks and Distributed Systems, 1998. 1

[5] Luo, Q., Naughton, J. F., Krishnamurthy, R., Cao, P., Li, Y. Active Query Caching for Database Web Servers. http://www.drinkme.com/library/uci/ics215/xml/webdbActive.pdf. Computer Science Department, University of Wisconsin-Madison, 2000. 2.3, 2.3.1, 2.4, 7

[6] Cao, P., Zhang, J., Beach, K. Active Cache: Caching Dynamic Contents on the Web. Proceedings of the IFIP International Conference on Distributed Systems Platforms and Open Distributed Processing, Middleware 1999. 2.3

[7] Shim, J., Scheuermann, P., Vingralek, R. Dynamic Caching of Query Results for Decision Support Systems. Proceedings of the 11th International Conference on Scientific and Statistical Database Management, Cleveland, 1999. 2.4, 2.4, 4.6.1, 7

[8] Pearson, O. Squid: A User’s Guide. http://squid-docs.sourceforge.net/latest/html/c28.htm, 2000. 2.5

[9] Danzig, P. NetCache Architecture and Deployment. http://wwwcache.ja.net/events/workshop/01/NetCache-3 2.pdf. Network Appliance, Santa Clara, CA, February 2nd, 1997. 2.5

[10] Florescu, D., Levy, A., Suciu, D., Yagoub, K. Run-time Management of Data Intensive Web-sites. ftp://ftp.inria.fr/INRIA/publication/publi-pdf/RR/RR-3684.pdf. Institut National de Recherche en Informatique et en Automatique, March 1999. 2.5.1

[11] Wiedermann, J. Vyhledávání. Matematický seminář SNTL. SNTL, 1991. 3.1.1, 3.1.2, 3.1.3

[12] Knuth, D. E. The Art of Computer Programming. Vol. III. Reading (Mass.). Addison-Wesley Publishing Co., 1973. 3.1.2

[13] http://www.ai.sri.com/natural-language/projects/arpa-sls/atis.html, 1994. 6.3

[14] Bitton, D., DeWitt, D. J., Turbyfill, C. Benchmarking Database Systems: A Systematic Approach. Proceedings of the International Conference on Very Large Data Bases (VLDB), October 1983. 6.10

Index

active caching, 7
Active Proxy, 6
ATIS benchmark, 40
cache, 1
cache applet, 6
caching, 1
canonical form of a query, 7
hashing, 11
hashing table, 11
locality of reference, 4
passive query caching, 6
post-parsing lookup, 15
pre-parsing lookup, 15
proxy cache server, 5
query, 1
query applet, 6
query attachment graph, 7
rehashing, 12

SQL, 1

Wisconsin Benchmark, 44

Appendix A

Installation

Unpack MySQL 4.0.0-alpha sources. Go to the mysql-4.0.0-alpha directory. Ap- ply the SQL query cache patch. Update Makefile.in in sql/ directory.

> cd wherever_you_want_to_unpack_mysql
> tar zxvf CD/mysql-4.0.0-alpha.tar.gz
> cd mysql-4.0.0-alpha
> patch -p1 < CD/patches/mysql-4.0.0-alpha-sql-cache-v2.7
> automake sql/Makefile

Please note that the unpacked directory with the patch applied could not be put onto the CD because the MySQL sources use hard and symbolic links that the CD file system does not allow. Then use standard procedure to configure, compile and install MySQL. E.g.

> ./configure --prefix=/home/user/mysql_home_dir
> make
> make install
> scripts/mysql_install_db
> cd mysql-test
> ./mysql-test-run --force

The --force option is required because the ctype_latin1_de test fails (it fails in the original unmodified version as well), which would cause mysql-test-run to abort if the --force option were not given. MySQL would have to be compiled with a different character set option to pass this test. The MySQL server version 4.0.0-alpha with the SQL query cache support is now installed.

In order to run the benchmarks and other tests (those outside the mysql-test directory) you may need to install some of the programs found in the CD/software/ directory, and you will need Perl installed. See the MySQL documentation, Chapter B – Contributed Programs, for details (http://www.mysql.com/documentation/).

Appendix B

Classes used

This appendix presents the most important parts of the classes that are the core of the SQL query cache implementation.

B.1 Class THD

class THD {
  ...

#ifdef USE_MAR_SQL_CACHE
  mar_sql_cache_item *mar_sql_new_item;
  //here one cache item (query, fields, result) is stored
  //before finally inserting it into sql_cache
#endif
  int mar_sql_not_cacheable;
  //when an 'unsecure' query (RAND(), NOW(), ...) is encountered
  //during parsing, this is set to 1 to tell everybody that this
  //query must not be cached. Otherwise it is set to 0
  //(as default).
  //NOTE: this is not inside the #ifdef because it is used even
  //if mysqld was not compiled with -DUSE_... (Monty's suggestion)

  ...
}

B.2 Class mar_sql_cache

class mar_sql_cache {
  HASH items;
  //cached queries
  HASH dbs_tables;
  //databases and tables used in cached queries
  HASH no_cache_tables;
  //tables that must not be cached
  unsigned int hash_size;
  //maximum number of cached queries, if 0, cache is disabled
  int memory_limit;
  //maximum size of memory in bytes which the cache may allocate
  int size;
  //size of the whole cache in bytes
  //contents of dbs_tables and no_cache_tables do not count in
  int lookups;
  //lookups commited
  int hits;
  //hits achieved
  mar_sql_cache_item *first;
  //pointer to LRU item and head of the cycled double-linked list
  //used for LRU utilization
  pthread_mutex_t cache_operation;
  int extended_key;
  //include user into key?

  ...
}

B.3 Class mar_sql_cache_item

class mar_sql_cache_item {
  char *key;
  int key_length;
  int query_string_length;
  ulonglong limit_found_rows;
  ha_rows examined_row_count;
  mar_list dbs_tables;
  //tables used by this item
  TABLE_LIST *tables;
  //for fast access rights check during in cache lookup
  uint send_fields_flag;
  //the flag passed to send_fields
  mar_sql_cache_packet elements_count;
  mar_list fields;
  //fields storage
  mar_list data;
  //results storage
  int size;
  //memory taken by this item
  mar_sql_cache_item *next;
  //for fast LRU removal
  mar_sql_cache_item *prev;
  //--||--

  ...
}

B.4 Class mar_sql_cache_packet

class mar_sql_cache_packet {
  char *buff;
  int length;

  ...
}

B.5 Class db_plus_table

class db_plus_table {
  char *db_table;
  //memory region that holds the name in db\0table\0 format
  int db_length;
  int table_length;
  int length;
  //length of the whole string

  ...
}

B.6 Class db_table_info

class db_table_info {
  db_plus_table *db_table;
  mar_list cached_items;

  ...
}
