Enhancing the Performance of Openldap Directory Server with Multiple Caching
Total Page:16
File Type:pdf, Size:1020Kb
Enhancing the Performance of OpenLDAP Directory Server with Multiple Caching Jong Hyuk Choi and Hubertus Franke Kurt Zeilenga IBM Thomas J. Watson Research Center The OpenLDAP Foundation P.O. Box 218, Yorktown Heights, NY 10598 [email protected] jongchoi,frankeh¡ @us.ibm.com Keywords—Directory, LDAP, Caching, Linux, Profiling. This paper presents our efforts to enhance OpenLDAP [2], Abstract— Directory is a specialized data store optimized for ef- an open-source directory software suite. More specifically, this ficient information retrieval which has standard information model, paper focuses on the performance and reliability of slapd, the naming scheme, and access protocol for interoperability over net- standalone LDAP server of OpenLDAP. We participated in the work. It stores critical data such as user, resource, and policy infor- mation in the enterprise computing environment. This paper presents development of back-bdb, a transactional backend of OpenL- a performance driven design of a transactional backend of the OpenL- DAP, which directly utilizes the underlying Berkeley DB with- DAP open-source directory server which provides improved reliabil- out the general backend API layer used in the existing OpenL- ity with high performance. Based on a detailed system-level profiling DAP backend, back-ldbm. Back-bdb provides a higher level of of OpenLDAP on the Linux OS, we identify major performance bot- reliability and concurrency. The transaction support of back- tlenecks. After investigating the characteristics of each bottleneck, we propose a set of caching techniques in order to eliminate them: bdb makes directory recoverable from temporary failures and directory entry cache, search candidate cache, IDL (ID List) cache, disasters, while the page level locking enables concurrent ac- IDL stack slab cache, and BER (Basic Encoding Rule) transfer cache. cess to directory by the directory server and various admin- The performance evaluation with a directory workload convinces that istrative tools at the same time. However, because the initial these caches, when combined, yields a 126% throughput increase and design of back-bdb did not exhibit the expected performance, a 59% latency reduction with a reasonable level of storage overhead. we analyzed its performance through a detailed system-level profiling. Based on the bottleneck identification and analysis, we propose five distinct caches for OpenLDAP back-bdb: en- I. INTRODUCTION try cache, candidate cache, IDL (ID List) cache, slab cache for A directory provides a logically centralized view of informa- the IDL stack, and BER (Basic Encoding Rule) [3] cache for tion in a distributed environment, enabling different platforms transfer contents. The combined use of these caches yields a to access a shared, consistent information base. By sharing performance improvement of 126%. This paper analyzes the critical information such as user, resource, and policy data, in- efficacy and the performance impact of these caches in detail. teroperability among heterogeneous systems and services can The next section will introduce LDAP and the OpenLDAP be significantly enhanced. As IT services become available to open-source project. Section III will describe the architecture customers in a dynamic, on-demand basis, it becomes vital to of OpenLDAP directory server, slapd, focusing on the design provide standard ways of information access and sharing. of the search operation in back-bdb. Section IV will introduce LDAP (Lightweight Directory Access Protocol) is a stan- five caching approaches proposed in this paper. Section V will dard access protocol for OSI X.500 [1] conforming directo- describe the experimental setup used throughout the paper. In ries. Because LDAP can run on top of TCP/IP instead of the Section VI, the design and performance analysis of the entry OSI protocol stack, LDAP was first used to alleviate the client cache for back-bdb will be described. After presenting the pro- side protocol overhead. LDAP clients connect to an LDAP filing mechanism used to identify performance bottlenecks in gateway that forwards requests and responses to and from an Section VII, the following three sections present the design and X.500 directory. Further in the direction of being lightweight, performance analysis of the four caches : the candidate cache it has become commonplace to use standalone LDAP direc- in Section VIII, the IDL cache and the IDL stack slab cache tory servers that store directory data directly in them without in Section IX, and the BER cache in Section X. Section XI the need of separate X.500 directories. concludes the paper. Because directory searches constitute the majority of direc- tory operations, it is particularly important to provide a high II. LDAP AND OPENLDAP performance directory search in terms of latency and through- put. Low latency is essential to low delay IT services that rely LDAP (Lightweight Directory Access Protocol) [4] is a on the directory. High throughput is essential because one di- standard directory access protocol of the Internet to access di- rectory server should be able to process a large number of re- rectories having the X.500 [1] naming and data models [5]. quests from multiple directory clients simultaneously. It is also LDAP defines an access protocol over TCP/IP that is a well important to provide a highly reliable directory service since defined subset of the X.500 DAP (Directory Access Protocol) critical data are usually stored in enterprise directories. to enable lightweight implementations. Its protocol syntax is ISBN: 1-56555-269-5 737 SPECTS ©03 defined in ASN.1 (Abstract Syntax Notation One) [6]. LDAP rectory server (slapd), a replication daemon (slurpd), client provides ten directory operations : search, compare, add, APIs (C, C++, Perl ...), and various client and server side tools. delete, modify, modifyDN, bind, unbind, abandon, and ex- Slapd is a multi-threaded directory server that can be easily tended. The exchanged protocol messages between the server configured to support various types of backends. and the client are encoded by using BER of ASN.1. LDAP The OpenLDAP directory software suite is currently being uses a restricted form of BER to reduce the complexity of the deployed as the default directory software in most Linux distri- BER encoding/decoding process [4]. The attribute values are butions including RedHat and SuSE. OpenLDAP is also being in a string format (LDAPString) in LDAP. widely used in many enterprise IT environments. LDAP uses a subset of the X.500 naming and data models and organizes directory entries hierarchically in a DIT (Direc- III. LDAP SEARCH OPERATION tory Information Tree). A DIT can be composed of multiple subtrees residing on different LDAP servers. A referral mech- This section will introduce the search operation and the ar- anism is provided to allow clients to chase a link across servers chitecture of OpenLDAP slapd directory server. when they encounter such boundaries. On the other hand, an LDAP server may host a forest consisting of multiple DITs. A. LDAP Search Request An entry is identified by a unique name called RDN (Rel- The following is the definition of the search request [4]: ative Distinguished Name) among its siblings under the com- mon superior entry and by a unique name called DN (Distin- SearchRequest ::= SEQUENCE { baseObject LDAPDN, guished Name) within the entire DIT. One or more attribute scope ENUMERATED { values of an entry form RDN of the entry. DN can be formed baseObject (0), by concatenating RDN and DN of its immediate superior. singleLevel (1), wholeSubtree (2) }, An entry consists of a set of attributes permitted by the en- derefAliases ENUMERATED { try’s object class defined in the directory schema. and system neverDerefAliases (0), derefInSearching (1), and user schema definitions. An attribute consists of an at- derefFindingBaseObj (2), tribute type followed by one or more attribute values. The at- derefAlways (3) }, tribute type designates the name and OID (object identifier) of sizeLimit INTEGER (0 .. maxInt), timeLimit INTEGER (0 .. maxInt), the attribute and the syntax, the matching rules, and the cardi- typesOnly BOOLEAN, nality of the attribute values [7]. The object class designates filter Filter, entry’s name, OID, description, and its superclass as well as attributes AttributeDescriptionList } the required and allowed attributes. The object class hierarchy baseObject defines the DN of the base object entry, the represents the class hierarchy of entries, whereas DIT repre- reference point relative to which the search is performed. If sents the hierarchy of directory entry objects. An object class scope is baseObject, only the baseObject is searched; if inherits attributes from its superclass. singleLevel, all direct subordinate entries of the base object OpenLDAP [2] is an open-source directory software suite are searched; if wholeSubtree, the entire subtree rooted at conforming to the LDAPv3 protocol [4] and supporting vari- the base object is searched. derefAliases indicates whether ous platforms including Linux, FreeBSD, Apple Mac OS/X, to dereference an aliases encountered during the search. Pos- Sun Solaris, Microsoft Windows. Currently, it sports a rich sible options are 1) always-dereference, 2) never-dereference, set of features [8] : LDAPv3 over both IPv4 and IPv6, SASL 3) dereference only upon positioning the base object, and 4) (Simple Authentication and Security Layer) support, TLS dereference except for the base object positioning. size- (Transport Layer