What's New in OpenLDAP
Howard Chu CTO, Symas Corp / Chief Architect OpenLDAP
FOSDEM'14

OpenLDAP Project
● Open source code project
● Founded 1998
● Three core team members
● A dozen or so contributors
● Feature releases every 12-18 months
● Maintenance releases roughly monthly

A Word About Symas
● Founded 1999
● Founders from Enterprise Software world
  – Platinum Technology (Locus Computing)
  – IBM
● Howard joined OpenLDAP in 1999
  – One of the Core Team members
  – Appointed Chief Architect January 2007
● No debt, no VC investments

Intro: Howard Chu

● Founder and CTO, Symas Corp.
● Developing Free/Open Source software since the 1980s
  – GNU compiler toolchain, e.g. "gmake -j", etc.
  – Many other projects, check ohloh.net...
● Worked for NASA/JPL, wrote software for the Space Shuttle, etc.
What's New
● Lightning Memory-Mapped Database (LMDB) and its knock-on effects
  – Within OpenLDAP code
  – Other projects
● New HyperDex clustered backend
● New Samba4/AD integration work
● Other features
● What's missing

LMDB
● Introduced at LDAPCon 2011
● Full ACID transactions
● MVCC: readers and writers don't block each other
● Ultra-compact: compiles to under 32KB
● Memory-mapped, lightning-fast zero-copy reads
● Much greater CPU and memory efficiency
● Much simpler configuration

LMDB Impact
● Within OpenLDAP
  – Revealed other frontend bottlenecks that had been hidden by the BerkeleyDB-based backends
  – Addressed in OpenLDAP 2.5
● Thread pool enhanced to support multiple work queues, reducing mutex contention
● Connection manager enhanced, simplifying write synchronization

OpenLDAP Frontend
● Testing in 2011 (16-core server):
  – back-hdb: 62000 searches/sec, 1485% CPU
  – back-mdb: 75000 searches/sec, 1000% CPU
  – back-mdb, 2 slapds: 127000 searches/sec, 1250% CPU (network limited)
● We should not have needed two processes to hit this rate

Efficiency Note
● back-hdb: 62000 searches/sec @ 1485% CPU
  – 41.75 searches per CPU %
● back-mdb: 127000 searches/sec @ 1250% CPU
  – 101.60 searches per CPU %
  – 2.433x as many searches per unit of CPU
● "Performance" isn't the point; *Efficiency* is what matters

OpenLDAP Frontend
● Threadpool contention
  – Analyzed using mutrace
  – Found the #1 bottleneck was the threadpool mutex
  – Modified the threadpool to support multiple queues
  – On a quad-core laptop, using 4 queues reduced mutex contended time by a factor of 6
  – Reduced condition variable contention by a factor of 3
  – Overall 20% improvement in throughput on a quad-core VM

OpenLDAP Frontend
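The multiple-queue idea described above can be sketched in a few lines: instead of one mutex-protected queue shared by every worker, requests are hashed across several independent queues, each with its own lock, so contended time on any single mutex drops. All names here are illustrative; this is not slapd's actual threadpool code.

```c
/* Illustrative multi-queue work distribution, not slapd's code. */
#include <pthread.h>
#include <stdlib.h>

#define NQUEUES 4

struct task { struct task *next; void (*fn)(void *); void *arg; };

static struct queue {
    pthread_mutex_t lock;
    struct task *head, *tail;
} queues[NQUEUES];

static void queues_init(void)
{
    for (int i = 0; i < NQUEUES; i++)
        pthread_mutex_init(&queues[i].lock, NULL);
}

/* Hash a connection id to a queue: contention now splits NQUEUES ways */
static struct queue *pick_queue(unsigned long connid)
{
    return &queues[connid % NQUEUES];
}

static void submit(unsigned long connid, void (*fn)(void *), void *arg)
{
    struct queue *q = pick_queue(connid);
    struct task *t = malloc(sizeof *t);
    t->next = NULL; t->fn = fn; t->arg = arg;
    pthread_mutex_lock(&q->lock);   /* sees only 1/NQUEUES of the traffic */
    if (q->tail) q->tail->next = t; else q->head = t;
    q->tail = t;
    pthread_mutex_unlock(&q->lock);
}

/* A worker drains its own queue; returns the number of tasks run */
static int drain(struct queue *q)
{
    struct task *t;
    int n = 0;
    pthread_mutex_lock(&q->lock);
    t = q->head;
    q->head = q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    while (t) {
        struct task *next = t->next;
        t->fn(t->arg);
        free(t);
        t = next;
        n++;
    }
    return n;
}

static void bump(void *arg) { ++*(int *)arg; }
```

Hashing by connection id also keeps operations from one connection on one queue, preserving per-connection ordering.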
● Connection Manager
  – Also a single thread, accepting new connections and polling for read/write readiness on existing ones
  – Can now be split across multiple threads
● Impact depends on the number of connections
  – Polling for write is no longer handled by the listener thread
● Removes one level of locks and indirection
● Simplifies the WriteTimeout implementation
● Typically no benchmark impact; only significant when blocking on writes to slow clients

OpenLDAP Frontend
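The split connection manager can be sketched as follows: each I/O thread owns its own epoll instance, the listener hands accepted sockets out round-robin, and fds are registered for read-readiness only — write interest is added separately, only when a write would actually block. Linux-only; names and structure are illustrative, not slapd's actual code.

```c
/* Illustrative per-thread epoll connection manager, not slapd's code. */
#include <sys/epoll.h>
#include <unistd.h>

#define NIO 2                       /* number of I/O threads */

static int epfds[NIO];
static unsigned next_io;

static int io_init(void)
{
    for (int i = 0; i < NIO; i++)
        if ((epfds[i] = epoll_create1(0)) < 0)
            return -1;
    return 0;
}

/* Listener calls this after accept(): pick an I/O thread round-robin
 * and register the new fd for reads only.  Returns the thread index. */
static int assign_conn(int fd)
{
    int t = next_io++ % NIO;
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(epfds[t], EPOLL_CTL_ADD, fd, &ev) < 0)
        return -1;
    return t;
}

/* Only when a send would block does anyone start polling for write:
 * the common case never pays for write-readiness bookkeeping. */
static int watch_writable(int thread, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLOUT, .data.fd = fd };
    return epoll_ctl(epfds[thread], EPOLL_CTL_MOD, fd, &ev);
}
```

Because no single thread polls every descriptor, the benefit grows with the connection count, matching the observation above.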
[Chart: Frontend improvements on a quad-core VM — SearchRate, AuthRate, and ModRate in ops/second, OpenLDAP 2.4 vs. OpenLDAP 2.5]

OpenLDAP Frontend
● Putting it into context, compared to:
  – OpenLDAP 2.4 back-mdb and back-hdb
  – OpenLDAP 2.4 back-mdb on Windows 2012 x64
  – OpenDJ 2.4.6, 389DS, ApacheDS 2.0.0-M13
  – Latest proprietary servers from CA, Microsoft, Novell, and Oracle

OpenLDAP Frontend
[Chart: LDAP performance — Search, Mixed Search, Modify, and Mixed Mod rates in ops/second for OL mdb, OL hdb, OL mdb W64, OpenDJ, 389DS, ApacheDS, AD LDS 2012, and Others #1–#4]

OpenLDAP Frontend
[Chart: the same LDAP performance comparison with OpenLDAP 2.5 back-mdb added — Search, Mixed Search, Modify, and Mixed Mod rates in ops/second]

LMDB Impact
● Adoption by many other projects
● Outperforms all other embedded databases in common applications
  – CFengine, Postfix, PowerDNS, etc.
● Has none of the reliability/integrity weaknesses of other databases
● Has none of the licensing issues...
● Integrated into multiple NoSQL projects
  – Redis, SkyDB, Memcached, HyperDex, etc.

LMDB Microbenchmark
● Comparisons based on Google's LevelDB
● Also tested against Kyoto Cabinet's TreeDB, SQLite3, and BerkeleyDB
● Tested using a RAM filesystem (tmpfs), reiserfs on SSD, and multiple filesystems on HDD
  – btrfs, ext2, ext3, ext4, jfs, ntfs, reiserfs, xfs, zfs
  – ext3, ext4, jfs, reiserfs, xfs also tested with external journals

LMDB Microbenchmark
● Relative footprint (size(1) output):

    text     data  bss   dec      hex     filename
    272247   1456  328   274031   42e6f   db_bench
    1675911  2288  304   1678503  199ca7  db_bench_bdb
    90423    1508  304   92235    1684b   db_bench_mdb
    653480   7768  1688  662936   a2764   db_bench_sqlite3
    296572   4808  1096  302476   49d8c   db_bench_tree_db
● Clearly LMDB has the smallest footprint
  – Carefully written C code beats C++ every time

LMDB Microbenchmark
[Charts: read performance, small records — sequential and random reads/sec for SQLite3, TreeDB, LevelDB, BDB, and MDB]

LMDB Microbenchmark
[Charts: read performance, large records — reads/sec.
Sequential: SQLite3 7402, TreeDB 16514, LevelDB 299133, BDB 9133, MDB 30303030.
Random: SQLite3 7047, TreeDB 14518, LevelDB 15183, BDB 8646, MDB 1718213.]

LMDB Microbenchmark
[Charts: read performance, large records, log scale — same data as the previous slide]

LMDB Microbenchmark
[Charts: asynchronous write performance, large records on tmpfs — sequential and random writes/sec for SQLite3, TreeDB, LevelDB, BDB, and MDB]

LMDB Microbenchmark
[Charts: batched write performance, large records on tmpfs — sequential and random writes/sec for SQLite3, TreeDB, LevelDB, BDB, and MDB]

LMDB Microbenchmark
[Charts: synchronous write performance, large records on tmpfs — sequential and random writes/sec for SQLite3, TreeDB, LevelDB, BDB, and MDB]

MemcacheDB
[Charts: MemcacheDB read and write latency, single thread, log scale — min/avg/90th/95th/99th/max msec for BDB 4.7, MDB, and Memcached]

MemcacheDB
[Charts: MemcacheDB read and write latency, 4 threads, log scale — min/avg/90th/95th/99th/max msec for BDB 4.7, MDB, and Memcached]

HyperDex
● New generation NoSQL database server
● http://hyperdex.org
● Simple configuration/deployment
● Multidimensional indexing/sharding
● Efficient distributed search engine
● Built on Google LevelDB, later moved to their own fixed fork, HyperLevelDB
● Ported to LMDB

LMDB, HyperDex
● CPU time used for inserts:
  – LMDB: 19:44.52
  – HyperLevelDB: 96:46.96
● HyperLevelDB used 4.9x more CPU for the same number of operations
● Again, performance isn't the point. Throwing extra CPU at a job to "make it go faster" is stupid.

LMDB, HyperDex
● CPU time used for read/update:
  – LMDB: 1:33.17
  – HyperLevelDB: 3:37.67
● HyperLevelDB used 2.3x more CPU for the same number of operations

LMDB, HyperDex

● CPU time used for inserts:
  – LMDB: 227:26
  – HyperLevelDB: 3373:13
● HyperLevelDB used 14.8x more CPU for the same number of operations

LMDB, HyperDex
● CPU time used for read/update:
  – LMDB: 4:21.41
  – HyperLevelDB: 17:27
● HyperLevelDB used 4.0x more CPU for the same number of operations

back-hyperdex
● New clustered backend built on HyperDex
  – The existing back-ndb clustered backend is deprecated; Oracle has refused to cooperate on support
● Nearly complete LDAP support
  – Currently has limited search filter support
  – Uses a flat (back-bdb style) namespace, not hierarchical
● Still in the prototype stage, as the HyperDex API is still in flux

Samba4/AD
● Samba4 provides its own ActiveDirectory-compatible LDAP service
  – Built on the Samba ldb/tdb libraries
  – Supports AD replication
● Has some problems
  – Incompatible with Samba3+OpenLDAP deployments
  – Originally attempted to interoperate with OpenLDAP, but that work was abandoned
  – Poor performance

Samba4/AD
● OpenLDAP interop work revived
● Two opposite approaches being pursued in parallel
  – Resurrect the original interop code
  – Port the functionality into slapd overlays
● Currently about 75% of the test suite passes
● Keep an eye on contrib/slapd-modules/samba4

Other Features
● cn=config enhancements
  – Support for the LDAPDelete op
  – Support for the slapmodify/slapdelete offline tools
● LDAP transactions
  – Needed for Samba4 support
● Frontend/overlay restructuring
  – Rationalize Bind and ExtendedOp result handling
  – Other internal API cleanup

What's Missing
● Deprecated BerkeleyDB-based backends
  – back-bdb was deprecated in 2.4
  – back-hdb deprecated in 2.5
  – Both scheduled for deletion in 2.6
  – configure switches renamed, so existing packaging scripts can no longer enable them without explicit action

Questions?
Thanks!