PostgreSQL as a benchmarking tool

How it was used to check and improve the scalability of the DragonFly

François Tigeot [email protected]

1/21 About Me

● Independent consultant, sysadmin ● Former ccTLD system engineer

● *BSD user since ~= 1999

● Also a PostgreSQL user since ~= 1999

● Introduced FreeBSD in the .fr registry ● DragonFly developer since 2011

2/21 About DragonFly

-like Operating System ● Forked from FreeBSD 4.8 in 2003

● By (not the actor)

● Aims to be high-performance

● Uses per-core replicated resources and messaging ● Many operations are naturally lockless

3/21 About DragonFly (2)

● Innovative features very useful for some workloads ● Swapcache: second-level file cache

● Uses existing swap infrastructure

● Optimized for SSDs

4/21 Swapcache

Relative PostgreSQL performance

120

100 m u m i x a m o t

80 g n i t a l e r

Dfly 3.2 e 60

n With swapcache a m r o f r e p

f 40 o e g a t n e c r 20 e P

0 50 75 100 125 150 175 200

Database size relative to main memory (%)

5/21 November 2011

● PostgreSQL 9.1

● DragonFly 2.10 and 2.13 (development version) ● Dual Xeon, 24 threads, 96GB RAM

● Global MP lock removed from the kernel after version 2.10

● Was looking for benchmarks showing CPU scalability

● PGbench (read-only) was a good fit

6/21 November 2011 (2)

● Some crashes and bugs with high PGbench loads ● Quickly fixed (generally in less than a day)

● Deadlocks in the VM subsystem

● Overflows and races in zalloc()

● Races in the SysV shared memory subsystem ● Etc...

7/21 November 2011 (3)

Pgbench 127.0.0.1 TPS scaling

120000

100000

80000

DragonFly 2.10 60000 Scientific 6.1

40000

20000

0 1 2 4 8 12 16 20 24 28 32 48 64 80

8/21 November 2011 (4)

● System changes to improve performance ● Remove MP lock from SysV semaphore code

● Improve select() and poll()

● New “dmalloc” lockless memory allocator in libc

● Improve other memory allocation code paths ● it possible to concurrently process huge numbers of faults

9/21 November 2011 (5)

Pgbench 127.0.0.1 TPS scaling

120000

100000

80000 DragonFly 2.10 DragonFly 2.13 60000 Scientific Linux 6.1

40000

20000

0 1 2 4 8 12 16 20 24 28 32 48 64 80

10/21 September-October 2012

● PostgreSQL 9.3 (development branch using mmap) ● DragonFly 3.0 and 3.1 (development, future 3.2)

● Dual-Xeon, 24 threads, 24GB RAM

● Benchmark ● bottleneck

● Fix or tweak

● Repeat ● Very time consuming

11/21 September-October 2012 (2)

PostgreSQL 9.3 performance

180000

160000

140000

120000

100000 DragonFly-3.0

s Scientific Linux 3.2 p t 80000

60000

40000

20000

0 1 3 6 9 12 15 18 24 32 48 64 80

clients

12/21 September-October 2012 (3)

PostgreSQL 9.3 performance

180000

160000

140000

120000

100000 DragonFly-3.0 DragonFly 3.2 s p t Scientific Linux 3.2 80000

60000

40000

20000

0 1 3 6 9 12 15 18 24 32 48 64 80

clients

13/21 September-October 2012 (4)

● Mihai Carabas added a CPU topology framework to the kernel (work sponsored by Google) ● Old BSD scheduler changed to take this information into account

● Still significant limitations due to the original design

● The scheduler itself was single-threaded

14/21 September-October 2012 (5)

● Matt Dillon wrote a new scheduler

● Schedules processes as close as possible to the place they were last run on

● Avoids unnecessary competition for resources

● Doesn't use different hardware threads from the same CPU core at the same time if possible

● Globally balances the load and takes the machine topology into account to avoid hot spots

15/21 September-October 2012 (6)

● Other improvements ● Many default values tuned for new 64-bit machines (buffer cache)

● PMAP MMU optimizations. Avoids having to fault huge amounts of pages for processes using shared memory.

● Read shortcuts through the VM subsystem

16/21 September-October 2012 (7)

PostgreSQL 9.3 performance : various improvements

180000

160000

140000

120000

DragonFly-3.0 100000 pmap mmu optimize

s vm_read_shortcut p t 80000 DragonFly 3.2 (new scheduler)

60000

40000

20000

0 1 3 6 9 12 15 18 24 32 48 64 80

clients

17/21 September-October 2012 (8)

● Performance improvements not PostgreSQL- specific ● Number of MP MMU invalidations globally reduced

● read() performance globally improved

● Across the board improvements of performance under load

18/21 March 2014

● PostgreSQL 9.3, DragonFly 3.6.1 ● Dual-Xeon, 40 threads, 128GB RAM

● No PostgreSQL-specific performance work this time

● Improvements wrt DragonFly 3.2 likely caused by analysis of Poudrière runs in 2013 ● Poudrière = package building tool originally from FreeBSD

● Very CPU + /exec + I/O intensive

19/21 March 2014 (2)

PostgreSQL 9.3 performance

350000

300000

250000

200000 DragonFly 3.6.1 7.0 S P

T Centos 6.5 150000

100000

50000

0 1 5 10 15 20 25 30 35 40 60 80

Clients

20/21 Thank you

● Questions ?

21/21