HTTP/WebDAV synchronization protocol optimizations. Piotr Mrowczynski

Optimizations and scope of this talk

- HTTP2 (https://github.com/owncloud/client/compare/http2)

- Bundling (https://github.com/owncloud/client/pull/5319)

- Request Scheduling (https://github.com/owncloud/client/pull/5440)

- Dynamic Chunking for new chunking algorithm (https://github.com/owncloud/client/pull/5368)

- Prioritize by modification time (https://github.com/owncloud/client/pull/5349)

Current ownCloud WebDAV / HTTP1.1 implementation

- HTTP/1.1 without pipelining – head of line blocking

- max 3-6 parallel connections (as in web browsers)

- each file is a single PUT / GET / DELETE / MKCOL / MOVE request within a single persistent (Keep-Alive header) connection

[Figure: requests (MKCOL, PUT, PUT, ...) over time on connections 1, 2, ..., 6]

- server can handle 100 requests in parallel at a specific moment → client will use max 6 anyway

- server with many concurrent syncs is overloaded with x6 opened connections (usually SSL)

- Latency: each file, sent as a separate request, has to waste time on latency (usually 15-300 ms)

1000 files / 6 parallel = 167 lines of blocking → x 53 ms = 9s, x 320 ms = 53s
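The arithmetic on this slide can be reproduced with a few lines of Python. This is only a sketch of the slide's model, assuming every file costs exactly one round trip and that files are processed in full rounds of 6; the function name is made up for illustration.

```python
import math

def latency_cost_seconds(n_files: int, parallel: int, latency_ms: float) -> float:
    """Time lost purely to latency if each file costs one round trip (assumed model)."""
    rounds = math.ceil(n_files / parallel)  # the "lines of blocking" from the slide
    return rounds * latency_ms / 1000.0

berlin = latency_cost_seconds(1000, 6, 53)      # 167 rounds at 53 ms
melbourne = latency_cost_seconds(1000, 6, 320)  # 167 rounds at 320 ms
print(f"53 ms: {berlin:.1f} s, 320 ms: {melbourne:.1f} s")
```

The slide rounds the two results to 9s and 53s.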

[Figure: operations over time; labels: Latency, Server-side latency and transfer]

HTTP1, HTTP2 and BUNDLING

[Figure: HTTP/1.1 timeline; MKCOL and PUT requests over time on connections 1, 2, ..., 6]

[Figure: BUNDLING timeline; MKCOL and BUNDLE requests over time on connections 1, 2, ..., 6; label: Reduced latency gain]

[Figure: HTTP/1.1, BUNDLING and HTTP2 with ownCloud's requests limitation on a common time axis; label: Reduced latency gain]

[Figure: bandwidth over time for HTTP/1.1 PUT requests on connection 1; labels: Max Parallel, Max Parallel Gap. Optimization target – request scheduling (will talk later)]

HTTP2 with ownCloud request limitation

If optimized by pumping more requests or binary data transfers:

- might hide request-response latency as in bundling

- might utilize bandwidth

- only limited by server/database capability of accepting parallel files

HTTP2 - possible benefits for ownCloud

- The parallel multiplexed requests and responses do not block each other

- Optimized and faster encryption

- Binary framing – fewer errors, less overhead and more

- Header compression

- Flow control (separate from TCP flow control)

BUNDLING - possible benefits for ownCloud

- Files are packed into a group of requests and sent over the network; a single response is returned

- This reduces latency and possibly improves network utilization

- Reduces PHP overhead (the script is fired up once for the whole group instead of once per file)

- PHP overhead can also be reduced by optimizing the server side for single requests

HTTP1 vs HTTP2 tests

Files in parallel limitation

1000 files 1kB – total 1MB of data

Measurement repeated 10 times using Smashbox Benchmarking Tool

[Figure: map of the test setup]

- Server: CERNBOX, Geneva, Switzerland

- Client: Berlin, Germany: SSD, 8GB RAM, 4x2.4GHz, WiFi, 53 ms latency, upload 76 Mbit/s, download 79 Mbit/s

- Client: Melbourne, Australia: Openstack, 12GB RAM, 4x2.5GHz, 320 ms latency, upload 220 Mbit/s, download 1 448 Mbit/s

Synchronization to CERNBox (EOS), Geneva, Switzerland

Protocol   Parallel Limit   Location     Upload Time [s]    Download Time [s]
HTTP1      6                53 ms, DE    115.9 +/- 39.6     58.7 +/- 17.9
HTTP2      6                53 ms, DE    98.8 +/- 34.8      53.8 +/- 19.2
HTTP1      6                320 ms, AU   230.1 +/- 27.2     151.4 +/- 44.1
HTTP2      6                320 ms, AU   186.7 +/- 27.6     213.0 +/- 24.1 (?)
HTTP1      100              320 ms, AU   209.8 +/- 21.0     129.4 +/- 21.7
HTTP2      100              320 ms, AU   39.6 +/- 11.7      42.6 +/- 9.2 (?)

Latency and WAN impact: upload with HTTP2 is 17s faster at 53 ms and 43s faster at 320 ms (parallel limit 6)
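The upload gains annotated on these slides (-17s, -43s, -170s) can be re-derived from the mean upload times in the table. The snippet below is a small sketch for that check; the dictionary layout and the function name are illustrative, and it assumes the parallel limit of each HTTP2 row equals that of the HTTP1 row above it.

```python
# Mean upload times [s] copied from the table:
# (protocol, parallel limit, location) -> seconds
upload_mean = {
    ("HTTP1", 6, "DE"): 115.9,
    ("HTTP2", 6, "DE"): 98.8,
    ("HTTP1", 6, "AU"): 230.1,
    ("HTTP2", 6, "AU"): 186.7,
    ("HTTP1", 100, "AU"): 209.8,
    ("HTTP2", 100, "AU"): 39.6,
}

def http2_upload_gain(limit: int, location: str) -> float:
    """Seconds saved by HTTP2 over HTTP1 for the same limit and location."""
    http1 = upload_mean[("HTTP1", limit, location)]
    http2 = upload_mean[("HTTP2", limit, location)]
    return round(http1 - http2, 1)

for limit, location in [(6, "DE"), (6, "AU"), (100, "AU")]:
    print(limit, location, http2_upload_gain(limit, location))
```

Only the means are compared here; the spread of the runs (the +/- values) is large for several rows and is not taken into account.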

The same table, annotated:

- -170s upload time for HTTP2 versus HTTP1 at parallel limit 100 (320 ms, AU)

- Client still allows only max 6 connections

- Marked on the slide: 1. Head of line blocking 2. Server bookkeeping blocks upload

- HTTP2 Pipelining

At 50 ms latency from Berlin to Geneva, we even got down to 20s for 1000 files → 50 Hz (files per second)

HTTP1 vs HTTP2 tests: SSL Overhead

12 files 1kB – total 12kB of data

Measurement repeated 10 times using Smashbox Benchmarking Tool

Synchronization to CERNBox (EOS), Geneva, Switzerland

Protocol   Files Synced   Location     Upload Time [s]   Download Time [s]
HTTP1      12             320 ms, AU   6.7 +/- 1.4       5.0 +/- 1.3
HTTP2      12             320 ms, AU   5.9 +/- 2.2       4.4 +/- 0.4

-1.8s: 3-way-handshake and SSL optimization gain, connection reuse limit (only ?)

HTTP1 vs BUNDLING tests

1000 files 1kB – total 1MB of data

Measurement repeated 10 times using Smashbox Benchmarking Tool

[Figure: map of the test setup]

- Server: DAMKEN CLOUD, Nuremberg, Germany

- Client: Berlin, Germany: SSD, 8GB RAM, 4x2.4GHz, WiFi, 37 ms latency, upload 76 Mbit/s, download 79 Mbit/s

- Client: Melbourne, Australia: Openstack, 12GB RAM, 4x2.5GHz, Ethernet, 279 ms latency, upload 220 Mbit/s, download 1 448 Mbit/s

Files in parallel limitation, 1000 files 1kB – total 1MB of data

Measurement repeated 3 times using Smashbox Benchmarking Tool

Synchronization to Damken Cloud, Nuremberg, Germany

Protocol   Bundled Files   Location     Upload Time [s]   Download Time [s]
HTTP1      -               37 ms, DE    155.1 +/- 0.4     41.0 +/- 0.8
Bundling   100             37 ms, DE    149.1 +/- 0.8     38.0 +/- 3.5
HTTP1      -               279 ms, AU   187.8 +/- 5.6     67.6 +/- 0.2
Bundling   100             279 ms, AU   158.4 +/- 3.2     66.0 +/- 1.3
Bundling   10              279 ms, AU   152.6 +/- 1.2     67.9 +/- 0.1

Upload gain with bundling: -6s at 37 ms, -28s at 279 ms (latency influence on HTTP1)

Bundling in this prototype works only for upload

Files in parallel limitation, 100 files 1kB – total 100kB of data

1000 files to bundle →10 requests needed

100 files to bundle→6 requests needed (as number of connections)

It turned out that for 100 files the upload sync time was reduced from 20s to 16s at 37ms latency
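The request counts above (10 requests for 1000 files, 6 requests for 100 files) are consistent with a simple splitting rule, sketched below. The rule itself is an assumption made for illustration: use at least one bundle per connection, and never put more than 100 files into one bundle. The function name and its parameters are hypothetical and are not taken from the prototype.

```python
import math

def plan_bundles(n_files: int, max_bundle_size: int = 100, connections: int = 6) -> list[int]:
    """Return the number of files per bundle request (assumed splitting rule)."""
    if n_files <= 0:
        return []
    # Enough bundles to respect the size cap, and enough to keep every connection busy.
    n_requests = max(math.ceil(n_files / max_bundle_size), min(connections, n_files))
    base, extra = divmod(n_files, n_requests)
    # Spread the remainder over the first `extra` bundles.
    return [base + 1 if i < extra else base for i in range(n_requests)]

print(len(plan_bundles(1000)), plan_bundles(100))
```

With 100 files the six bundles hold 16 or 17 files each, so all connections are used even though a single bundle could carry everything.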

Request Scheduling

“Wide and Narrow Pipe” problem with max 3-6 connections

[Figure: six 10 kB files in flight; 2 MB/s available]

[Figure: three 5 MB files in flight; 2 MB/s available]

Better solution utilizing 6 connections

[Figure: two 5 MB files and four 10 kB files in flight at the same time; 2 MB/s available]

Even better solution using HTTP/2 for fast and idle servers

Boosted using HTTP2 prioritization? (E. Bocchi, Politecnico di Torino)

[Figure: two 5 MB files and many 10 kB files multiplexed at the same time; 2 MB/s available]
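One way to read the “Wide and Narrow Pipe” pictures is with a rough utilization estimate. The model below is a simplification assumed for illustration, not a measurement: all connections carry equal-sized files, the files share the 2 MB/s pipe, and each round of requests costs one round trip (53 ms, the Berlin latency from the test setup).

```python
def pipe_utilization(file_size: int, connections: int = 6,
                     bandwidth: float = 2_000_000, rtt: float = 0.053) -> float:
    """Fraction of the available bandwidth actually used (assumed model)."""
    payload = connections * file_size     # bytes moved per round of requests
    transfer_time = payload / bandwidth   # seconds on the wire, pipe shared by all
    return payload / (rtt + transfer_time) / bandwidth

small = pipe_utilization(10_000)     # six 10 kB files in flight
big = pipe_utilization(5_000_000)    # six 5 MB files in flight
print(f"10 kB files: {small:.0%}, 5 MB files: {big:.0%}")
```

Under these assumptions the small files leave most of the pipe idle because the round trip dominates, while the big files fill it almost completely. Mixing both kinds of files aims at the second regime without making the small files wait.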

TU Berlin → TU Berlin Test

Measurement repeated 10 times using Smashbox Benchmarking Tool

Folder A: 50 x 100kB

Folder B: 20 x 5MB

Old implementation: folder-wise, first the folder with small files, then the folder with big files

New implementation: cross-folder, big files and small files at the same time

- Big files don't block smaller ones

- Small files don't block bigger ones

Upload: ~23s → ~20s

Download: ~16s → ~13s
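The cross-folder idea can be sketched as follows. This is not the algorithm of the pull request; it is a simplified illustration that assumes transfers run in rounds of 6 slots, reserves 2 slots for big files (as in the "better solution" picture) and uses a hypothetical 1 MB threshold between small and big.

```python
from collections import deque

def mixed_batches(files, slots=6, big_slots=2, big_threshold=1_000_000):
    """Group (path, size) pairs into rounds that mix big and small files."""
    small = deque(f for f in files if f[1] < big_threshold)
    big = deque(f for f in files if f[1] >= big_threshold)
    batches = []
    while small or big:
        batch = []
        while big and len(batch) < big_slots:   # reserved slots for big files
            batch.append(big.popleft())
        while small and len(batch) < slots:     # remaining slots for small files
            batch.append(small.popleft())
        while big and len(batch) < slots:       # no small files left: fill with big ones
            batch.append(big.popleft())
        batches.append(batch)
    return batches

# Test folders from the slide: A with 50 x 100kB, B with 20 x 5MB
folder_a = [(f"A/small_{i}", 100_000) for i in range(50)]
folder_b = [(f"B/big_{i}", 5_000_000) for i in range(20)]
batches = mixed_batches(folder_a + folder_b)
print(len(batches), [size for _, size in batches[0]])
```

A real client frees slots continuously instead of working in fixed rounds, so this only shows the ordering idea: neither folder has to finish before the other one starts.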

Future using HTTP2, Dynamic Chunking and Scheduling?

- Max Parallel Negotiation

Client: Sending you 3 files

Server: Ok, but I can handle 50 files at this moment

Client: Sending you 30 files

Server: My dynamic chunk is now 20 MB. I have more users now. Reduce to 15 files which fit into it
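The negotiation idea from the dialogue above could look roughly like this on the client side. Everything here is hypothetical, since no such protocol exists yet: the function name, the two server hints (a file limit and a byte budget equal to the dynamic chunk size) and the file sizes in the example are assumptions for illustration.

```python
def files_to_send(file_sizes, requested, server_max_files, chunk_budget_bytes):
    """How many of the queued files to send, given the server's current hints."""
    count = 0
    used = 0
    for size in file_sizes[:min(requested, server_max_files)]:
        if used + size > chunk_budget_bytes:  # next file no longer fits the budget
            break
        used += size
        count += 1
    return count

# 30 queued files of 1.3 MB each (made-up sizes), server budget of 20 MB
print(files_to_send([1_300_000] * 30, requested=30,
                    server_max_files=50, chunk_budget_bytes=20_000_000))
```

With these example sizes the client reduces the 30 files to the 15 that fit into 20 MB, which mirrors the second exchange in the dialogue.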

- Using HTTP2 PUSH in Discovery phase? (E. Bocchi, Politecnico di Torino)

Take Away Message

- Changing the way the requests are sent may significantly reduce the sync time (HTTP2, Bundling, Scheduling)

HTTP2

- Number of connections: 1

- Optimized transfer: optimized encryption, binary stream, header compression, 1 connection

- Number of files processed in parallel: limited only by server capability and client OS/hardware – probably max 100

- Observed improvement, CERNBox + EOS (1000 files): 115s → 20s on 50ms lat., 210s → 40s on 320ms lat., 6.7s → 5.9s for 12 files on 320ms lat.

Bundling

- Number of connections: max 6

- Optimized transfer: PHP overhead reduced, files “buffered” in bundles on server

- Number of files processed in parallel: max 6

- Observed improvement, DamkenCloud (1000 files): 155s → 149s on 37ms lat., 188s → 158s on 279ms lat., 20s → 16s for 100 files on 37ms lat.

- If you want to test HTTP2 or any of the features, please contact me [email protected]