Advanced Topics in Communication Networks Programming Network Data Planes

Laurent Vanbever nsg.ee.ethz.ch

ETH Zürich Oct 25 2018 Two weeks ago on Advanced Topics in Communication Networks R ec ap

A bloom lter is a streaming answering speci c questions approximately.

6 R ec ap

A bloom lter is a answering speci c questions approximately.

Is X in the stream? What is in the stream?Invertible Bloom Filter

7 A bloom lter is a streaming algorithm answering speci c questions approximately.

Is X in the stream? What is in the stream?Invertible Bloom Filter

What about other questions?

8 Is a certain ow in the stream? Bloom Filter

What ows are in the stream? Invertible Bloom Filter, HyperLogLog Sketch, ...

How frequently does an ow appear? Count Sketch, CountMin Sketch, ...

What are the most frequent elements? Count/CountMin + Heap, …

How many ows belong to a certain subnet? SketchLearn SIGCOMM ‘18 12 In the worst case, an algorithm providing exact frequencies requires linear space.

Data Stream n elements in total → n distinct elements (in the worst case) → n counters required? :(

18 Probabilistic datastructures can help again!

Bloom Filters Sketches quickly “lter” only those provide a approximate elements that might be in frequencies of elemetns the in a data stream. Save space by allowing Save space by allowing false positives. mis-counting.

20 A CountMin sketch uses the same principles as a counting bloom lter, but is designed to have provable L1 error bounds for frequency queries.

22 A CountMin sketch uses the same principles as a counting bloom lter, but is designed to have provable L1 error bounds for frequency queries.

24 relative to L1 norm

Pr [ x^i − xi ≥ ε‖x‖1 ]≤δ estimated true sum of frequency frequency frequencies

The estimation error exceeds ε‖x‖1 with a probability smaller than δ

26 A CountMin sketch uses the same principles as a counting bloom lter, but is designed to have provable L1 error bounds for frequency queries.

28 A CountMin Sketch uses multiple arrays and hashes.

counters w indices per array (range of hashes) "

6 " d arrays w⋅d counters (one per array) (total size) 29 A CountMin sketch uses the same principles as a counting bloom lter, but is designed to have provable L1 error bounds for frequency queries.

CountMin sketch recipe

1 e Choose d = ln , w = ⌈ δ ⌉ ⌈ε ⌉

Then x^i − xi ≥ε‖x‖1 with a probability less than δ 66 A CountMin sketch uses the same principles as a counting bloom lter, but is designed to have provable L1 error bounds for frequency queries.

→ only one design out of many!

67 A Count sketchMin uses the same principles as a counting bloom lter, but is designed to have provable L2 error bounds for frequency queries.

68 The Count sketch uses additional hashing to give L2 error bounds, but requires more resources.

CountMin sketch recipe

1 e Choose d = ln , w = ⌈ δ ⌉ ⌈ε ⌉

Then x^i − xi ≥ε‖x‖1 with a probability less than δ

Count sketch recipe

1 e Choose d = ln , w = ⌈ δ ⌉ ⌈ε2 ⌉

Then x^i − xi ≥ε‖x‖2 with a probability less than δ 72 Sketches are the new black

...and many more!

OpenSketch UnivMon SketchLearn NSDI ‘13 SIGCOMM ‘16 SIGCOMM ‘18

[source] [source] [source] Today we’ll talk about: important questions,

how ‘sketches’ answer them,

limitations of ‘sketches’,

and my master thesis :)

76 Sketches compute statistical summaries, favoring elements with high frequency.

Let ε=0.01, ‖x‖1 =10000 (⇒ ε⋅‖x‖1 =100)

Assume two flows xa , xb ,

with ‖xa‖1 =1000, ‖xb‖1 =50

Error relative to stream size: 1%

3ow size: xa: 10%, xb: 200%

80 Other Problems a Sketch can’t handle

causality patterns rare things

81 This week on Advanced Topics in Communication Networks We will look at one example of a P4 hardware switch along with two examples of P4-enabled applications

P4 NetCache NetChain hardware

How to map P4 to load-balancing for consistent, fault-tolerant Barefoot Tofino's platform key-value store key-value store

[SOSP'17] [NSDI'18]

+ Albert Gran's Master Thesis presentation on "Making Scheduling Programmable" See https://adv-net.ethz.ch for follow-up slides Advanced Topics in Communication Networks Programming Network Data Planes

Laurent Vanbever nsg.ee.ethz.ch

ETH Zürich Oct 25 2018