Facebook Networking & the Open Compute Project (OCP)

March 16, 2016 - Open Networking Summit
Omar Baldonado, Network Team, OCP Networking Project Co-Chair

Launched Live Video

▪ Supports millions of concurrent viewers for a single stream
▪ Rolled out to dozens of countries in months
▪ More than 80% of our daily active users are outside the US

software everywhere

fboss and more…

[Chart: # of FBOSS/Wedges in production over time, annotated with capabilities added along the way: hybrid controllers, traffic shaping, network analytics, 100G, network modeling & simulation, circuit automation & testing, backbone & edge traffic engineering, passive & active monitoring, config automation & mgmt, IPv6]

operations over features

What happens when racks show up?

▪ Every week, every
▪ “Provisioning”
▪ RAM disks, addresses, images
▪ And what happens when they disappear? Upgrades

Changing the image every week (instead of twice/year)
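A minimal sketch of what makes that cadence tractable, with hypothetical names throughout (this is not FBOSS's actual tooling): treat the switch like a server and make provisioning an idempotent reconcile step, so a rack that just showed up and a switch that needs this week's image take the same code path:

from dataclasses import dataclass

DESIRED_IMAGE = "switch-image-2016.11"  # hypothetical weekly build tag

@dataclass
class Switch:
    """Stand-in for a real device API; every method here is hypothetical."""
    name: str
    image: str = "switch-image-2016.10"
    configured: bool = False

    def netboot_and_install(self, image: str) -> None:
        # In practice: PXE-boot a RAM disk installer, write the image, reboot.
        print(f"{self.name}: netboot RAM disk, install {image}, reboot")
        self.image = image

    def push_addresses_and_config(self) -> None:
        # In practice: assign addresses and push the rendered config.
        print(f"{self.name}: assign addresses, push config")
        self.configured = True

def reconcile(switch: Switch) -> None:
    """Idempotent: run it when a rack shows up, and again when an image ships."""
    if switch.image != DESIRED_IMAGE:
        switch.netboot_and_install(DESIRED_IMAGE)
    if not switch.configured:
        switch.push_addresses_and_config()

if __name__ == "__main__":
    for sw in (Switch("rack1-tor"), Switch("rack2-tor", image=DESIRED_IMAGE)):
        reconcile(sw)

Because reconcile() only acts on whatever has drifted, rerunning it fleet-wide is safe, which is what turns an image change from a twice-a-year event into a weekly routine.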

https://code.facebook.com/posts/145488969140934/open-networking-advances-with-wedge-and-fboss/

fail fast over fail-proof

NetNORAD

detect network interruptions and automatically mitigate them within seconds
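As a rough sketch of the probing idea (not Facebook's implementation; the port, probe counts, and payload format below are assumptions), every host runs a small UDP responder, pingers fan probes out across the fleet, and packet loss plus worst-case RTT are the signals that feed the alarms:

import socket
import threading
import time

def responder(port=31337):
    """Runs on every host: echo each UDP probe straight back to its sender."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))
    while True:
        data, addr = sock.recvfrom(1024)
        sock.sendto(data, addr)

def probe(host, port=31337, count=100, timeout=0.5):
    """Probe one target; return (loss fraction, worst RTT in seconds)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    rtts = []
    for seq in range(count):
        # Embed a sequence number and send timestamp in the probe itself.
        sock.sendto(f"{seq}:{time.monotonic()}".encode(), (host, port))
        try:
            data, _ = sock.recvfrom(1024)
            rtts.append(time.monotonic() - float(data.split(b":")[1]))
        except socket.timeout:
            pass  # an unanswered probe counts as loss
    return 1 - len(rtts) / count, max(rtts, default=None)

if __name__ == "__main__":
    threading.Thread(target=responder, daemon=True).start()
    time.sleep(0.1)  # give the responder a moment to bind
    print(probe("127.0.0.1", count=10))

Fault isolation then comes from aggregating these per-target results by rack, cluster, and region, and alarming on the smallest scope where loss shows up; the blog post below has the real design.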

https://code.facebook.com/posts/1534350660228025/netnorad-troubleshooting-networks-via-end-to-end-probing/

remember the whole network lifecycle

[Diagram: the Open Data Center Stack: Open Rack, Battery, Power, Cooling, Wedge, 6-Pack, Leopard, Knox, Cold Storage. Scalable, efficient, flexible.]

[Timeline, 2011-2015: Facebook's OCP contributions, including the Data Center; the Freedom and Spitfire (AMD) servers, Triplet rack, Battery Cabinet, and Power Supply; Windmill and Watermark (AMD); Open Rack v1 and v2; Mezzanine Cards v1 and v2; Group Hug; Winterfell; Knox and Cold Storage; Micro Server (Panther); Wedge; Honey Badger; Yosemite; and 6-Pack]

OCP Networking as of March 2015

▪ One accepted switch
▪ Software building blocks
▪ Testing efforts starting

Takeaway: Disaggregation was here, but still ramping up!

What a difference a year makes

OCP networking hardware

▪ Full design packages
▪ Community review
▪ Testing program
▪ Disaggregation
▪ Hardware and software
▪ Multiple layers

11 OCP data center switches accepted

▪ 16x40G
▪ 48x10G
▪ 32x40G
▪ 36x40G
▪ 32x100G

Newly shared OCP specs - new DC switches

▪ Facebook Wedge 100
▪ Alpha 48x10G and 32x100G

Newly shared OCP specs - new silicon

▪ 48x10G MediaTek/Nephos
▪ 32x100G Edge-core with Cavium

Newly shared OCP specs - chassis/modular

▪ Facebook “6-pack” - 128x40G
▪ Edge-core 256x100G, 512x100G

Newly shared OCP devices - edge & access

▪ Edge - based on Broadcom “Qumran” - deep buffers, expandable TCAM
▪ Access - 48x1G w/ stacking & PoE options

Newly shared OCP devices - access points

▪ 2 indoor, 1 outdoor
▪ 802.11ac

OCP hardware needs… software

Software

▪ Every OCP networking device supports choice in software

OCP software - moving up the stack

▪ Initial work was in “building blocks”
▪ ONIE, ONL, SAI
▪ Still continuing
▪ Moving up to actual forwarding functionality

A growing ecosystem of software

▪ Multiple projects and providers emerging
▪ Open source and commercial
▪ Distributed and centralized

OCP Wedge Demos:

▪ Managing Wedge via “Metal-as-a-Service”
▪ Created an FBOSS snap
▪ OCP Hack-a-thon - created an OpenSwitch snap

OCP Wedge Demos:

▪ TORC - “Applications, Microservices, VNFs controlled by Top-of-Rack Controller”
▪ Used Wedge’s micro-server extensively
▪ Docker, Mesos Master, FBOSS, OpenNSL, ONL, OpenBMC, Calico

OCP Wedge Demos:

▪ “Evolving a telecom operator network into an IT convergence network”
▪ Ported OpenSwitch to Wedge
▪ Ported Indigo to Wedge
▪ OpenFlow support
▪ Interested in SAI

What’s next for Facebook Networking & OCP?

▪ Working with the ecosystem and user community
▪ Reaching new areas of the network with OCP Telco and TIP
▪ Code, code, code