Facebook Networking & the Open Compute Project (OCP)
March 16, 2016 - Open Networking Summit
Omar Baldonado, Facebook Network Team, OCP Networking Project Co-Chair
Launched Live Video
▪ Supports millions of concurrent viewers for a single stream
▪ Rolled out to dozens of countries in months
More than 80% of our daily active users are outside the US
software everywhere
fboss and more…
▪ Hybrid controllers
▪ Traffic shaping
▪ Network analytics
▪ 100G
▪ Network modeling, simulation & testing
▪ Circuit automation
▪ Backbone & edge traffic engineering
▪ Passive & active monitoring & mgmt
▪ Config automation
▪ IPv6
operations over features
[Chart: number of FBOSS/Wedges in production over time]
What happens when racks show up?
▪ Every week, every data center
▪ “Provisioning”
▪ RAM disks, addresses, images
▪ And what happens when they disappear?
Upgrades
Changing the image every week (instead of twice/year)
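A minimal sketch of what a weekly, batched image rollout across a switch fleet could look like. This is illustrative only: the inventory names, image name, push_image(), and is_healthy() helpers are hypothetical stand-ins, not Facebook's actual provisioning tooling.

```python
"""Illustrative sketch: staged, health-checked image rollout across a fleet."""

import time

SWITCHES = [f"rsw{i:03d}.dc1" for i in range(1, 13)]  # hypothetical inventory
BATCH_SIZE = 4                                         # upgrade a few at a time
NEW_IMAGE = "fboss-image-weekly"                       # hypothetical image name


def push_image(switch: str, image: str) -> None:
    """Stand-in for copying an image to a switch and rebooting into it."""
    print(f"pushing {image} to {switch}")


def is_healthy(switch: str) -> bool:
    """Stand-in for post-upgrade checks (ports up, BGP sessions established)."""
    return True


def rollout(switches, image, batch_size):
    """Upgrade in small batches; stop early if any batch fails health checks."""
    for start in range(0, len(switches), batch_size):
        batch = switches[start:start + batch_size]
        for sw in batch:
            push_image(sw, image)
        time.sleep(1)  # in practice: wait for reboot and for traffic to return
        bad = [sw for sw in batch if not is_healthy(sw)]
        if bad:
            print(f"halting rollout, unhealthy switches: {bad}")
            return False
    return True


if __name__ == "__main__":
    rollout(SWITCHES, NEW_IMAGE, BATCH_SIZE)
```

Batching plus a halt-on-failure check is what makes a weekly cadence tolerable: a bad image stops after a handful of racks instead of a whole data center.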
https://code.facebook.com/posts/145488969140934/open-networking-advances-with-wedge-and-fboss/
fail fast over fail-proof
NetNORAD
▪ Detect network interruptions and automatically mitigate them within seconds
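A minimal sketch of the end-to-end probing idea behind NetNORAD: agents send UDP probes to responders across the fleet and track loss and RTT per target, alarming when loss crosses a threshold. The port number and threshold below are made up, and this is not Facebook's actual pinger/responder code.

```python
"""Illustrative sketch: UDP end-to-end probing with loss/RTT measurement."""

import socket
import threading
import time

PORT = 31338  # arbitrary port chosen for this sketch


def responder(stop: threading.Event) -> None:
    """Echo every UDP probe back to its sender."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", PORT))
    sock.settimeout(0.2)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(1024)
            sock.sendto(data, addr)
        except socket.timeout:
            continue
    sock.close()


def probe(target: str, count: int = 100, timeout: float = 0.05):
    """Send `count` probes to `target`; return (loss_fraction, avg_rtt_ms)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    received, rtt_sum = 0, 0.0
    for seq in range(count):
        sent_at = time.monotonic()
        sock.sendto(str(seq).encode(), (target, PORT))
        try:
            sock.recvfrom(1024)
            received += 1
            rtt_sum += time.monotonic() - sent_at
        except socket.timeout:
            pass  # counted as loss
    sock.close()
    loss = 1.0 - received / count
    avg_rtt_ms = (rtt_sum / received * 1000) if received else float("nan")
    return loss, avg_rtt_ms


if __name__ == "__main__":
    stop = threading.Event()
    threading.Thread(target=responder, args=(stop,), daemon=True).start()
    time.sleep(0.1)
    loss, rtt = probe("127.0.0.1")
    print(f"loss={loss:.1%} avg_rtt={rtt:.2f} ms")
    if loss > 0.02:  # hypothetical alerting threshold
        print("loss above threshold: raise an alarm / trigger mitigation")
    stop.set()
```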
https://code.facebook.com/posts/1534350660228025/netnorad-troubleshooting-networks-via-end-to-end-probing/
remember the whole network lifecycle
[Diagram: the Open Data Center Stack - scalable, efficient, flexible - spanning Open Rack, Wedge, 6-Pack, Battery, Power, Knox, Cold Storage, Leopard, and Cooling]
[Timeline of Facebook OCP contributions, 2011-2015: Freedom, Spitfire (AMD), Triplet Rack, Power Supply, Battery Cabinet, Windmill (Intel), Watermark (AMD), Open Rack v1, Mezzanine Card v1, Knox, Group Hug, Winterfell, Open Rack v2, Cold Storage, Micro Server (Panther), Wedge, Mezzanine Card v2, Honey Badger, Yosemite, 6-Pack, Open Data Center Stack]
OCP Networking as of March 2015
▪ One accepted switch
▪ Software building blocks
▪ Testing efforts starting
Takeaway: Disaggregation was here, but still ramping up!
What a difference a year makes
OCP networking hardware
▪ Full design packages
▪ Community review
▪ Testing program
▪ Disaggregation
▪ Hardware and software
▪ Multiple layers
11 OCP data center switches accepted
▪ 16x40G
▪ 48x10G
▪ 32x40G
▪ 36x40G
▪ 32x100G
Newly shared OCP specs - new DC switches
▪ Facebook Wedge 100
▪ Alpha 48x10G and 32x100G
Newly shared OCP specs - new silicon
▪ 48x10G MediaTek/Nephos
▪ 32x100G Edge-core with Cavium
Newly shared OCP specs - chassis/modular
▪ Facebook “6-Pack” - 128x40G
▪ Edge-core 256x100G, 512x100G
Newly shared OCP devices - edge & access
▪ Edge - based on Broadcom “Qumran” - deep buffers, expandable TCAM
▪ Access - 48x1G w/ stacking & PoE options
Newly shared OCP devices - access points
▪ 2 indoor, 1 outdoor
▪ 802.11ac
OCP hardware needs… software
Software
▪ Every OCP networking device supports choice in software
OCP software - moving up the stack
▪ Initial work was in “building blocks”
▪ ONIE, ONL, SAI
▪ Still continuing
▪ Moving up to actual forwarding functionality (see the sketch below)
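A purely illustrative sketch of what "moving up the stack" can look like: a forwarding agent keeping hardware in sync with a routing table through a switch-abstraction layer. The FakeAsic class below is a hypothetical stand-in for an SDK/SAI-like layer, not the real SAI API (which is a C interface), and the ForwardingAgent is not FBOSS.

```python
"""Illustrative sketch: a forwarding agent over a hypothetical switch abstraction."""

import ipaddress


class FakeAsic:
    """Hypothetical switch-abstraction layer: holds a hardware route table."""

    def __init__(self):
        self.routes = {}

    def add_route(self, prefix: str, next_hop: str) -> None:
        self.routes[ipaddress.ip_network(prefix)] = ipaddress.ip_address(next_hop)

    def del_route(self, prefix: str) -> None:
        self.routes.pop(ipaddress.ip_network(prefix), None)


class ForwardingAgent:
    """The "actual forwarding functionality": keep hardware in sync with a RIB."""

    def __init__(self, asic: FakeAsic):
        self.asic = asic
        self.rib = {}

    def update(self, new_rib: dict) -> None:
        # Program added or changed routes, then remove withdrawn ones.
        for prefix, nh in new_rib.items():
            if self.rib.get(prefix) != nh:
                self.asic.add_route(prefix, nh)
        for prefix in set(self.rib) - set(new_rib):
            self.asic.del_route(prefix)
        self.rib = dict(new_rib)


if __name__ == "__main__":
    agent = ForwardingAgent(FakeAsic())
    agent.update({"10.0.0.0/24": "192.168.1.1", "10.0.1.0/24": "192.168.1.2"})
    agent.update({"10.0.0.0/24": "192.168.1.9"})  # one route changed, one withdrawn
    print(agent.asic.routes)
```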
A growing ecosystem of software
▪ Multiple projects and providers emerging
▪ Open source and commercial
▪ Distributed and centralized
▪ SAI
OCP Wedge Demos:
▪ Managing Wedge via “Metal-as-a-Service”
▪ Created an FBOSS snap
▪ OCP Hack-a-thon - created an OpenSwitch snap
OCP Wedge Demos:
▪ TORC - “Applications, Microservices, VNFs controlled by Top-of-Rack Controller”
▪ Used Wedge’s micro-server extensively
▪ Docker, Mesos Master, FBOSS, OpenNSL, ONL, OpenBMC, Calico
OCP Wedge Demos:
▪ “Evolving a telecom operator network into an IT convergence network”
▪ Ported OpenSwitch to Wedge
▪ Ported Indigo to Wedge
▪ OpenFlow support
▪ Interested in SAI
What’s next for Facebook Networking & OCP?
▪ Working with the ecosystem and user community
▪ Reaching into new areas of the network with OCP Telco and TIP
▪ Code, code, code