Data Center Bridging

IEEE 802 Tutorial, 12 November 2007

Contributors and Supporters

- Hugh Barrass (Cisco)
- Jan Bialkowski (Infinera)
- Bob Brunner (Ericsson)
- Craig Carlson (QLogic)
- Mukund Chavan ()
- Rao Cherukuri ()
- Uri Cummings (Fulcrum Micro)
- Norman Finn (Cisco)
- Anoop Ghanwani (Brocade)
- Mitchell Gusat (IBM)
- Asif Hazarika (Fujitsu Microelectronics)
- Zhi Hern-Loh (Fulcrum Micro)
- Mike Ko (IBM)
- Menu Menuchehry (Marvell)
- Joe Pelissier (Cisco)
- Renato Recio (IBM)
- Guenter Roeck (Teak Technologies)
- Ravi Shenoy (Emulex)
- John Terry (Brocade)
- Pat Thaler (Broadcom)
- Manoj Wadekar (Intel)
- Fred Worley (HP)

Agenda

- Introduction: Pat Thaler
- Background: Manoj Wadekar
- Gap Analysis: Anoop Ghanwani
- Solution Framework: Hugh Barrass
- Potential Challenges and Solutions: Joe Pelissier
- 802.1 Architecture for DCB: Norm Finn
- Q&A

Data Center Bridging Related Projects

- 802.1Qau Congestion Notification
  - In draft development
- 802.1Qaz Enhanced Transmission Selection
  - PAR submitted for IEEE 802 approval at this meeting
- Priority-Based Flow Control
  - The Congestion Management task group is developing a PAR

Background: Data Center I/O Consolidation

Manoj Wadekar

Data Center Topology

[Figure: three-tier data center topology (Front End, Mid Tier, Back End) with clients, NAS, SAN, and DAS storage.
- Front End: intra-tier traffic is load balancing, dynamic content creation, and caching; inter-tier traffic is business transactions (web content access; packet processing for load balancing, firewall, proxy, etc.).
- Mid Tier: intra-tier traffic is state preservation; inter-tier traffic is business transactions, database queries, or commits.
- Back End: intra-tier traffic is lock mechanics, data sharing, and consistency; inter-tier traffic is database query/response (DB queries, commits, secured interactions, data integrity).]

Networking characteristics (Front End through Back End):
- Application I/O size: smaller to larger
- Number of connections: many to few

Characteristics of Data Center Market Segments

- Internet Portal Data Centers
  - Internet-based services to consumers and businesses
  - Concentrated industry, with a small number of high-growth customers
- Enterprise Data Centers
  - Business workflow, database, Web 2.0/SOA
  - Large enterprise to medium business
- HPC and Analytics Data Centers
  - Large clusters of 1000s of nodes for HPC (DoD, seismic, ...)
  - Medium clusters of 100s of nodes for analytics (e.g. FSIs, ...)
  - Complex scientific and technical workloads

Data Centers Today

Type: Internet Portal Data Center | Enterprise Data Center | HPCC Data Center

Characteristics:
- Internet Portal: SW-based enterprises, non-mission-critical HW base; 10K to 100K servers. Primary needs: low capital cost, reduced power and cooling, configuration flexibility.
- Enterprise: large desktop client base; 100 to 10K servers. Primary needs: robust RAS, security and QoS control, simplified management solution.
- HPCC: non-mission-critical HW architecture; 100 to 10K servers. Primary needs: low latency, high throughput.

Topology examples:
[Figure: Internet Portal with 100s to 1000s of web servers behind Ethernet routers and switches; Enterprise with presentation logic, business logic, and database tiers (1000s of servers, >100 DB servers, storage) connected by switches and appliances.]
- Fabric preference is low cost, standard high volume.
- Looks nice on the surface, but lots of cables (cost and power) underneath.

HPC Cluster Network Market Overview - Interconnects in Top 500

[Figure: interconnect family of Top500 machines, 1997-2007, showing proprietary interconnects (Cray, HiPPI, IBM SP Switch, Quadrics, Myrinet) giving way to Gigabit/10G Ethernet and InfiniBand; future directions include 20-40 Gb/s IB, InfiniPath, and Myri-10G.]
- Standard networks, Ethernet and InfiniBand (IB), are replacing proprietary networks:
  - IB leads in aggregate performance
  - Ethernet dominates in volume
- Adoption of 10 Gbps Ethernet in the HPC market will likely be fueled by:
  - 10 Gbps NICs on the motherboard, 10GBASE-T/10GBASE-KR
  - Storage convergence over Ethernet (iSCSI, FCoE)
- There will always be a need for special solutions at the high end (e.g. IB, Myrinet)
  - But 10 GigE will also play a role in the HPC market in the future
- Challenges: Ethernet enhancements are needed; IB claims better technology
  - Low latency, traffic differentiation, "no-drop" fabric, multi-path, bi-sectional bandwidth

Data Center: Blade Servers

[Figure: projected blade units (thousands) and blade share of total servers (%), 2006-2012, with the blade attach rate growing from roughly 9% to 31%; each blade chassis carries Ethernet LAN, FC storage, and IB IPC fabrics with their own switches.]

- Blade servers are gaining momentum in the market
  - Second-generation blade servers are in the market
  - Provide server and cable consolidation
  - Ethernet is the default fabric; optional fabrics for SAN and HPC
- Challenges: I/O consolidation is a strong requirement for blade servers
  - Multiple fabrics, mezzanine cards
  - Power/thermal envelope
  - Management complexity
  - Backplane complexity

REF: Total Servers: Worldwide and Regional Server 2007-2012 Forecast, IDC, April 2007; Blade Servers: Worldwide and US Blade Server 2006-2010 Forecast, IDC, October 2006.

Data Center: Storage Trends

Lots of people are using SAN:
- Networked storage is growing 60% per year
- FC is incumbent and growing: entrenched enterprise customers
- iSCSI is taking off: SMB and greenfield deployments, with the choice driven by targets
[Figure: SAN growth in external terabytes for iSCSI and FC, 2005-2010.]

Storage convergence over Ethernet:
- iSCSI for new SAN installations
- FC tunneled over Ethernet (FCoE) for expanding FC SAN installations
(Source: IDC Storage and FC HBA analyst reports, 2006)

Challenges:
- Too many ports, fabrics, cables, power/thermal
- Need to address FC as well as iSCSI

SAN Protocols

- FCoE:
  - FC tunneled through Ethernet (see the encapsulation sketch below)
  - Addresses customer investment in legacy FC storage
  - Expects FC-equivalent "no-drop" behavior in the underlying Ethernet interconnect
  - Needs Ethernet enhancements for link convergence and "no-drop" performance
- iSCSI:
  - SCSI over TCP/IP, which provides reliability
  - High-speed protocol acceleration solutions benefit from reduced packet drops
[Figure: protocol stacks: Applications and SCSI over either iSCSI/TCP/IP or FCP with an FC/FCoE encapsulation layer, both over DC-enhanced Ethernet as the base transport. An FCoE frame carries Ethernet headers, FC headers, FC payload, CRC, and FCS/PAD.]
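Since FCoE was still being defined in T11 when this tutorial was given, the following is only a minimal sketch of the encapsulation idea above: a complete FC frame (headers, payload, CRC) carried inside an Ethernet frame. The EtherType, SOF/EOF delimiters, and header padding used here are assumptions for illustration, not values from any standard.

```python
import struct

# Placeholder constants for illustration only; the real values are defined
# by IEEE/T11, not by this sketch.
ETHERTYPE_FCOE = 0x8906          # assumed FCoE EtherType
SOF_I3, EOF_T = 0x2E, 0x42       # assumed start/end-of-frame delimiters

def encapsulate_fc_frame(dst_mac: bytes, src_mac: bytes, fc_frame: bytes) -> bytes:
    """Wrap a raw Fibre Channel frame (header + payload + CRC) in an
    Ethernet frame, mirroring the "FC tunneled through Ethernet" idea
    on the slide.  Field layout is illustrative only."""
    eth_header = dst_mac + src_mac + struct.pack("!H", ETHERTYPE_FCOE)
    fcoe_header = struct.pack("!B13xB", 0x00, SOF_I3)   # version + reserved + SOF
    fcoe_trailer = struct.pack("!B3x", EOF_T)           # EOF + reserved
    frame = eth_header + fcoe_header + fc_frame + fcoe_trailer
    # Pad to the Ethernet minimum; the MAC appends the FCS (CRC-32) on the wire.
    return frame.ljust(60, b"\x00")
```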

Fabric convergence needs:

[Figure: a server with processor, memory, and separate I/O adapters for IPC, storage, and LAN.]
- Traffic differentiation:
  - LAN, SAN, and IPC traffic needs differentiation in a converged fabric
- Lossless fabric:
  - FC does not have a transport layer; retransmissions are at SCSI!
  - iSCSI acceleration may benefit from a lossless fabric too
- Seamless deployment:
  - Backward compatibility
  - Plug and play
- Ethernet needs these enhancements to be a true converged fabric for the data center

Data Center Bridging

- What is DCB?
  - Data Center Bridging provides Ethernet enhancements for data center needs (storage, IPC, blade servers, etc.)
  - The enhancements apply to bridges as well as end stations
- Why DCB?
  - The DC market demands a converged fabric
  - Ethernet needs enhancements to be the successful converged fabric of choice
- Scope of DCB:
  - Should provide convergence capabilities for data center (short-range) networks

Data Center Bridging: Value Proposition

Fabric convergence
[Figure: servers with multiple dedicated fabrics versus a single converged fabric (DCB).]
- Simpler management: a single physical fabric to manage; simpler to deploy, upgrade, and maintain
- Lower cost: fewer adapters, cables, and switches; lower power/thermals
- Improved RAS: reduced failure points, time, misconnections, bumping

Comparison of convergence levels

[Figure: a server (uP, memory, chip, PCIe) attached to cluster, SAN, and LAN networks, shown for each convergence level.]

                         No convergence (dedicated) | Converged Management | Converged Fabric (DCB)
Number of fabric types   2-3                        | 1                    | 1
IO interference          No                         | No                   | Yes
Technologies managed     3                          | 1 to 2               | 1 to 2
HW cost                  3x adapters                | 3x adapters          | 1x adapter
RAS                      More HW                    | More HW              | Least HW
Cable mess               3-4x                       | 3-4x                 | 1x

Data Center Bridging Summary

DCB provides I/O consolidation:
- Lower CapEx, lower OpEx, unified management

Needs standards-based Ethernet enhancements:
- Support for multiple traffic types and traffic differentiation
- A "no-drop" option for DC applications
- Deterministic network behavior for IPC
- Does not disrupt existing infrastructure
- Should allow "plug-and-play" for enhanced devices
- Maintains backward compatibility for legacy devices

Gap Analysis: Data Center Current Infrastructure

Anoop Ghanwani

Overview

- Data Center Bridging (DCB) requirements
- What do 802.1 and 802.3 already provide?
- What can be done to make bridges more suited for the data center?

Recap of DCB Requirements

- A single physical infrastructure for different traffic types
- Traffic in existing bridged LANs
  - Tolerant to loss; apps that care about loss use TCP
  - QoS achieved by using traffic classes, e.g. voice vs. data traffic
- Data center traffic has different needs
  - Some apps expect lossless transport, e.g. FCoE
  - This requirement cannot be satisfactorily met by existing standards
- Building a converged network
  - Multiple apps with different requirements, e.g.:
    - Voice (loss- and delay-sensitive)
    - Storage (lossless, delay-sensitive)
    - Email (loss-tolerant)
    - ...
  - Assign each app type to a traffic class
  - Satisfy the loss/delay/BW requirements of each traffic class

Relevant 802.1/802.3 Standards

- For traffic isolation and bandwidth sharing
  - Expedited traffic forwarding: 802.1p/Q
    - 8 traffic classes
    - The default transmission selection mechanism is strict priority; others are permitted but not specified
    - Works well for traffic in existing LANs (Control > Voice > Data)
- For achieving lossless behavior
  - Congestion Notification: P802.1Qau [in progress]
    - The goal is to reduce loss due to congestion in the data center
    - Works end to end; both ends must be within the L2 network
    - Needed for apps that run directly over L2 with no native congestion control
  - PAUSE: 802.3x
    - On/off flow control (see the PAUSE-frame sketch below)
    - Operates at the link level
- Can we build a converged data center network using existing standards?
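As a concrete reminder of how coarse 802.3x is, the sketch below builds a PAUSE frame as defined in IEEE 802.3 Annex 31B: a MAC Control frame (EtherType 0x8808) sent to a reserved multicast address with opcode 0x0001 and a single 16-bit timer. The helper itself is just an illustration; the point is that there is no priority field, so pausing stops the entire link.

```python
import struct

PAUSE_DST    = bytes.fromhex("0180C2000001")  # reserved MAC Control multicast address
MAC_CONTROL  = 0x8808                         # MAC Control EtherType
OPCODE_PAUSE = 0x0001                         # PAUSE opcode

def build_pause_frame(src_mac: bytes, pause_quanta: int) -> bytes:
    """Build an 802.3x PAUSE frame.  pause_quanta is in units of 512 bit
    times; 0 means "resume".  There is no per-priority field anywhere,
    which is exactly the limitation this tutorial calls out."""
    body = struct.pack("!HH", OPCODE_PAUSE, pause_quanta)
    frame = PAUSE_DST + src_mac + struct.pack("!H", MAC_CONTROL) + body
    return frame.ljust(60, b"\x00")   # pad to minimum size; MAC adds the FCS
```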

Transmission Selection for DCB

[Figure: traffic types (LACP, STP, OSPF, VoIP, IPC, FCoE, HTTP) sharing one link as separate classes.]
- 802.1p/Q's strict priority is inadequate
  - Potential starvation of lower priorities
  - No minimum BW guarantee or maximum BW limit
  - The operator cannot manage bandwidth (e.g. 1/3 for storage, 1/10 for voice, etc.)
- Transmission selection requirements
  - Allow for minimal interference between traffic classes
    - Congestion-managed traffic will back off during congestion
    - This should not result in non-congestion-managed traffic grabbing all the BW
  - Ideally, a "virtual pipe" for each class (sketched below)
- Benefits regular LAN traffic as well
  - Many proprietary implementations exist
- Need a standardized behavior with a common management framework
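No transmission selection algorithm beyond strict priority was standardized at this point (802.1Qaz was only a proposed PAR), so the sketch below is just one common way to approximate the "virtual pipe per class" behavior asked for above: deficit-weighted round robin over per-class queues. The class names, weights, and quantum are assumptions.

```python
from collections import deque

class WeightedSelector:
    """Deficit-weighted round robin over per-traffic-class queues.  Each
    backlogged class receives roughly weight/sum(weights) of the link,
    approximating a "virtual pipe" per class."""

    def __init__(self, weights):               # e.g. {"storage": 4, "lan": 3, "ipc": 3}
        self.queues  = {cls: deque() for cls in weights}
        self.weights = weights
        self.credits = {cls: 0 for cls in weights}
        self.quantum = 1500                    # bytes of credit per round per weight unit

    def enqueue(self, cls, frame: bytes):
        self.queues[cls].append(frame)

    def dequeue(self):
        """Return the next frame to transmit, or None if all queues are empty."""
        if not any(self.queues.values()):
            return None
        while True:
            for cls, q in self.queues.items():
                if q and self.credits[cls] >= len(q[0]):
                    self.credits[cls] -= len(q[0])    # spend credit, send frame
                    return q.popleft()
            for cls, q in self.queues.items():        # nothing sendable: top up
                if q:                                 # only backlogged classes earn credit
                    self.credits[cls] += self.weights[cls] * self.quantum
```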

Achieving Lossless Transport Using P802.1Qau and 802.3x

CN: Congestion Notification; RL: Rate Limiter; LLFC: Link-Level Flow Control
[Figure: NICs and bridges; a congested bridge sends a CN message back to the ingress end station, which applies a rate limiter; LLFC operates hop by hop.]
- CN: a message is generated and sent to the ingress end station when a bridge experiences congestion
- RL: in response to CN, the ingress node rate-limits the flows that caused the congestion (see the rate-limiter sketch below)
- Simulation has shown that P802.1Qau is effective at reducing loss
- Packet drop is still possible if multiple high-BW sources go active simultaneously
- LLFC is the only way to guarantee zero loss
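A minimal sketch of the reaction point (the "RL" box above), assuming a generic decrease-on-feedback / slow-recovery rule rather than the exact algorithm in the P802.1Qau draft, which was still evolving at the time. The constants are arbitrary.

```python
class ReactionPoint:
    """Toy rate limiter at the ingress end station.  Not the P802.1Qau
    algorithm, just the general shape of one: cut the sending rate sharply
    when a congestion notification arrives, recover slowly otherwise."""

    def __init__(self, link_rate_gbps: float):
        self.link_rate = link_rate_gbps
        self.rate = link_rate_gbps          # current sending rate for the flow

    def on_congestion_notification(self, feedback: float):
        """feedback in (0, 1]: larger means more severe congestion reported
        by the congestion point that generated the CN message."""
        self.rate *= max(0.1, 1.0 - 0.5 * feedback)

    def on_timer(self):
        """Periodic recovery toward full link rate while no CN arrives."""
        self.rate = min(self.link_rate, self.rate + 0.05 * self.link_rate)
```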

More on Link Level Flow Control Using 802.3x

- 802.3x is an on/off mechanism
  - All traffic stops during the "off" phase
- 802.3x does not benefit some traffic
  - Traffic tolerant to loss, e.g. data over TCP
  - Low-BW, high-priority traffic where priority ensures loss is relatively rare, e.g. voice
- 802.3x may be detrimental in some cases
  - Control traffic, e.g. LACP and STP BPDUs
  - It increases latency for interactive traffic
- As a result, most folks turn 802.3x off
- Need priority-based link-level flow control (sketched below)
  - Should only affect traffic that needs it
  - Ability to enable it per priority
  - Not simply 8 x 802.3x PAUSE!
  - Provides a complete solution when used together with P802.1Qau
[Figure: link-level flow control enabled only for the FCoE "pipe", while LACP/STP, VoIP, IPC, and HTTP traffic is unaffected.]
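The PAR for priority-based flow control was still being developed, so the sketch below only illustrates the per-priority XOFF/XON idea: flow control is generated only for priorities that carry loss-sensitive traffic, based on per-priority receive-buffer thresholds. The thresholds, quanta, and send_pfc callback are assumptions, not values or interfaces from any standard.

```python
class PriorityFlowControl:
    """Per-priority XOFF/XON decision at a receive port.  Only priorities
    that carry loss-sensitive traffic (e.g. an FCoE priority) have flow
    control enabled; the rest behave as ordinary, drop-eligible Ethernet."""

    XOFF_BYTES, XON_BYTES = 40_000, 20_000    # illustrative buffer thresholds

    def __init__(self, enabled_priorities):
        self.enabled = set(enabled_priorities)   # e.g. {3} for an FCoE priority
        self.paused  = set()

    def on_buffer_update(self, priority: int, buffered_bytes: int, send_pfc):
        if priority not in self.enabled:
            return                                # lossy class: never paused, may drop
        if buffered_bytes >= self.XOFF_BYTES and priority not in self.paused:
            self.paused.add(priority)
            send_pfc(priority, pause_quanta=0xFFFF)   # ask the peer to stop this priority
        elif buffered_bytes <= self.XON_BYTES and priority in self.paused:
            self.paused.remove(priority)
            send_pfc(priority, pause_quanta=0)        # resume this priority
```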

Summary

- The goal of DCB is to facilitate convergence in the data center
  - Apps with differing requirements on a single infrastructure
- Improvements to existing standards are needed to be successful
  - Flexible, standards-based transmission selection
  - End-to-end congestion management
  - Enhanced link-level flow control
- Networks may contain DCB- and non-DCB-capable devices
  - Discovery/management framework for DCB-specific features
- The technology exists in many implementations today
  - Standardization will promote interoperability and lower costs

Solution Framework: Data Center Bridging

Hugh Barrass

Data Center Bridging Framework

- The solution must be bounded
  - No "leakage" to legacy or non-DCB systems
  - Behavior can be optimized at the "edges"
- Aim for the "no drop" ideal
  - Using reasonable architectures
  - No congestion spreading or latency inflation
- Relies on congestion notification (covered in 802.1Qau) with flow control backup (in discussion)
  - A two-pronged attack is required for corner cases
- Support for multiple service architectures (PAR under consideration)
  - Priority queuing with transmission selection
- Discovery & capability exchange (covered in 802.1Qau and extensions)
  - Forms & defines the cloud; services supported
  - Management views & control

DCB cloud

[Figure: a DCB cloud of DCB bridges and DCB end-stations surrounded by non-DCB bridges and non-DCB end-stations, with edge behavior at the boundary.]
- DCB devices form the cloud; edge behavior applies at its boundary
- DCB service exists for some (but not all) traffic classes
- Transparent pass-through for non-DCB classes

DCB cloud definition allows mixed topology

Introducing DCB devices in key parts of the network offers significant advantages.
- A DCB cloud is formed; only DCB devices are allowed inside
  - A function using LLDP defines the cloud and its operating parameters (see the TLV sketch below)
- If the source, destination, and path all use DCB, behavior is optimal
- At the edge of the cloud, edge devices restrict access to CM classes
  - DCB-specific packets do not "leak out"
- Optimal performance for small, high-bandwidth networks, e.g. a data center core
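The capability-exchange function had not yet been specified. Purely as an illustration, the sketch below packs a hypothetical DCB capability bitmap into an organizationally specific LLDP TLV (type 127, per IEEE 802.1AB); the OUI, subtype, and bit assignments are placeholders, not values from any standard.

```python
import struct

def dcb_capability_tlv(supports_pfc: bool, supports_ets: bool, supports_cn: bool) -> bytes:
    """Encode a hypothetical DCB capability bitmap in an organizationally
    specific LLDP TLV.  The TLV header layout (7-bit type, 9-bit length)
    follows IEEE 802.1AB; everything inside the info string is made up."""
    oui, subtype = b"\x00\x11\x22", 0x01           # placeholder OUI and subtype
    caps = (supports_pfc << 0) | (supports_ets << 1) | (supports_cn << 2)
    info = oui + struct.pack("!BB", subtype, caps)
    type_and_len = (127 << 9) | len(info)          # org-specific TLV type is 127
    return struct.pack("!H", type_and_len) + info
```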

Sample system

[Figure: sample system with one congestion point and one reaction point. A network cloud provides many-to-many connectivity between an edge data source (the reaction point), an edge data destination, and other data sources and destinations that are assumed equivalent; control packets flow from the congestion point back to the reaction point.]

In principle…

Operation of congestion management
- CM is an end-to-end function
  - Data from multiple sources hits a choke point
  - Congestion is detected and the ingress rate is slowed
  - A control loop matches the net ingress rate to the choke point
- Simple cases: 2 or more similar ingress devices, 1 choke point
  - Each end device gets 1/N of the bandwidth
  - Minimal buffer use at the choke point
- More complex cases with asymmetric sources, multiple choke points, etc. (see the fair-share example below)
  - CM is always better than no CM!
- Corner cases still cause problems, particularly for buffering
  - Bandwidth spikes overfill buffers before CM can take control
  - A secondary mechanism is required as a safety net
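For the asymmetric-source case, the steady state a congestion-management loop aims for is typically a max-min fair share of the choke-point bandwidth. The toy calculation below (not part of any DCB draft) shows that target for a single choke point.

```python
def max_min_fair(capacity: float, demands: list) -> list:
    """Water-filling allocation at a single choke point: sources that want
    less than the current fair share keep what they ask for; the rest split
    whatever capacity remains equally."""
    alloc, remaining = [0.0] * len(demands), capacity
    pending = sorted(range(len(demands)), key=lambda i: demands[i])
    while pending:
        share = remaining / len(pending)
        i = pending.pop(0)
        alloc[i] = demands[i] if demands[i] <= share else share
        remaining -= alloc[i]
    return alloc

# Example: a 10 Gb/s choke point with demands of 1, 4, and 8 Gb/s
# yields allocations of 1, 4, and 5 Gb/s respectively.
print(max_min_fair(10.0, [1.0, 4.0, 8.0]))
```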

Priority based flow control

- In corner cases and edge conditions, CM cannot react quickly enough to prevent queue overflow. For certain traffic types this is unacceptable.
[Figure: per-class queues at a port. Control/high-priority traffic does not fill its buffer; CM traffic generates flow control to avoid overflow when its buffer glitches full; best-effort traffic is dropped if its buffer overflows; transmission selection controls egress.]

Multiple service architectures using 802.1p

- The current standard defines only simplistic behavior
  - Strict priority is defined
  - More complex behavior is ubiquitous, but not standardized
- Enhancements are needed to support more sophisticated, converged networks
- Within the defined 8 code points
  - Define grouping & bandwidth sharing (see the configuration sketch below)
  - Queue-draining algorithms defined to allow minimum & maximum bandwidth share, weighted priorities, etc.
- A common definition & management interface allows stable interoperability within and between service groups
[Figure: sophisticated queue distribution; service types sharing bandwidth.]
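A sketch of the kind of grouping and bandwidth-sharing configuration described above: the 8 code points are mapped into service groups, each with a guaranteed share of the link. The group names, field names, and percentages are invented for illustration; 802.1Qaz had not yet defined any of this.

```python
from dataclasses import dataclass

@dataclass
class ServiceGroup:
    """A set of 802.1p code points that share one "virtual pipe".
    Field names and semantics are illustrative only."""
    name: str
    priorities: list            # 802.1p code points, 0..7
    min_bw_percent: int         # guaranteed share of the link
    max_bw_percent: int = 100   # optional cap

def validate(groups):
    """Check that every code point is assigned exactly once and that the
    guaranteed shares are actually deliverable (they sum to at most 100%)."""
    assigned = [p for g in groups for p in g.priorities]
    assert sorted(assigned) == list(range(8)), "each code point must map to exactly one group"
    assert sum(g.min_bw_percent for g in groups) <= 100, "guarantees oversubscribe the link"

groups = [
    ServiceGroup("lan",     [0, 1, 2], min_bw_percent=40),
    ServiceGroup("storage", [3],       min_bw_percent=40),
    ServiceGroup("ipc",     [4, 5],    min_bw_percent=10),
    ServiceGroup("control", [6, 7],    min_bw_percent=10),
]
validate(groups)
```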

DCB Framework

- Congestion Management
  - Persistent congestion: 802.1Qau (approved task force)
  - Transient congestion: priority-based flow control (under discussion)
- Traffic Differentiation
  - Enhanced Transmission Selection: 802.1Qaz (proposed PAR)
- Discovery and Capability Exchange
  - Covered for 802.1Qau
  - Additional enhancements may be needed for other DCB projects

In summary

- A complete data center solution needs the whole framework
  - The components rely on each other
  - Utility is maximized only when all are available
- Fully functional Data Center Bridging solves the problem
  - Allows convergence of the data center network, currently discrete networks per function
  - Eliminates niche application networks
- A high-bandwidth, low-latency, "no drop" network...
- ...alongside the scalability & simplicity of 802.1 bridging
  - Supports rapid growth to meet data center demands
  - Net benefits for users and producers

Challenges and Solutions: DCB

Joe Pelissier

DCB: Challenges

- Congestion spreading:
  - Priority-based flow control causes congestion spreading and throughput melt-down
- Deadlocks:
  - Link-level flow control can lead to deadlocks
- DCB is not required for all applications/products
  - DCB functionality is applicable to DC networks, but...
  - It may become an unnecessary burden for some products
- Compatibility with existing devices
  - Data centers contain applications that tolerate "drop"
  - Interoperability challenges between legacy devices and DCB devices

Requirements of a "Lossless" Fabric

- Many IPC and storage protocols do not provide for low-level recovery of lost frames
  - Recovery is done by a higher-level protocol (e.g. class driver or application)
  - Recovery in certain cases requires 100s to 1000s of ms
- Excessive loss (e.g. loss due to congestion vs. bit errors) may result in link resets, redundant fabric failovers, and severe application disruption
- These protocols therefore require (and currently operate over) a flow control method to be enabled
  - With 802.3x PAUSE, this implies a separate fabric for these protocols, since traditional LAN/WAN traffic is best served without PAUSE
  - These protocols are not "broken"
    - They are optimized for layer 2 data center environments, such as those envisioned for DCB
    - Maintaining this optimization is critical for broad market adoption

Frame Loss Rate is not the Issue

- Many IPC and storage protocols are not "congestion aware"
  - They do not expect frame loss due to congestion
    - The potential for congestion-caused frame loss is highly application sensitive
    - Traffic may be very bursty, so there is a very real potential for frame loss
  - They do not respond appropriately
    - Huge retransmission attempts
    - Congestion collapse
- Storage and IPC applications can tolerate low frame loss rates
  - Bit errors do occur
- Frame loss due to congestion requires different behavior than frame loss due to bit errors
  - Back-off, slow restart, etc. to avoid congestion collapse

Congestion Notification and Frame Loss

- Tremendous effort has been expended in developing a congestion notification scheme for bridged LANs
  - Simulation efforts indicate that these schemes are likely to dramatically reduce frame loss
  - However, frame loss is not sufficiently eliminated
    - Especially under transitory congestion events and in topologies that one would reasonably expect for storage and IPC traffic
  - Congestion notification does reduce the congestion-spreading side effect of flow control
- Therefore, a supplemental flow control mechanism that prevents frame loss is viewed as a requirement for successful deployment of storage and IPC protocols over bridged LANs
  - These protocols operate over a small portion of the network (i.e. the portion to be supported by DCB)
  - A simple method is sufficient

Congestion Spreading

[Figure: a mesh of NICs and flow-controlled bridges illustrating how congestion spreads across hops.]
- Multiple hops in a flow-controlled network can cause congestion that spreads throughout the network
  - A major issue with 802.3x PAUSE
- The effects are mitigated by:
  - Being limited to the flow-controlled DCB region of the network
  - Being limited to traffic that traditionally operates over flow-controlled networks, e.g. IPC and storage
  - Being isolated from, and independent of, traffic that is prone to the negative impacts of congestion spreading
- Priority-based flow control (and selective transmission) create "virtual pipes" on the link
  - Flow control is enabled (or not) on each "pipe" as appropriate for the traffic type

Deadlock (1)

- Flow control in conjunction with MSTP or SPB can cause deadlock
  - See "Requirements Discussion of Link Level Flow Control for Next Generation Ethernet" by Gusat et al., January '07 (au-ZRL-Ethernet-LL-FC-requirements-r03)
- To create deadlock, all of the following conditions must occur (a cycle-detection sketch follows this list):
  - A cyclic flow control dependency exists
  - Traffic flows across all corners of the cycle
  - Sufficient congestion occurs on all links in the cycle simultaneously, such that each bridge is unable to permit more frames to flow
- At this point, all traffic on the affected links halts until frames age out
  - Generally after one second
- Feeder links also experience severe congestion and a probable halt to traffic flow
  - A form of congestion spreading
[Figure: four nodes in a ring, each link stopped by flow control, forming a cyclic dependency.]
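The first precondition listed above, a cyclic flow-control dependency, can be checked mechanically. The sketch below runs plain cycle detection over a directed graph of flow-controlled links; it is a simplification (a real analysis would track dependencies per priority and per buffer), intended only to illustrate why core/edge and fat-tree topologies are safe.

```python
def has_cyclic_dependency(flow_controlled_links):
    """Return True if the directed links on which priority-based flow
    control is enabled form a cycle, e.g. [("A","B"), ("B","C"), ("C","A")].
    Topologies without such a cycle cannot meet the deadlock preconditions."""
    graph = {}
    for src, dst in flow_controlled_links:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GREY, BLACK = 0, 1, 2              # unvisited / on stack / done
    color = {node: WHITE for node in graph}

    def dfs(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY or (color[nxt] == WHITE and dfs(nxt)):
                return True                   # back edge found: cycle exists
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and dfs(node) for node in graph)
```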

Deadlock (2)

- However:
  - The probability of an actual deadlock is very small
  - Deadlock recovers due to the frame discard mandated by the existing Maximum Bridge Transit Delay (clause 7.7.3 in IEEE Std 802.1D-2004)
- Low deadlock probability:
  - Congestion notification renders sufficient congestion at all necessary points in the network highly unlikely
  - Many data center topologies (e.g. core/edge or fat tree) do not contain cyclic flow control dependencies
  - MSTP or SPB routing may be configured to eliminate deadlock
    - Edges would not be routed as an intermediate hop between core bridges
  - Traffic flow frequently is such that deadlock cannot occur
    - E.g. storage traffic generally travels between storage arrays and servers
    - Insufficient traffic passes through certain "corners" of the loop to create deadlock

Deadlock (3)

- The previous assumptions are not without demonstration
  - IPC and storage networks are widely deployed in mission-critical environments
  - Most of these networks are flow controlled
- What happens if a deadlock does occur?
  - Traffic not associated with the class (i.e. at a different priority level) is unaffected
  - Normal bridge frame-lifetime enforcement frees the deadlock
  - Congestion notification kicks in to prevent recurrence
[Figure: the same four-node ring; traffic on the deadlocked priority slows or stops while other priorities continue.]

Summary

- Congestion spreading:
  - Two-pronged solution: end-to-end congestion management using 802.1Qau, plus priority-based flow control
    - 802.1Qau reduces both congestion spreading and packet drop
    - Priority-based flow control provides "no-drop" where required
- Deadlocks:
  - Deadlock has been shown to be a rare event in existing flow-controlled networks
  - The addition of congestion notification further reduces its occurrence
  - Normal Ethernet mechanisms resolve the deadlock
  - Non-flow-controlled traffic classes are unaffected
- CM is not required for all applications/products
  - Work related to the data center should be identified as Data Center Bridging (following "Provider Bridging" and "AV Bridging")
  - This provides a clear message about the usage model
- Compatibility with existing devices
  - A capability exchange protocol and MIBs will be provided for backward compatibility

802.1 Architecture for DCB

Norm Finn

IEEE Std. 802.1Q-2006

Subclause 8.6 The Forwarding Process

IEEE Std. 802.1ag-2007

Subclause 22 CFM in systems


IEEE P802.1Qau/Draft 0.4: Bridges

Subclause 31 Congestion notification entity operation

Proposed extension for PPP (per-priority PAUSE) in Bridges

Subclause TBD – modified queuing entities

RED: PPP generation based on space available in per-port, per-controlled-priority input buffers.

BLUE: PPP reaction, only for BCN-controlled priority queues.

IEEE P802.1Qau/Draft 0.4: Stations

Subclause 31 Congestion notification entity operation

Proposed extension for PPP (per-priority PAUSE) in Stations

Subclause TBD – modified queuing entities

RED: PPP generation based on space available in per-port, per-controlled-priority input buffers.

BLUE: PPP reaction, only for BCN-controlled priority queues.

Thank You! Questions and Answers

- 802.1Qau Congestion Notification
  - In draft development
- 802.1Qaz Enhanced Transmission Selection
  - PAR submitted for IEEE 802 approval at this meeting
- Priority-Based Flow Control
  - The Congestion Management task group is developing a PAR
