Advanced Storage Area Network Design
Edward Mazurek
Technical Lead, Data Center Storage Networking
@TheRealEdMaz
BRKSAN-2883

Agenda

• Introduction
• Technology Overview
• Design Principles
• Storage Fabric Design Considerations
• Data Center SAN Topologies
• Intelligent SAN Services
• Q&A

Introduction

An Era of Massive Data Growth
Creating New Business Imperatives for IT

By 2020:
• 10X Increase in Data Produced (From 4.4T GB to 44T GB)
• 32B IoT Devices (Will be Connected to Internet)
• 40% of Data Will Be “Touched” by Cloud
• 85% of Data for Which Enterprises Will Have Liability and Responsibility

Source: IDC, April 2014: The Digital Universe of Opportunities: Rich Data and Increasing Value of Internet of Things

Evolution of Storage Networking

[Figure labels: Enterprise Apps: OLTP, VDI, etc.; Big Data, Scale-Out NAS; Cloud Storage (Object); Compute Nodes; REST API; Fabric; Block and/or File Arrays]

• Multi-Protocol (FC, FICON, FCIP, FCoE, NAS, iSCSI, HTTP)
• Performance (16G FC, 10GE, 40GE, 100GE)
• Scale (Tens of Thousands P/V Devices, Billions of Objects)
• Operational Simplicity (Automation, Self-Service Provisioning)

Enterprise Flash Drives = More IO
Significantly More IO/s per Drive at Much Lower Response Time

• Drive performance hasn’t changed since 2003 (15K drives)
• Supports new application performance requirements
• Price/performance making SSD more affordable
• Solid state drives dramatically increase IOPS that a given array can support
• Increased IO directly translates to increased throughput

[Figure: response time (msec, 0 to 110) versus IOPS (0 to 45000) for SATA drives, 15K rpm drives, and Enterprise Flash Drives (8 drives each); 100% Random Read Miss 8KB; One Drive per DA Processor, 8 processors]

Technology Overview

Fibre Channel – Foundations

Based on SCSI
• Foundational protocol, forms the basis of an I/O transaction
• Communications are based upon Point to Point
• Storage is accessed at a block-level via SCSI
• High-performance interconnect providing high I/O throughput
• The foundation for all block-based storage connectivity
• Mature - SCSI-1 developed in 1986

[Figure: Host (Initiator) and Disk (Target) joined by a SCSI I/O Channel, shown once for a SCSI READ Operation and once for a SCSI WRITE Operation]

Fibre Channel - Communications

Point-to-point oriented
• Facilitated through device login
N_Port-to-N_Port connection
• Logical node connection point
Flow controlled
• Buffer-to-buffer credits and end-to-end basis
Acknowledged
• For certain classes of traffic, none for others
Multiple connections allowed per device

[Figure: paired Transmitter/Receiver N_ports, shown between Host (Initiator) and Disk (Target) and between Host (Initiator) and SAN (Switch)]

Fibre Channel Addressing

Every Fibre Channel port and node has two 64-bit hard-coded addresses called World Wide Names (WWN)
• NWWN (node) uniquely identifies devices
• PWWN (port) uniquely identifies each port in a device
• Allocated to manufacturer by IEEE
• Coded into each device when manufactured

Switch Name Server maps PWWN to FCID

[Figure: Host, Switch, Disk. Dual Port HBA with WWNs 10:00:00:00:c9:6e:a8:16 and 10:00:00:00:c9:6e:a8:17; a further WWN 50:0a:09:83:9d:53:43:54]

    phx2-9513# show int fc 1/1
    fc1/1 is up
        Hardware is Fibre Channel, SFP is short wave laser
        Port WWN is 20:01:00:05:9b:29:e8:80

WWN format:

    Width     Field                                    Note
    4 bits    Format Identifier                        0002
    12 bits   N-port or F_port Identifier              Port Identifier
    24 bits   IEEE Organizational Unique ID (OUI)      Assigned to each vendor
    24 bits   Locally Assigned Identifier              Vendor-Unique Assignment

Port Initialization – FLOGIs and PLOGIs

Step 1: Fabric Login (FLOGI)
• Determines the presence or absence of a Fabric
• Exchanges Service Parameters with the Fabric
• Switch identifies the WWN in the service parameters of the accept frame and assigns a Fibre Channel ID (FCID)
• Initializes the buffer-to-buffer credits

Step 2: Port Login (PLOGI)
• Required between nodes that want to communicate
• Similar to FLOGI – Transports a PLOGI frame to the destination node port
• In P2P topology (no fabric present), initializes buffer-to-buffer credits

[Figure: Initiator HBA (N_Port) attached to an F_Port of the FC Fabric, switches joined by E_Ports, and a Target; steps numbered 1 to 3]

FC_ID Address Model

• FC_ID address models help speed up FC routing
• Switches assign FC_ID addresses to N_Ports
• Some addresses are reserved for fabric services
• Private loop devices only understand 8-bit address (0x0000xx)
• FL_Port can provide proxy service for public address translation
• Maximum switch domains = 239 (based on standard)

    Address model                        8 Bits          8 Bits   8 Bits
    Switch Topology Model                Switch Domain   Area     Device
    Private Loop Device Address Model    00              00       Arbitrated Loop Physical Address (AL_PA)
    Public Loop Device Address Model     Switch Domain   Area     Arbitrated Loop Physical Address (AL_PA)

FSPF
Fabric Shortest Path First

• Provides routing services within any FC fabric
• Supports multipath routing
• Bases path status on a link state protocol similar to OSPF
• Routes hop by hop, based only on the domain ID
• Runs on E ports or TE ports and provides a loop free topology
• Runs on a per VSAN basis. Connectivity in a given VSAN in a fabric is guaranteed only for the switches configured in that VSAN.
• Uses a topology database to keep track of the state of the links on all switches in the fabric and associates a cost with each link
• Fibre Channel standard ANSI T11 FC-SW2

FSPF

    phx2-5548-3# show fsp database vsan 12

    FSPF Link State Database for VSAN 12 Domain 0x02(2)
    LSR Type                = 1
    Advertising domain ID   = 0x02(2)
    LSR Age                 = 1400
    Number of links         = 4
     NbrDomainId   IfIndex      NbrIfIndex   Link Type   Cost
    ----------------------------------------------------------
     0x01(1)       0x00010101   0x00010003   1           125
     0x01(1)       0x00010100   0x00010002   1           125
     0x03(3)       0x00040000   0x00040000   1           62
     0x03(3)       0x00040001   0x00040001   1           125

    FSPF Link State Database for VSAN 12 Domain 0x03(3)
    LSR Type                = 1
    Advertising domain ID   = 0x03(3)
    LSR Age                 = 1486
    Number of links         = 2
     NbrDomainId   IfIndex      NbrIfIndex   Link Type   Cost
    ----------------------------------------------------------
     0x02(2)       0x00040000   0x00040000   1           62
     0x02(2)       0x00040001   0x00040001   1           125
    phx2-5548-3#

[Figure labels: 5548, 9148-2, D2, D3, ports 2/13 to 2/16 and 1/1 to 1/4, links marked 16G Port-Channel, 16G, and 8G]

Fibre Channel FC-2 Hierarchy

• Multiple exchanges are initiated between initiators (hosts) and targets (disks)
• Each exchange consists of one or more bidirectional sequences
• Each sequence consists of one or more frames
• For the SCSI3 ULP, each exchange maps to a SCSI command

[Figure: hierarchy of Exchange (OX_ID and RX_ID), Sequence (SEQ_ID), and Frame (SEQ_CNT); columns labelled Frame Fields and ULP Information Unit]

What Is FCoE?
It’s Fibre Channel

From a Fibre Channel standpoint it’s
• FC connectivity over a new type of cable called… Ethernet
From an Ethernet standpoint it’s
• Yet another ULP (Upper Layer Protocol) to be transported

    Native FC stack                 FCoE stack
    FC-4 ULP Mapping                FC-4 ULP Mapping
    FC-3 Generic Services           FC-3 Generic Services
    FC-2 Framing & Flow Control     FC-2 Framing & Flow Control
    FC-1 Encoding                   FCoE Logical End Point
    FC-0 Physical Interface         Ethernet Media Access Control
                                    Ethernet Physical Layer

Standards for FCoE

FCoE is fully defined in the FC-BB-5 standard
FCoE works alongside additional technologies to make I/O Consolidation a reality

T11
• FCoE (FC on other network media). Standard: FC-BB-5. Status: technically stable October 2008, completed in June 2009, published in May 2010.

IEEE 802.1 DCB
• PFC (Lossless Ethernet). Standard: 802.1Qbb. Status: Sponsor Ballot July 2010, published Fall 2011.
• ETS (Priority Grouping). Standard: 802.1Qaz. Status: Sponsor Ballot October 2010, published Fall 2011.
• DCBX (Configuration Verification). Standard: 802.1Qaz. Status: Sponsor Ballot October 2010, published Fall 2011.

FCoE Flow Control
IEEE 802.1Qbb Priority Flow Control

• VLAN Tag enables 8 priorities for Ethernet traffic
• PFC enables Flow Control on a Per-Priority basis using PAUSE frames (IEEE 802.1p)
• Receiving device/switch sends Pause frame when receiving buffer passes threshold
• Two types of pause frames
  • Quanta = 65535 = 3.3ms
  • Quanta = 0 = Immediate resume
• Distance support is determined by how much buffer is available to absorb data in flight after Pause frame sent

[Figure: FCoE traffic on an Ethernet Wire, paused for 3.3ms and then resumed]

ETS: Enhanced Transmission Selection
IEEE 802.1Qaz

• Allows you to create priority groups
• Can guarantee bandwidth
• Can assign bandwidth percentages to groups
• Not all priorities need to be used or in groups

[Figure: Ethernet Wire shared between FCoE and other traffic in an 80%/20% split]

FCoE Is Really Two Different Protocols

FIP (FCoE Initialization Protocol)
• It is the control plane protocol
• It is used to discover the FC entities connected to an Ethernet cloud
• It is also used to login to and logout from the FC fabric
• Uses unique BIA on CNA for MAC
• Ethertype 0x8914

FCoE Itself
• Is the data plane protocol
• It is used to carry most of the FC frames and all the SCSI traffic
• Ethertype 0x8906

The Two Protocols Have
• Two different Ethertypes
• Two different frame formats
• Both are defined in FC-BB-5

http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-560403.html

FPMA - Fabric Provided MAC Address
Fibre Channel over Ethernet Addressing Scheme

• FPMA assigned for each FCID
• FPMA composed of a FC-MAP and FCID
• FC-MAP – Mapped Address Prefix
  • The upper 24 bits of the FPMA
• FCID is the lower 24 bits of the FPMA
• FCoE forwarding decisions still made based on Fibre Channel FSPF and the FCID within the FPMA

[Figure: FC Fabric with Domain ID 11 (FCID 11.00.01) and Domain ID 10 (FCID 10.00.01); FCID Addressing; FPMA = FC-MAP (0E-FC-xx) followed by FC-ID 10.00.01]

What