PCI-SIG® Architecture Overview

Richard Solomon, PCI-SIG WG Member, Synopsys

Copyright © 2019 PCI-SIG® - All Rights Reserved

What’s All This PCI Stuff Anyway?

o Presentation will cover basic concepts and their evolution from PCI™ through PCI-X™ to PCI Express®
  • Specs written assuming designers have these key background concepts
  • High level overview of PCI, PCI-X, and PCI Express

PCI-SIG Developers Conference 2019

PCI Background

Revolutionary AND Evolutionary

o PCI™ (1992/1993)
  • Revolutionary
    • Jumperless configuration (BARs)
    • Unprecedented bandwidth
      • 32-bit / 33MHz – 133MB/sec
      • 64-bit / 66MHz – 533MB/sec
    • Designed from day 1 for bus-mastering adapters
  • Evolutionary
    • System BIOS maps devices, then operating systems boot and run without further knowledge of PCI
    • PCI-aware O/S could gain improved functionality
    • PCI 2.1 (1995) doubled bandwidth with 66MHz mode

Revolutionary AND Evolutionary

o PCI-X™ (1999)
  • Revolutionary
    • Unprecedented bandwidth
      • Up to 1066MB/sec with 64-bit / 133MHz
    • Registered bus protocol
      • Eased electrical timing requirements
    • Brought split transactions into PCI “world”
  • Evolutionary
    • PCI compatible at hardware *AND* software levels
    • PCI-X 2.0 (2003) doubled bandwidth
      • 2133MB/sec at PCI-X 266 and 4266MB/sec at PCI-X 533
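The peak figures quoted for PCI and PCI-X follow from bus width times clock rate times transfers per clock. The sketch below reproduces that arithmetic. It assumes nominal clocks of 33.33, 66.67 and 133.33 MHz, and it assumes PCI-X 266 and PCI-X 533 move two and four transfers per 133 MHz clock. The function name `peak_mb_per_sec` is illustrative and not taken from any specification.

```python
def peak_mb_per_sec(width_bits: int, clock_mhz: float, transfers_per_clock: int = 1) -> int:
    """Theoretical peak in MB/s, truncated to a whole number like the slide figures."""
    return int(width_bits / 8 * clock_mhz * transfers_per_clock)


# (bus width in bits, nominal clock in MHz, transfers per clock) -- assumed values
BUSES = {
    "PCI 32-bit / 33MHz": (32, 100 / 3, 1),
    "PCI 64-bit / 66MHz": (64, 200 / 3, 1),
    "PCI-X 64-bit / 133MHz": (64, 400 / 3, 1),
    "PCI-X 266": (64, 400 / 3, 2),
    "PCI-X 533": (64, 400 / 3, 4),
}

if __name__ == "__main__":
    for name, (width, clock, transfers) in BUSES.items():
        print(f"{name}: {peak_mb_per_sec(width, clock, transfers)} MB/sec")
```

These are theoretical peaks only; sustained throughput on a shared bus is lower once arbitration and protocol overhead are counted.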

Revolutionary AND Evolutionary

o PCI Express – aka PCIe® (2002)
  • Revolutionary
    • Unprecedented bandwidth
      • x1: up to 2GB/sec in *EACH* direction (PCIe 4.0)
      • x16: up to 32GB/sec in *EACH* direction (PCIe 4.0)
    • “Relaxed” electricals due to serial bus architecture
      • Point-to-point, low voltage, dual simplex with embedded clocking
  • Evolutionary
    • PCI compatible at software level
      • Configuration space, Power Management, etc.
      • Of course, PCIe-aware O/S can get more functionality
    • Transaction layer familiar to PCI/PCI-X designers
    • System topology matches PCI/PCI-X
    • Doubling of bandwidth each generation (from 250MB/s/lane):
      • PCIe 2.0 (2006) 500MB/s/lane
      • PCIe 3.0 (2010) ~1GB/s/lane
      • PCIe 4.0 (2017) ~2GB/s/lane
      • PCIe 5.0 (2019) ~4GB/s/lane
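The per-generation doubling and the link-width scaling can be written as one line of arithmetic. This sketch uses the nominal 250MB/s/lane starting point from the list above; the function name is hypothetical, and the results are the rounded "~" figures, not exact throughput after encoding overhead.

```python
GEN1_MB_PER_LANE = 250  # nominal PCIe 1.x figure, per lane, per direction


def nominal_mb_per_sec(generation: int, lanes: int = 1) -> int:
    """Nominal bandwidth per direction: doubles each generation, scales with width."""
    if generation < 1 or lanes < 1:
        raise ValueError("generation and lanes must be positive")
    return GEN1_MB_PER_LANE * 2 ** (generation - 1) * lanes


if __name__ == "__main__":
    for gen in range(1, 6):
        print(f"PCIe {gen}.x: x1 = {nominal_mb_per_sec(gen)} MB/s, "
              f"x16 = {nominal_mb_per_sec(gen, 16)} MB/s")
```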

PCI Concepts

Address Spaces – Memory & I/O

o Memory space mapped cleanly to CPU semantics
  • 32 bits of address space initially
  • 64 bits introduced via Dual-Address Cycles (DAC)
    • Extra clock of address time on PCI/PCI-X
    • 4DWORD header in PCI Express
  • Burstable
o I/O space mapped cleanly to CPU semantics
  • 32 bits of address space
    • Actually much larger than CPUs of the time
  • Non-burstable
    • Most PCI implementations didn’t support I/O bursts
    • PCI-X codified this
    • Carries forward to PCI Express
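The cost of 64-bit addressing described above can be sketched as a simple decision on the target address. The function names are hypothetical. The sketch assumes the usual rule that a PCI Express memory request to an address below 4GB uses the 3DWORD header and one above it uses the 4DWORD header, mirroring the single versus dual address cycle on PCI/PCI-X.

```python
FOUR_GB = 1 << 32


def _check(address: int) -> None:
    if not 0 <= address < (1 << 64):
        raise ValueError("address must fit in 64 bits")


def pcie_header_dwords(address: int) -> int:
    """Header size in DWORDs for a PCIe memory request to this address."""
    _check(address)
    return 3 if address < FOUR_GB else 4


def pci_address_clocks(address: int) -> int:
    """Address phases on PCI/PCI-X: 2 when a Dual-Address Cycle is needed."""
    _check(address)
    return 1 if address < FOUR_GB else 2
```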

Address Spaces – Configuration

o Configuration space???
  • Allows control of devices’ address decodes without conflict
  • No conceptual mapping to CPU address space
    • Memory-based access mechanisms in PCI-X and PCIe
  • Bus / Device / Function (aka BDF) form hierarchy-based address (PCIe 3.0 calls this “Routing ID”)
    • “Functions” allow multiple, logically independent agents in one physical device
      • E.g. combination SCSI + device
    • 256 bytes or 4K bytes of configuration space per device
    • PCI/PCI-X bridges form hierarchy
    • PCIe switches form hierarchy
      • Look like PCI-PCI bridges to software
  • “Type 0” and “Type 1” configuration cycles
    • Type 0: to same bus segment
    • Type 1: to another bus segment
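A Bus / Device / Function address packs into 16 bits (8-bit bus, 5-bit device, 3-bit function), and the PCIe memory-based access mechanism (commonly called ECAM, a name not used on the slide) gives each function a 4K-byte window. The sketch below shows both encodings. The helper names are hypothetical, and the offset is relative to a platform-specific base address that is not modelled here.

```python
def routing_id(bus: int, device: int, function: int) -> int:
    """Pack Bus/Device/Function into the 16-bit Routing ID."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("BDF field out of range")
    return (bus << 8) | (device << 3) | function


def ecam_offset(bus: int, device: int, function: int, register: int = 0) -> int:
    """Byte offset of a config register inside a memory-mapped config region.

    Each function owns 4K bytes, so the Routing ID sits above a 12-bit register offset.
    """
    if not 0 <= register < 4096:
        raise ValueError("register offset must be within 4K bytes")
    return (routing_id(bus, device, function) << 12) | register


if __name__ == "__main__":
    print(hex(routing_id(1, 2, 3)), hex(ecam_offset(1, 2, 3, 0x10)))
```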

Configuration Space (cont’d)

[Figure: example bus numbering. Processors and Main Memory sit above two Host/PCI Bridges, each accessed through an Address Port and a Data Port.]

o Host/PCI Bridge: Bus = 0, Subord = 3
  • PCI Bus 0
    • PCI-to-PCI Bridge: Primary = 0, Secondary = 1, Subord = 3
      • PCI Bus 1
        • PCI-to-PCI Bridge: Primary = 1, Secondary = 2, Subord = 2
          • PCI Bus 2
        • PCI-to-PCI Bridge: Primary = 1, Secondary = 3, Subord = 3
          • PCI Bus 3
o Host/PCI Bridge: Bus = 4, Subord = 5
  • PCI Bus 4
    • PCI-to-PCI Bridge: Primary = 4, Secondary = 5, Subord = 5
      • PCI Bus 5
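The bus numbers in this figure are what let each bridge decide what to do with a configuration cycle. The sketch below is a minimal model using the figure's values, not a definitive implementation: a bridge converts the cycle to Type 0 when the target is its secondary bus, forwards it as Type 1 when the target lies between secondary and subordinate, and otherwise ignores it. The `Bridge` type and function names are hypothetical.

```python
from typing import NamedTuple


class Bridge(NamedTuple):
    primary: int
    secondary: int
    subordinate: int


# Values taken from the figure.
HOST_BRIDGES = [(0, 3), (4, 5)]  # (bus, subordinate)
BRIDGES = [Bridge(0, 1, 3), Bridge(1, 2, 2), Bridge(1, 3, 3), Bridge(4, 5, 5)]


def bridge_action(bridge: Bridge, target_bus: int) -> str:
    """What a PCI-to-PCI bridge does with a Type 1 cycle seen on its primary bus."""
    if target_bus == bridge.secondary:
        return "type0"   # target is directly below: convert to Type 0
    if bridge.secondary < target_bus <= bridge.subordinate:
        return "type1"   # target is further down: forward unchanged
    return "ignore"


def buses_crossed(target_bus: int) -> list[int]:
    """Bus segments a configuration cycle traverses to reach target_bus."""
    roots = [bus for bus, subord in HOST_BRIDGES if bus <= target_bus <= subord]
    if not roots:
        raise ValueError(f"no host bridge claims bus {target_bus}")
    path = [roots[0]]
    while path[-1] != target_bus:
        bridge = next(b for b in BRIDGES
                      if b.primary == path[-1]
                      and bridge_action(b, target_bus) != "ignore")
        path.append(bridge.secondary)
    return path


if __name__ == "__main__":
    print(buses_crossed(3))
```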

Using Configuration Space

o Device Identification
  • VendorID: PCI-SIG assigned
  • DeviceID: Vendor self-assigned
  • Subsystem VendorID: PCI-SIG
  • Subsystem DeviceID: Vendor
o Address Decode controls
  • Software reads/writes BARs to determine required size and maps appropriately
  • Memory, I/O, and bus-master enables
o Other bus-oriented controls
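The BAR read/write sizing step can be illustrated with a small simulation. It covers 32-bit BARs only and uses a hypothetical `SimulatedBar` class in place of real configuration accesses. Software writes all 1s, reads the value back, masks off the type bits, and takes the two's complement to get the size. Real software would typically also disable decoding while probing, which is not modelled here.

```python
class SimulatedBar:
    """Toy 32-bit BAR: the low address bits below the size are hardwired to 0."""

    def __init__(self, size: int, flags: int = 0):
        self._mask = ~(size - 1) & 0xFFFFFFFF  # size must be a power of two
        self._flags = flags                     # bit 0 set means I/O space
        self.value = flags

    def write(self, value: int) -> None:
        self.value = (value & self._mask) | self._flags

    def read(self) -> int:
        return self.value


def bar_size(readback: int) -> int:
    """Size in bytes implied by the value read back after writing all 1s."""
    mask = 0xFFFFFFFC if readback & 0x1 else 0xFFFFFFF0  # strip I/O or memory type bits
    address_bits = readback & mask
    if address_bits == 0:
        return 0  # BAR not implemented
    return (~address_bits & 0xFFFFFFFF) + 1


def probe(bar: SimulatedBar) -> int:
    original = bar.read()
    bar.write(0xFFFFFFFF)
    size = bar_size(bar.read())
    bar.write(original)  # restore whatever was programmed before
    return size
```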

Using Configuration Space – Capabilities List

o Linked list
  • Follow the list! Cannot assume fixed location of any given feature in any given device
  • Features defined in their related specs:
    • PCI-X, PCIe, PCI Power Management, Etc…
  • Find consolidated list in PCI Code and ID Assignment Spec

[Figure: capability structure layout]
  Dword 0: bits 31:16 Feature-specific Configuration Registers | bits 15:8 Pointer to Next Capability | bits 7:0 Capability ID
  Dword 1 … Dword n: Feature-specific Configuration Registers
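"Follow the list" can be sketched as a short loop over a 256-byte configuration space image. This is a sketch under stated assumptions: the list head is the Capabilities Pointer at offset 34h, its presence is flagged by bit 4 of the Status register at offset 06h, and the bottom two pointer bits are masked as reserved. `example_config_space` is made-up data for illustration, not a real device dump.

```python
STATUS_OFFSET = 0x06
STATUS_CAP_LIST = 0x10       # Status bit 4: capabilities list present
CAP_POINTER_OFFSET = 0x34


def walk_capabilities(cfg: bytes) -> list[tuple[int, int]]:
    """Return (offset, capability ID) for each entry in the linked list."""
    status = int.from_bytes(cfg[STATUS_OFFSET:STATUS_OFFSET + 2], "little")
    if not status & STATUS_CAP_LIST:
        return []
    found, visited = [], set()
    ptr = cfg[CAP_POINTER_OFFSET] & 0xFC
    while ptr and ptr not in visited:      # 0 ends the list; guard against loops
        visited.add(ptr)
        found.append((ptr, cfg[ptr]))      # byte 0: Capability ID
        ptr = cfg[ptr + 1] & 0xFC          # byte 1: pointer to next capability
    return found


def example_config_space() -> bytes:
    """Made-up image: Power Management (01h) -> MSI (05h) -> PCI Express (10h)."""
    cfg = bytearray(256)
    cfg[STATUS_OFFSET] = STATUS_CAP_LIST
    cfg[CAP_POINTER_OFFSET] = 0x40
    cfg[0x40:0x42] = bytes([0x01, 0x50])
    cfg[0x50:0x52] = bytes([0x05, 0x60])
    cfg[0x60:0x62] = bytes([0x10, 0x00])
    return bytes(cfg)
```

A driver looking for one feature would scan the returned IDs; it never hardcodes the offset.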

Using Configuration Space – Extended Capabilities List

o Linked list – new with PCI Express
  • Follow the list! Cannot assume fixed location of any given feature in any given device
  • First entry in list is *always* at 100h
  • Features defined in PCI Express and related (e.g. MR-IOV, SR-IOV) specifications
  • Consolidated in PCI Code and ID Assignment Spec

[Figure: extended capability structure layout]
  Dword 0: bits 31:20 Pointer to Next Capability | bits 19:16 Version | bits 15:0 Capability ID
  Dword 1 … Dword n: Feature-specific Configuration Registers
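The extended list is walked the same way, but it starts at the fixed offset 100h and uses the 32-bit header layout shown in the figure. The sketch below runs over a 4K-byte configuration space image. Treating an all-zero or all-ones first header as "no extended capabilities" is an assumption of this sketch, and the test data is made up.

```python
import struct

EXT_CAP_START = 0x100


def walk_extended_capabilities(cfg: bytes) -> list[tuple[int, int, int]]:
    """Return (offset, capability ID, version) for each extended capability."""
    found, visited = [], set()
    offset = EXT_CAP_START
    while offset and offset not in visited and offset + 4 <= len(cfg):
        visited.add(offset)
        (header,) = struct.unpack_from("<I", cfg, offset)
        if header in (0, 0xFFFFFFFF):
            break                                   # assumed: nothing implemented here
        cap_id = header & 0xFFFF                    # bits 15:0
        version = (header >> 16) & 0xF              # bits 19:16
        found.append((offset, cap_id, version))
        offset = (header >> 20) & 0xFFC             # bits 31:20, DWORD aligned
    return found


def make_header(cap_id: int, version: int, next_offset: int) -> int:
    """Build a Dword 0 value for test images."""
    return (next_offset << 20) | (version << 16) | cap_id
```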

Interrupts

o PCI introduced INTA#, INTB#, INTC#, INTD# - collectively referred to as INTx
  • Level sensitive
  • Decoupled device from CPU interrupt
  • System controlled INTx to CPU interrupt mapping
  • Configuration registers
    • report A/B/C/D
    • programmed with CPU interrupt number
o PCI Express mimics this via “virtual wire” messages
  • Assert_INTx and Deassert_INTx

What are MSI and MSI-X?

o Memory Write replaces previous interrupt semantics
  • PCI and PCI-X devices stop asserting INTA/B/C/D and PCI Express devices stop sending Assert_INTx messages once MSI or MSI-X mode is enabled
  • MSI uses one address with a variable data value indicating which “vector” is asserting
  • MSI-X uses a table of independent address and data pairs for each “vector”
o NOTE: Boot devices and any device intended for a non-MSI operating system generally must still support the appropriate INTx signaling!
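The difference between the two schemes comes down to how the memory write for a given vector is formed. The sketch below models that, assuming MSI encodes the vector in the low-order bits of the programmed data value and allows a power-of-two vector count up to 32. The addresses and data values in the tests are arbitrary illustrations, and all names are hypothetical.

```python
from typing import NamedTuple


class MemoryWrite(NamedTuple):
    address: int
    data: int


def msi_write(address: int, data: int, enabled_vectors: int, vector: int) -> MemoryWrite:
    """MSI: one address; the vector number replaces the low bits of the data."""
    if enabled_vectors not in (1, 2, 4, 8, 16, 32):
        raise ValueError("MSI vector count must be a power of two up to 32")
    if not 0 <= vector < enabled_vectors:
        raise ValueError("vector not enabled")
    low_bits = enabled_vectors - 1
    return MemoryWrite(address, (data & ~low_bits) | vector)


def msix_write(table: list[MemoryWrite], vector: int) -> MemoryWrite:
    """MSI-X: each vector has its own independent address/data table entry."""
    return table[vector]
```

With MSI-X, software can steer each vector to a different target simply by programming different table entries, which the single-address MSI scheme cannot express.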

PCI-X Explained

What is PCI-X?

o “PCI-X is high-performance backward compatible PCI”
  • PCI-X uses the same PCI architecture
  • PCI-X leverages the same base protocols as PCI
  • PCI-X leverages the same BIOS as PCI
  • PCI-X uses the same connector as PCI
  • PCI-X and PCI products are interoperable
  • PCI-X uses same software driver models as PCI
o PCI-X is faster PCI
  • PCI-X 533 is up to 32 times faster than the original version of PCI
  • PCI-X protocol is more efficient than conventional PCI

PCI 2.x/3.0 vs. PCI-X Mode 1

o Same bus and control signals
o Evolutionary protocol changes
  • New “Attribute” phase for enhanced features
o Clock frequency up to 133 MHz

[Figure: two bus transaction timing diagrams (common clock) over clocks 1 to 12, showing PCI_CLK, AD, C/BE#, FRAME#, IRDY#, TRDY# and DEVSEL#. PCI-X: AD carries ADDRESS, ATTR, then DATA-0 to DATA-5; C/BE# carries BUS CMD, ATTR, then BE#'s-0 to BE#'s-5. Conventional PCI: AD carries ADDRESS, then DATA-0 to DATA-5; C/BE# carries BUS CMD, then BE#'s-0 to BE#'s-5.]

Split Transactions – Background

o PCI commands contained no length
  • Bus allowed disconnects and retries
  • Difficult data management for target device
    • Writes overflow buffers
    • Reads require pre-fetch
      • How much to pre-fetch? When to discard? Prevent stale data?
o PCI commands contained no initiator information
  • No way for target device to begin communication with the initiator
  • Peer-to-peer requires knowledge of system-assigned addresses

Split Transactions

o PCI-X commands added length and Routing ID of initiator
  • Writes: allow target device to allocate buffers
  • Reads: Pre-fetch now deterministic
o PCI-X retains “retry” & “disconnect”, adds “split”
o Telephone analogy
  • Retry: “I’m busy go away”
    • Delayed transactions are complicated
  • Split: “I’ll call you back”
    • Simple
    • More efficient

Efficient PCI-X Protocol

[Figure: two stacked bar charts, “Bandwidth Usage with Conventional PCI Protocols” and “Bandwidth Usage with PCI-X Protocols, included in PCI-X 2.0”. Vertical axes: Bandwidth in MegaBytes/sec and Percent of Total Bandwidth. Horizontal axis: Number of Load Exerciser Cards (1 to 5 for conventional PCI, 1 to 2 for PCI-X). Stacked categories: Idle Time (unused BW), System Overhead (scheduling), Transaction Overhead (addressing and routing), Transaction Data Payload (actual user data).]

The PCI-X protocol is more efficient than traditional PCI.

PCI Express Overview

PCIe Specifications

[Figure: layer stack covered by the Base specification: Software, Transaction, Data Link, Physical, Mechanical]

o Base
  • Electrical
  • Protocol
  • Configuration
o Bridge
o I/O Virtualization
  • Single Root
  • Multi-Root
  • Address Translation Services

Form Factors:
o Card (CEM)
o Mini Card (MiniCEM)
o M.2
o ExpressModule
o Cable
o OCuLink

Integrated starting with PCIe 4.0 Base Specification

PCIe Architecture Features

o PCI Compatibility
  • Configuration and PCI software driver model
  • PCI power management software compatible
o Performance
  • Scalable frequency (2.5-32GT/s)
  • Scalable width (x1, x4, x8, x16)
  • Low latency and highest utilization (Bandwidth/pin)
o Physical Interface
  • Point-to-point, dual-simplex
  • Differential low voltage signaling
  • Embedded clocking
  • Supports connectors, modules, cables
o Protocol
  • Fully packetized split-transaction
  • Credit-based flow control
  • Hierarchical topology support
  • Virtual channel mechanism
o Advanced Capabilities
  • CRC-based data integrity, hot plug, error logging
o Enhanced Configuration Space
  • Extensions and bridges into other architectures

PCIe Speed Evolution

o Introduced at 2.5GT/sec
  • Commonly called 2.5GHz
  • PCI-SIG eventually adopts GigaTransfers per Second (GT/s) terminology
  • 100 MHz reference clock provided
    • Eases synchronization between ends
    • Particularly when Spread Spectrum Clocking is used
    • Optional, but nearly universal in traditional “PC” world
  • 8b/10b encoding used to provide DC balance and reduce “runs” of 0s or 1s which make clock recovery difficult
o Specification Revisions: 1.0, 1.0a, 1.1

PCIe Speed Evolution (2.x)

o Speed doubled to 5GT/sec
  • Reference clock remains at 100 MHz
    • Lower jitter clock sources required vs 2.5GT/sec
    • Generally higher quality clock generation/distribution required
  • 8b/10b encoding continues to be used
o Specification Revisions: 2.0, 2.1
  • Devices choosing to implement a maximum rate of 2.5GT/sec can still be fully 2.x compliant!

PCIe Speed Evolution (3.x)

2 x 5 = 8 ???

o Speed “doubled” over PCIe 2.x 5GT/sec
o 8GT/sec electrical rate
  • 10GT/sec would have required significant cost and complexity in channel, receiver design, etc.
o Reference clock remains at 100 MHz
  • Very similar requirements to 5GT/sec mode
o Specification Revisions: 3.0, 3.1
  • Devices choosing to implement a maximum rate of 2.5GT/sec or 5GT/sec can still be fully 3.x compliant!

PCIe Speed Evolution (3.x)

o 128/130 encoding reduces overhead from the 20% loss of 8b/10b
  • Original plan was scrambling-only for exactly 2x the 5GT/sec bandwidth
    • 5000Mb/sec / (10bits/byte) = 500MB/sec per lane
    • 8000Mb/sec / ( 8bits/byte) = 1000MB/sec per lane
  • Pure 128/130 encoding is ~1.5% loss
o Scrambling replaces DC-offset and run-length reduction functions of 8b/10b
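The "2 x 5 = 8" arithmetic can be checked directly: usable bandwidth is the line rate scaled by the fraction of transmitted bits that carry payload. The sketch below compares 8b/10b at 5GT/sec with scrambling-only and 128/130 at 8GT/sec. Function names are illustrative, and packet and protocol overhead above the encoding layer is ignored.

```python
def lane_mb_per_sec(gt_per_sec: float, payload_bits: int, coded_bits: int) -> float:
    """Payload MB/s per lane per direction for a given line encoding."""
    return gt_per_sec * 1000 * payload_bits / coded_bits / 8


def encoding_loss(payload_bits: int, coded_bits: int) -> float:
    """Fraction of line bits spent on encoding rather than payload."""
    return 1 - payload_bits / coded_bits


if __name__ == "__main__":
    print(f"5GT/s, 8b/10b:           {lane_mb_per_sec(5, 8, 10):.1f} MB/s "
          f"({encoding_loss(8, 10):.1%} loss)")
    print(f"8GT/s, scrambling only:  {lane_mb_per_sec(8, 8, 8):.1f} MB/s")
    print(f"8GT/s, 128/130:          {lane_mb_per_sec(8, 128, 130):.1f} MB/s "
          f"({encoding_loss(128, 130):.1%} loss)")
```

At 8GT/sec the 128/130 result lands within about 1.5% of exactly twice the 500MB/sec figure, which is why the generation is described as "doubled".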

PCIe Speed Evolution (4.0)

o Speed doubled over PCIe 3.x 8GT/s
  • 16GT/s electrical rate
  • Reference clock remains at 100 MHz
  • Continued improvement/tightening of specifications
  • 128/130 Encoding retained
  • Huge document integration: included material from parallel PCI, PCI Power Management, SR-IOV, etc…
o Specification Revisions: 4.0
  • Devices choosing to implement a maximum rate of 2.5GT/s, 5GT/s, or 8GT/s can still be fully 4.0 compliant!

PCIe Speed Evolution (5.0)

o Speed doubled over PCIe 4.0 16GT/s
  • 32GT/s electrical rate
  • Reference clock remains at 100 MHz
  • Continued improvement/tightening of specifications
  • 128/130 Encoding retained
  • 10-bit tags & credit scaling from 4.0 sufficient for 32GT/s
  • Fastest PCIe spec ever – development time and link speed
o Specification Revisions: 5.0
  • Devices choosing to implement a maximum rate of 2.5GT/s, 5GT/s, 8GT/s or 16GT/s can still be fully 5.0 compliant!

PCIe Speed Evolution (6.0)

o Coming soon!

Thank you for attending the PCI-SIG Developers Conference 2019.

For more information please go to www.pcisig.com
