White Paper

Mainframe

Unlock the Full Value of Mainframe Technology with a Switched-FICON Architecture

Technical and Business Reasons for Implementing a Switched-FICON Architecture

Abstract

A networked FICON storage architecture for your mainframe is a long-term, well-documented industry best practice. Networked storage architectures beat direct-attached architectures in terms of RAS, performance, scalability, and long-run costs. This paper explores the technical and business reasons for implementing switched-FICON instead of direct-attached storage infrastructures and explores the advantages of Brocade’s FICON SAN product family, which provides an outstanding, industry-leading solution for mainframe environments.

Executive Summary

Some mainframe organizations might consider implementing an old-style, traditional direct-attached high-performance FICON for IBM Z (zHPF) architecture. However, organizations should ask what they would be giving up by not deploying mainframe over a switched-FICON architecture. Will it actually result in efficiencies and cost savings?

With the many enhancements and improvements in mainframe I/O technology in the past 10 years, a frequently asked question is “Do I need FICON switching technology, or should I go with direct-attached storage?” With up to 320 FICON Express16SA channels supported on an IBM z15, why not just direct-attach the control units? The short answer is that with all of the ongoing I/O improvements, switching technology is needed—now more than ever. In fact, there are more reasons to use switched-FICON than there were to use switched-ESCON. Some of these reasons are purely technical; others are more business-related.

FICON switching directors provide high availability (architected for 99.999% availability) with redundant components and no single points of failure or repair. You can use modern FICON directors to attach Fibre Channel hosts and devices in addition to FICON hosts and devices. With Fibre Channel and FICON intermix mode, both Fibre Channel Protocol (FCP) and FICON upper-level protocols can be supported within the same director when deployed independently by port. Director ports operate in either Fibre Channel or FICON mode. A total cost of ownership (TCO) analysis focused on the long term will clearly demonstrate that the modern mainframe customer is better off implementing a switched-FICON architecture.

Brocade’s unique history of technical development with the IBM Z Systems I/O product team has produced the world’s most advanced mainframe computing and storage systems SAN. Brocade’s technical heritage stretches back to the late 1980s with our creation of channel extension technologies to broaden data centers beyond the “glass house.” Then we revolutionized the classical “computer room” with the invention of the original ESCON SAN of the 1990s, which was leveraged in the 2000s as we facilitated geographically dispersed FICON storage systems. Today, the most compelling innovations to be found in mainframe storage networking technology are the product of the nearly 30-year-long partnership between Brocade and the IBM Z Systems team.

Broadcom Mainframe-FICON-WP100 March 27, 2020 Mainframe White Paper Unlock the Full Value of Mainframe Technology with a Switched-FICON Architecture

As the flash era of this decade disrupts the traditional mainframe storage networking mind-set, Brocade and IBM have released a series of features and functions that are specifically designed and integrated to address the needs and demands of the data center. These technologies leverage the fundamental capabilities of Brocade switching to enhance the world’s most reliable and crucial systems.

Recommendations

Direct-attached FICON storage might appear to be an inexpensive way to take advantage of mainframe technology. However, a closer examination will show why a switched-FICON architecture is a better, more robust design for enterprise data centers than direct-attached FICON.

There are six key technical reasons for connecting storage control units using switched-FICON:
- To replace the I/O Operations (I/O Ops) component of System Automation for z/OS, widely known as ESCON Manager, that has been removed and is out of service as of September 2019.
- To overcome buffer credit limitations on FICON Express 16Gb/s and 8Gb/s channels.
- To build fan-in, fan-out architecture designs for maximizing resource utilization.
- To localize failures for improved availability.
- To increase scalability and enable flexible connectivity for continued growth.
- To leverage new FICON technologies that are available only when switched-FICON is implemented, such as N_Port ID Virtualization (NPIV), Dynamic Channel Path Management (DCM), and FICON CUP Diagnostics.

I/O enhancements to IBM mainframes, such as Dynamic Channel Path Management and IBM Z Discovery and Configuration (zDAC), require a networked storage architecture (utilizing FICON directors) if the end user wishes to take advantage of them.

IBM Z offers unprecedented performance, scalability, and innovative features. To take full advantage of IBM Z, the end user must have an equally capable storage array and FICON director platform for connectivity. Our OEM storage partners combined with Brocade Gen 6 are the ideal combination with IBM Z mainframes, whether it is for a traditional z/OS, Linux, Protocol Intermix Mode (PIM), or private cloud environment. You can be assured that Brocade has the experience and that Gen 6 is the best FICON I/O infrastructure in the industry for mainframe data centers.

Mainframe users should utilize switched-FICON to provide for the proper deployment of I/O in every IBM Z data center that they operate. It is the only way in which they can derive the full value from their FICON I/O investments. Switched-FICON is an industry best practice, and it is in your best interests to deploy it.

Observations

Remember that the last System Automation product version to support I/O Operations (SA z/OS 3.5) went out of service in September 2019. This means that the I/O Operations component of System Automation for z/OS (SA z/OS), which administrators used to manage the FICON I/O switch configuration, make configuration changes, and display status information, is no longer available. With its removal, a new way to provide the management, configuration, and alerting functions is required. Over the years, the IBM Z team and Brocade have developed new functions, built upon industry standards, to completely replace the I/O Operations component in switched-FICON implementations.

Emerging and evolving enterprise critical workloads and higher density virtualization are continuing to push the limits of SAN infrastructures. The Brocade Gen 6 family features industry-leading 32Gb/s performance and an impressive 12.2Tb/s chassis bandwidth to address these next-generation I/O and bandwidth-intensive application requirements. In addition, the Brocade Gen 6 provides unmatched slot-to-slot and port performance, with 1.54Tb/s of bandwidth per slot (port card/blade). And this performance comes in the most energy-efficient FICON director in the industry, using an average of less than 1 watt per Gb/s, which is 15 times more efficient than competitive offerings.

The Brocade Gen 6 family enables high-speed replication and backup solutions over metro or WAN links with native Fibre Channel (10, 16, or 32Gb/s) and optional FCIP (1, 10, and 40GbE) extension support. This is done by integrating this technology via a FICON director blade (SX6) or by deploying a standalone switch (Brocade 7840).

Finally, this is all accomplished with unsurpassed levels of reliability, availability, and serviceability (RAS), based upon more than three decades of Brocade experience in the mainframe space. This experience includes originally defining the FICON standards and authoring or co-authoring many of the FICON patents.

Detailed Investigation

IBM Z delivered High Performance FICON (zHPF) in 2008 on the z10 EC platform and has been strategically investing in this technology ever since the initial delivery. The raw bandwidth of a z15’s FICON Express16SA channel with zHPF is 5.2 times greater than a FICON Express16SA channel using “native” FICON capabilities.

The raw I/O per second (IOPS) capacity of a z15’s FICON Express16SA that utilizes zHPF is 300,000 IOPS, which is 13 times greater than a FICON Express16SA channel that utilizes “native” command mode FICON capabilities. With these improved capabilities, organizations need to transition to zHPF to reap the full benefits of the z15. Within native command mode FICON, the number of open exchanges is limited by the FICON Express feature. FICON Express16SA, FICON Express16S+, FICON Express16S, FICON Express8S, and FICON Express8 allow up to 64 open exchanges. One open exchange (an exchange pair, actually) in command mode is the same as one I/O operation in progress.

Within enhanced zHPF FICON transport mode, one exchange is sent from the channel to the control unit (CU). Then, the same exchange ID is sent back from the control unit to the channel to complete the I/O operation. The maximum number of simultaneous exchanges that a channel can have open with the CU is 750 exchanges on the z15 processor family. The CU sets the maximum number of exchanges in the status area of the transport mode response information unit (IU). The default number is 64, which can be increased or decreased.
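As a rough illustration of why the open-exchange limit matters, Little's law (concurrency = throughput × response time) gives an upper bound on the I/O rate that a channel can sustain at a given response time. The sketch below is a back-of-the-envelope model only, not a description of channel firmware. The 0.5 ms response time and the function name are assumptions made for illustration; the 64 and 750 exchange limits are the figures quoted above.

```python
def max_iops(open_exchanges: int, response_time_s: float) -> float:
    """Upper bound on I/O rate from Little's law: concurrency / response time."""
    if response_time_s <= 0:
        raise ValueError("response time must be positive")
    return open_exchanges / response_time_s


# Hypothetical average response time of 0.5 ms per I/O (an assumption).
RESPONSE_TIME_S = 0.0005

command_mode_ceiling = max_iops(64, RESPONSE_TIME_S)     # command mode limit
transport_mode_ceiling = max_iops(750, RESPONSE_TIME_S)  # zHPF limit on the z15

print(f"command mode:   {command_mode_ceiling:,.0f} IOPS at most")
print(f"transport mode: {transport_mode_ceiling:,.0f} IOPS at most")
```

The result is only a concurrency ceiling. The channel's own processing capacity, such as the 300,000 IOPS figure quoted earlier, still applies on top of it.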

It is only when a director or switch is used between the host and the storage device that the true performance potential inherent in these channel bandwidth and I/O processing gains can be fully exploited.

Switched-FICON Is a Best Practice

In this section, we will explore the details behind each of our switched-FICON recommendations.

Replacing the I/O Operations Component of System Automation for z/OS

Prior to release 4.1, IBM System Automation for z/OS (SA z/OS) contained the I/O Operations (IOOPs) component to manage ESCON and FICON I/O switch configuration, make configuration changes, and display status information. Because of its ESCON functionality, this feature was often called the ESCON Manager. The I/O Operations component has been removed from the latest and subsequent releases of IBM System Automation for z/OS. This host-based feature has been replaced by FICON switching capabilities, which are summarized below. The management of ESCON directors was completely discontinued with no replacement capabilities provided.

On July 15, 2014, the IBM announcement letter “IBM Tivoli System Automation for z/OS V3.5 delivers policy-based automation to help manage the IBM System z platform” stated that the I/O Operations (IOOPs) module became unsupported in SA z/OS V4.1 and would no longer be available once SA z/OS V3.5 went out of support in September 2019. For more details, see the IBM blog from Scott Compton, Dale Riedy, and Harry Yudenfriend titled “Fibre Channel SAN Management for z/OS, IBM Z and DS8000.”

Safe Switching allowed IOOPs in SA z/OS to vary paths online or offline when, because of port manipulation, the path from a channel to a device either became valid or was no longer valid. The IOOPs function of Safe Switching is replaced by Brocade’s Port Decommissioning capability.

For connectivity information, z/OS displays the I/O configuration when the Display Matrix command is used. This allows operators to obtain information about the switches, switch ports, inter-switch links (ISLs), control units, and devices. The Display Matrix command for switches displays state information for each port of a switch, and Display Matrix for a device with the ROUTE keyword reports ISL information.

Regarding I/O subsystem alerts, Fibre Channel switches include a control unit port (CUP) device that allows the fabric to communicate with z/OS via device-specific support code. Link failures that occur in the SAN are reported to z/OS via a link incident report that is presented via an attention message and link incident information.

Bottleneck detection and fenced ports are also surfaced with unsolicited messages from the CUP device. A FICON director reports a Health Summary Check condition when it detects conditions within the fabric that indicate that one or more ports, or routing between ports, may be operating at less than optimal capability. The condition is reported asynchronously using unsolicited alert status along with sense data that provides additional diagnostic information called Health Summary Diagnostic Parameters. The Health Summary Diagnostic Parameters are used in the execution of further diagnostic commands to give the installation detailed information on the disturbance.

Read Diagnostic Parameters (RDP) is a T11.org standard Extended Link Service (ELS) that provides the instrumentation needed for IBM Z to provide enhanced problem determination and fault isolation. RDP instrumentation data, kept at every link in the SAN, can be obtained using this in-band ELS command. With the creation of the RDP capability, supported by z/OS and FICON switching architecture, z/OS can now recognize when inconsistencies in link speeds occur across paths to a storage system and end to end on a single path. When these inconsistencies occur, the I/O subsystem (IOS) component of z/OS issues health check messages warning administrators that performance issues may occur. The RDP instrumentation data for each port within the SAN can be displayed within the Display Matrix command by using the LINKINFO option.
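The kind of consistency check described above can be sketched in a few lines. This is a hypothetical illustration, not the logic that z/OS actually runs: the function name, the path labels, and the data layout (a list of negotiated link speeds per hop) are all assumptions.

```python
from typing import Dict, List


def find_speed_mismatches(paths: Dict[str, List[int]]) -> Dict[str, str]:
    """Flag paths whose link speeds (Gb/s per hop) look inconsistent.

    `paths` maps a path label to the negotiated speed of each link on that
    path, in order from channel to storage port. Each list must be non-empty.
    """
    findings: Dict[str, str] = {}

    # End-to-end check: every link on a single path should run at one speed.
    for name, hop_speeds in paths.items():
        if len(set(hop_speeds)) > 1:
            findings[name] = f"links negotiate different speeds: {hop_speeds}"

    # Cross-path check: compare the slowest link of each path with its peers.
    slowest = {name: min(speeds) for name, speeds in paths.items()}
    if len(set(slowest.values())) > 1:
        fastest = max(slowest.values())
        for name, speed in slowest.items():
            if speed < fastest:
                findings.setdefault(
                    name, f"slower than peer paths ({speed} vs {fastest} Gb/s)"
                )
    return findings


# Hypothetical paths to one storage system; labels and speeds are made up.
example_paths = {"CHPID 40": [16, 16, 16], "CHPID 41": [16, 8, 16]}
for path, reason in find_speed_mismatches(example_paths).items():
    print(f"{path}: {reason}")
```

In this example only the second path is flagged, because one of its links negotiated a lower speed than the others.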

FICON Buffer Credits

When IBM Z introduced the availability of FICON Express16S followed by FICON Express16S+ and later FICON Express16SA channels, one very important change was the number of buffer credits available on each port per FICON Express channel card. While FICON Express4 channels had 200 buffer credits per port on a 4-port FICON Express4 channel card, this changed to 40 buffer credits per port on a FICON Express8 and/or FICON Express8S channel card. Organizations familiar with buffer credits will recall that the number of buffer credits required for a given distance varies directly in a linear relationship with the link speed. In other words, doubling the link speed would double the number of buffer credits required to achieve the same performance at the same distance.

Some organizations might recall the IBM statement on their support for extended distances for FICON: “The FICON Express4 features are intended to be the last features to support extended distance without performance degradation. IBM intends to not offer FICON features with buffer credits for performance at extended distances. Future FICON features are intended to support up to 10 km without performance degradation. Extended distance solutions may include FICON directors or switches (for buffer credit provision) or Dense Wave Division Multiplexers (for buffer credit simulation).”

IBM held true to its statement, and the 40 buffer credits per port on a FICON Express8/FICON Express8S channel card can support up to 10 km of distance for full-frame-size I/Os (2-KB frames). What happens if an organization has I/Os with smaller than full-size frames? The distance supported by the 40 buffer credits would decrease. It is likely that at faster future link speeds, the distance supported will decrease to 5 km or less.

At the time of the introduction of the FICON Express16S channel, the number of buffer credits was increased to 90. The 40 buffer credits per port on a FICON Express8 or FICON Express8S channel card and the 90 buffer credits on the FICON Express16S, FICON Express16S+, and FICON Express16SA channel cards can support up to 10 km of distance for full-frame-size I/Os (2-KB frames). But, again, what happens if organizations have I/Os with smaller than full-size frames?
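The relationship between buffer credits, link speed, frame size, and distance can be approximated with simple arithmetic. The sketch below is an estimate under stated assumptions, not a vendor sizing tool: it assumes light propagates through fiber at about 5 microseconds per km, that an N Gb/s Fibre Channel link moves roughly N × 100 MB/s, and that a full-size frame is 2,148 bytes on the wire. The function names and the smaller frame sizes are illustrative.

```python
import math

# Assumptions (not taken from the paper): approximate propagation delay in
# fiber and nominal Fibre Channel throughput per Gb/s of link speed.
PROPAGATION_S_PER_KM = 5e-6
BYTES_PER_S_PER_GBPS = 100e6
FULL_FRAME_BYTES = 2148  # full-size frame including framing overhead


def frame_time_s(speed_gbps: float, frame_bytes: int) -> float:
    """Time needed to serialize one frame onto the link."""
    return frame_bytes / (speed_gbps * BYTES_PER_S_PER_GBPS)


def credits_required(distance_km: float, speed_gbps: float,
                     frame_bytes: int = FULL_FRAME_BYTES) -> int:
    """Credits needed to keep the link full for one round trip."""
    round_trip_s = 2 * distance_km * PROPAGATION_S_PER_KM
    return math.ceil(round_trip_s / frame_time_s(speed_gbps, frame_bytes))


def max_distance_km(credits: int, speed_gbps: float,
                    frame_bytes: int = FULL_FRAME_BYTES) -> float:
    """Longest link that the given credits can keep full."""
    return credits * frame_time_s(speed_gbps, frame_bytes) / (2 * PROPAGATION_S_PER_KM)


# Credits needed at 10 km roughly double each time the link speed doubles.
for speed in (4, 8, 16, 32):
    print(f"{speed:>2} Gb/s, 10 km, full frames: ~{credits_required(10, speed)} credits")

# Smaller average frames shrink the distance a fixed credit pool can cover.
for frame_bytes in (FULL_FRAME_BYTES, 1024, 512):
    km8 = max_distance_km(40, 8, frame_bytes)
    km16 = max_distance_km(90, 16, frame_bytes)
    print(f"{frame_bytes:>4}-byte frames: 40 credits @ 8 Gb/s ~{km8:.1f} km, "
          f"90 credits @ 16 Gb/s ~{km16:.1f} km")
```

Under these assumptions, full-size frames at 10 km need about 38 credits at 8 Gb/s and about 75 at 16 Gb/s, which is consistent with the 40 and 90 credits cited above. Halving the average frame size roughly halves the distance that the same credits can cover.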

A switched architecture allows organizations to overcome the buffer credit limitations on the FICON channel card. Depending upon the specific model, FICON directors and switches can have many hundreds of buffer credits available per port for long-distance connectivity.

Fan-In, Fan-Out Architecture Designs

Achieving the full utilization and value of resources purchased by an enterprise is a core principle within a good management strategy. In the late 1990s, the open systems world started to implement Fibre Channel SANs to overcome the low utilization of resources, such as storage connections, inherent in a direct-attached storage architecture. SANs addressed this issue through the use of fan-in and fan-out storage network designs. Multiple server host bus adapters (HBAs) could be connected through a Fibre Channel switch to a single storage port, fan-in architecture. A single server HBA could be connected through a Fibre Channel switch to multiple storage ports, fan-out architecture. These same principles apply to a FICON storage network. See Figure 1.

Figure 1: Fan-in is one CHPID to many storage ports, while fan-out is one storage port to many CHPIDs.

As a general rule, FICON Express16S, FICON Express16S+, and FICON Express16SA channels offer different levels of performance, in terms of I/O per second (IOPS) and bandwidth, than the storage ports to which they are connected. Therefore, a direct-attached FICON storage architecture might see very low channel or storage port utilization rates. To overcome this issue, fan-in and fan-out storage network designs are used.

A switched-FICON architecture allows a single channel to fan-in to multiple storage devices via switching, improving overall resource utilization while gaining full value from the expense of these resources. This can be especially valuable if an organization’s environment has newer FICON channels, such as FICON Express16S+ or Express16SA, but older disk or tape technology. Figure 2 illustrates how a single FICON channel can concurrently keep a number of disk volumes running at full-rated speeds. The actual fan-out ratios for connectivity to disk and/or tape will, of course, depend on the specific storage device and control unit; however, it is not unusual to see a FICON Express16S+ or Express16SA channel fan-in from a CHPID to a switch and then on to 5, 6, or more online storage device adapters (a 1:5 or 1:6 or more fan-in ratio). The exact fan-in ratio always depends on the storage system model and host adapter capabilities for IOPS and/or bandwidth. On the other hand, several FICON CHPIDs could be connected through a director or switch to a single storage port to maximize the storage port utilization and increase the overall I/O efficiency and throughput.

Figure 2: Switched-FICON allows one channel to keep multiple disk volumes fully utilized.
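A simple capacity comparison shows where ratios such as 1:5 come from. The following is a minimal sketch with made-up numbers: the 1,600 MB/s channel and 300 MB/s storage adapter figures are hypothetical, and real ratios depend on the measured IOPS and bandwidth of the specific channel and storage system.

```python
import math


def storage_ports_per_channel(channel_mbps: float, port_mbps: float) -> int:
    """Number of storage ports one channel can drive at their full rate."""
    return max(1, math.floor(channel_mbps / port_mbps))


def direct_attach_utilization(channel_mbps: float, port_mbps: float) -> float:
    """Utilization of the faster side when one channel is cabled to one port."""
    return min(channel_mbps, port_mbps) / max(channel_mbps, port_mbps)


# Hypothetical sustained rates in MB/s (assumptions for illustration only).
CHANNEL_MBPS = 1600
STORAGE_PORT_MBPS = 300

ratio = storage_ports_per_channel(CHANNEL_MBPS, STORAGE_PORT_MBPS)
utilization = direct_attach_utilization(CHANNEL_MBPS, STORAGE_PORT_MBPS)
print(f"switched: one channel can feed {ratio} storage ports (1:{ratio})")
print(f"direct-attached: the channel runs at {utilization:.0%} of its capacity")
```

With these example figures, a direct-attached channel would sit mostly idle, while a switch lets the same channel serve five storage ports.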

Keeping Failures Localized

In a direct-attached architecture, a failure anywhere in the path renders both the IBM Z channel interface and the storage adapter port unusable. The failure could be of an entire FICON channel card, a port on the channel card, the cable, the entire storage host adapter card, or an individual port on the storage host adapter card. In other words, a failure on any of these components will affect both the mainframe connection and the storage connection. A direct-attached architecture therefore provides the worst possible reliability, availability, and serviceability for FICON-attached storage.

With a switched-FICON architecture, failures are localized to only the affected FICON channel interface or control unit interface, not both. The nonfailing side remains available, and if the storage side has not failed, other FICON channels can still access that storage adapter port via the switch or director (Figure 3). Switched-FICON failure isolation, combined with fan-in and fan-out architectures, allows for the most robust storage architectures, minimizing downtime and maximizing availability.

Figure 3: A FICON director isolates faults and improves availability.
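The difference in failure scope can be expressed as a small reachability model. This is an illustrative sketch only: the component names are hypothetical, and the switched case assumes that the director itself is healthy and that every channel is permitted to reach every storage port.

```python
from typing import Iterable, List, Set, Tuple


def usable_direct(links: List[Tuple[str, str]],
                  failed: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Direct attach: a cabled pair is usable only if both of its ends are healthy."""
    channels: Set[str] = set()
    ports: Set[str] = set()
    for channel, port in links:
        if channel not in failed and port not in failed:
            channels.add(channel)
            ports.add(port)
    return channels, ports


def usable_switched(channels: Iterable[str], ports: Iterable[str],
                    failed: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Switched: any healthy channel can reach any healthy storage port."""
    healthy_channels = {c for c in channels if c not in failed}
    healthy_ports = {p for p in ports if p not in failed}
    if not healthy_channels or not healthy_ports:
        return set(), set()
    return healthy_channels, healthy_ports


# Hypothetical configuration: two channels, two storage ports, one failed channel.
links = [("CH1", "SP1"), ("CH2", "SP2")]
failed = {"CH1"}

print("direct:  ", usable_direct(links, failed))
print("switched:", usable_switched(["CH1", "CH2"], ["SP1", "SP2"], failed))
```

In the direct-attached case, the failed channel also strands its storage port. In the switched case, that storage port stays reachable from the surviving channel.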

Scalable and Flexible Connectivity

Direct-attached FICON does not easily allow for dynamic growth and scalability, since a single FICON channel card port is tied to a single dedicated storage adapter port. In such an architecture, there is a 1:1 relationship (no fan-in or fan-out), and since there is a finite number of FICON channels available (dependent on the mainframe model or machine type), growth in a mainframe storage environment can pose a problem. What happens if an organization needs more FICON connectivity but has run out of FICON channels (see Table 1)? The only viable alternatives would be to purchase another mainframe to obtain more channels or, more practically, implement a switched-FICON SAN architecture in the enterprise. FICON switching and proper usage of fan-in and fan-out in the storage architecture design will go a long way toward improving scalability and reducing costs for FICON Express channel cards and storage adapter ports.

Table 1: By Mainframe Model, the Maximum Number of Each Type of FICON Express Connection

Mainframe   Number of FICON Channels Allowed on This Mainframe Model
z800        32 FICON Express
z900        96 FICON Express
z890        40 FICON Express / 80 FICON Express 2
z990        120 FICON Express / 240 FICON Express 2
z9BC        112 FICON Express 4
z9EC        336 FICON Express 4
z10EC       336 FICON Express 8
z114        64 FICON Express 8
z196        336 FICON Express 8S
zBC12       128 FICON Express 8S
zEC12       320 FICON Express 8S
z13         320 FICON Express 8S / 320 FICON Express 16S
z14         320 FICON Express 8S (carried forward) / 320 FICON Express 16S / 320 FICON Express 16S+
z15         320 FICON Express 16S (carried forward) / 320 FICON Express 16S+ (carried forward) / 320 FICON Express 16SA
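The channel limits in Table 1 turn into a hard ceiling under direct attachment, because every storage port consumes one channel. The sketch below checks a planned configuration against that ceiling. The channel maximums are taken from Table 1; the 400-port requirement, the 1:5 ratio, and the function names are hypothetical.

```python
import math

# Maximum FICON channels for recent models, as listed in Table 1.
MAX_FICON_CHANNELS = {"z13": 320, "z14": 320, "z15": 320}


def channels_needed(storage_ports: int, ports_per_channel: int = 1) -> int:
    """Channels required when each channel serves `ports_per_channel` storage ports."""
    return math.ceil(storage_ports / ports_per_channel)


def fits(model: str, storage_ports: int, ports_per_channel: int = 1) -> bool:
    """True if the configuration stays within the model's channel limit."""
    return channels_needed(storage_ports, ports_per_channel) <= MAX_FICON_CHANNELS[model]


# Hypothetical requirement: 400 storage ports attached to one z15.
print("direct attach (1:1):", fits("z15", 400))
print("switched at 1:5:    ", fits("z15", 400, ports_per_channel=5))
```

In this example the direct-attached design exceeds the 320-channel limit, while the switched design needs only 80 channels.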

In addition, best-practice storage architecture designs include room for growth. With a switched-FICON architecture, adding a new storage system or port in a storage system is much easier to accomplish: Simply connect the new storage system or port to an available FICON director port. This eliminates the need to open the PCIe I/O channel drawer in the mainframe to add new channel interfaces, reducing potential human-caused outages and reducing capital and operational costs. This also gives managers more flexible planning options when upgrades are necessary, since the urgency of obtaining potential downtime for implementing the upgrades is lessened.

What about the next generation of channels? The bandwidth capabilities of channels are growing at a much faster rate than those of storage devices. As channel speeds increase, switches (already faster than channels) will allow data center managers to take advantage of new technology as it becomes available, while protecting their current investments and minimizing costs.

As a reminder, it is an IBM best-practice recommendation to use single-mode long-wave SFP and cable connections for FICON channels. Storage vendors, however, often offer both single-mode long-wave connections and multi-mode short-wave connections on their storage systems, allowing organizations to decide which to use. The organization makes the decision based on the trade-off between cost and reliability. Some organizations’ existing storage devices have a mix of single-mode and multi-mode connections. Since they cannot directly connect a single-mode FICON channel to a multi-mode storage host adapter, this could pose a problem. With a FICON director or switch in the path, however, organizations do not need to change the storage host adapter ports to comply with the single-mode best-practice recommendation for the FICON channels. The FICON switching device can have both types of connectivity. For example, it can have single-mode long-wave ports for attaching the FICON CHPIDs and multi-mode short-wave ports for attaching to the storage ports.

Furthermore, FICON switching elements at two different locations can be interconnected by fiber at distances up to 100 km or more, creating a cascaded FICON-switched architecture. This setup is typically used in disaster recovery and business continuance architectures. As previously discussed, FICON switching allows resources to be shared. With cascaded FICON switching, those resources can be shared between geographically separated locations, allowing data to be replicated or tape backups to be made at the alternate site, away from the primary site, with no performance loss. Often, workloads will be distributed such that both the local and remote sites are primary production sites, and each site uses the other as its backup.

While the fiber itself is relatively inexpensive, laying new fiber may require an expensive construction project. While dense wavelength division multiplexing (DWDM) can help get more out of fiber connections, inter-switch links with up to 32Gb/s of bandwidth are offered by switch vendors and can reduce the cost of DWDM or even eliminate the need for it. FICON switches maximize utilization of this valuable inter-site fiber by allowing multiple environments to share the same fiber link. In addition, Brocade FICON switching devices offer unique storage network management features, such as hardware-based ISL trunking and FICON Dynamic Routing (FIDR), that are very effective and efficient data transport technologies.

FICON switches allow data center managers to further exploit inter-site fiber sharing by enabling them to intermix FICON and native Fibre Channel Protocol (FCP) traffic, which is known as Protocol Intermix Mode, or PIM, as shown in Figure 4. Even in data centers where there is enough fiber to separate FICON and open systems traffic, preferred pathing features (for example, isolating FICON I/O from FCP I/O so that they do not use the same ISL links) on a FICON switch can be a great cost saver. With ISL preferred paths established, certain cross-site fiber can be allocated for FICON traffic while other fiber can be allocated for open systems FCP traffic. The ISLs can be configured such that in the event of a failure, and only in the event of an ISL failure, the links would be shared by both open systems and mainframe traffic.

Figure 4: Protocol Intermix Mode (PIM)
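The preferred-path behavior described above, where each protocol keeps its own ISL and shares a link only after a failure, can be sketched as a small selection function. This is a conceptual model, not switch configuration syntax: the ISL names, the protocol-to-ISL mapping, and the function are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical preferred-path assignment: one ISL per protocol.
PREFERRED_ISL = {"FICON": "ISL-1", "FCP": "ISL-2"}


def select_isl(protocol: str, isl_up: Dict[str, bool]) -> Optional[str]:
    """Return the ISL that traffic of the given protocol should use.

    Traffic stays on its preferred ISL while that link is up. Only when the
    preferred ISL has failed does it share a surviving ISL.
    """
    preferred = PREFERRED_ISL[protocol]
    if isl_up.get(preferred, False):
        return preferred
    for isl, up in isl_up.items():
        if up:
            return isl
    return None  # no cross-site connectivity left


# Normal operation: the two protocols are isolated from each other.
healthy = {"ISL-1": True, "ISL-2": True}
print(select_isl("FICON", healthy), select_isl("FCP", healthy))

# ISL-1 fails: FICON traffic moves onto the ISL that FCP already uses.
degraded = {"ISL-1": False, "ISL-2": True}
print(select_isl("FICON", degraded), select_isl("FCP", degraded))
```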

Leveraging New Technologies

Over the past decade, IBM has announced a series of technology enhancements that require the use of switched-FICON. These include the following:
- N_Port ID Virtualization (NPIV) support for IBM Z Linux
- FICON CUP Diagnostics
- Dynamic Channel Path Management (DCM)
- z/OS FICON Discovery and Auto-Configuration (zDAC)

N_Port ID Virtualization

NPIV (see Figure 6) allows for full support of LUN masking and zoning by virtualizing the Fibre Channel identifiers. IBM announced support for NPIV on Z Linux in 2005. Today, NPIV is supported on the IBM z9 and up through the z15. Until NPIV was supported on IBM Z, adoption of Linux on IBM Z had been relatively slow. Now NPIV allows each Linux guest on IBM Z to appear as if it had its own individual HBA to access its SCSI data when in reality those images are sharing the FCP channels, which are each carrying interleaved I/O from multiple Linux guests. Since IBM began supporting NPIV on IBM Z, adoption of Linux on IBM Z has grown significantly. It has been reported in the press that IBM workload specialty engines, including the Linux IFL, represented as much as one-half of the MIPS shipped, a further sign that Linux on IBM Z is finally gaining real traction. Implementation of NPIV on IBM Z requires a switched architecture.

FICON CUP Diagnostics

IBM, in a cooperative effort with Brocade, defined a new architectural component to be implemented in the Brocade Fabric OS (FOS), FICON Management Server (FMS) code and known as the “CUP Diagnostics” function. It enables an IBM Z host to receive direct alerts from the FICON directors about fabric problems and then to proactively gather additional, detailed information about the fabric (for example, components, topologies, ISL routing, and fabric problems). This function is available only in a switched-FICON SAN infrastructure.

When FICON users install and enable the optional Brocade FMS license on a FICON switch, they will be able to utilize these new switch control unit port (CUP) enhancements to provide a more robust fabric environment.

Brocade’s initial implementation of CUP Diagnostics was released in 2014 in Fabric OS® version 7.2 and has since been enhanced and improved. The user interface to make effective use of Brocade CUP Diagnostics is often the ROUTE and HEALTH keyword enhancements to the z/OS Display Matrix operator command.

CUP Diagnostics, sometimes called Switch (CUP) Diagnostics, provides new fabric-wide analytical command channel programs to enable z/OS to obtain fabric topology, collect diagnostic information such as performance data, determine the health of the fabric, generate a diagnostic log on the switch, and help users resolve problems in the fabric.

A FICON switch can provide a proactive, unsolicited unit check, called a Health Summary unit check, that indicates that one or more switches or ports in the fabric are operating at less than optimal capability. This will trigger z/OS to retrieve the diagnostic information from the switch to further diagnose the problem. The sense data will identify the source and destination ports to be used for the query.

A switch enabled for CUP Diagnostics signals to z/OS that there is a health problem. z/OS initiates monitoring for the path, requesting diagnostic data from the switch at regular intervals. The problem may require intervention such as additional z/OS system commands or I/O configuration changes. Once no errors or health issues have been detected by the switch for at least 2 hours, the monitoring of the route is stopped and will no longer appear in the report.
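The monitoring life cycle described above can be modeled as a simple timer. The sketch below is an assumption-laden illustration, not the z/OS implementation: the class and method names are hypothetical, and only the 2-hour quiet period comes from the text.

```python
from datetime import datetime, timedelta
from typing import Optional

QUIET_PERIOD = timedelta(hours=2)  # from the text: at least 2 hours without issues


class RouteMonitor:
    """Tracks whether a route is still being monitored after a health alert."""

    def __init__(self) -> None:
        self.last_issue: Optional[datetime] = None

    def record_health_issue(self, when: datetime) -> None:
        """Called whenever the switch reports an error or health problem."""
        self.last_issue = when

    def is_monitored(self, now: datetime) -> bool:
        """Monitoring continues until the route has been quiet for 2 hours."""
        if self.last_issue is None:
            return False
        return now - self.last_issue < QUIET_PERIOD


# Example with a made-up timestamp.
alert_time = datetime(2020, 3, 27, 8, 0)
monitor = RouteMonitor()
monitor.record_health_issue(alert_time)
print(monitor.is_monitored(alert_time + timedelta(minutes=90)))  # still monitored
print(monitor.is_monitored(alert_time + timedelta(hours=2)))     # monitoring stopped
```

Each new health issue resets the timer, so a route with recurring problems remains in the report until it has been clean for the full quiet period.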

IBM Health Checker for z/OS reports if any switches that support CUP Diagnostics capabilities have indicated unusual health conditions on connected channel paths.

While z/OS has historically provided sophisticated I/O monitoring and recovery, these IBM Health Checker for z/OS reports will expose recently available diagnostic data provided directly from the FICON switch. These health checks will enable SAN administrators to easily and quickly gain additional insight into their enterprises’ fabric problems such as hardware errors, I/O misconfigurations, or fabric congestion.

Dynamic Channel Path Management

Dynamic Channel Path Management (DCM), as shown in Figure 5, is another feature that requires a switched-FICON architecture. DCM provides the ability to have IBM Z automatically manage switched-FICON I/O paths connected to storage systems in response to changing workload demands. Use of DCM helps simplify I/O configuration planning and definition, reduces the complexity of managing I/O, dynamically balances I/O channel resources, and enhances availability. DCM can best be summarized as a feature that allows for more flexible channel configurations, by designating channels as “managed,” and proactive performance management. DCM requires a switched-FICON architecture because topology information is communicated via the switch or director. The FICON switch must have the control unit port (CUP) active and be configured or defined as a control unit (CU) in the hardware configuration definition (HCD).

Figure 5: Dynamic Channel Path Management

z/OS FICON Discovery and Auto-Configuration

Released as a new capability of the IBM z196 in 2011, the z/OS FICON Discovery and Auto-Configuration (zDAC) feature is an I/O-related technology that simplifies the definition and management of I/O configurations for mainframe enterprises. This I/O technology provides the autonomic management of I/O requests to meet the workload goals specified by the enterprise policy, dynamically altering the logical I/O configuration to adjust the bandwidth capacity required. zDAC automatically reacts to hardware and firmware failures to adjust the configuration to maintain optimal availability characteristics, and it automatically identifies single points of failure from the IBM Z host through the I/O switching fabric to the storage subsystems. zDAC is a significant technology enhancement for FICON. IBM introduced zDAC as a follow-on to an earlier enhancement in which the FICON channels log in to the Fibre Channel name server on a FICON director. For zDAC to provide automatic discovery and configuration of FICON-attached disk and tape devices, it requires a switched-FICON architecture.

Essentially, zDAC automates a portion of the hardware configuration definition (HCD) Sysgen process. zDAC uses intelligent analysis to help validate the compatibility of the IBM Z and storage definitions, and it uses built-in best practices to help configure for high availability and avoid single points of failure. zDAC is transparent to existing configurations and settings. It is invoked from, and integrated with, z/OS HCD and the z/OS hardware configuration manager (HCM). Like DCM, zDAC requires a switched-FICON architecture.

zDAC has improved processing of device number-constrained configurations and those with constrained unit addresses for specific channels. It also provides the SAN administrator with the capability to specify switch and CHPID maps to guide path selection and to discover directly attached devices in addition to those connected to a switch. zDAC provides toleration of inactive or incapable systems identified in an LPAR group, and its discovery policies allow administrators to forgo automatic device numbering so that they can provide their own device numbers. In addition, the policy refresh capability allows some policy options to be dynamically refreshed without requiring a new fabric discovery.

Business Reasons to Support a Switched-FICON Architecture

In addition to the technical reasons described earlier, the following business reasons support implementing a switched-FICON architecture:
• To enable massive consolidation in order to reduce capital and operating expenses.
• To improve application performance over long distances.
• To support growth and enable effective resource sharing.

Massive Consolidation

With NPIV support on IBM Z, server and I/O consolidation is very compelling (see Figure 6). IBM undertook a well-publicized project at its internal data centers (Project Big Green) and consolidated 3900 open systems servers onto 30 IBM Z mainframes running Linux. IBM’s total cost of ownership (TCO) savings was calculated, taking into account footprint reductions, power and cooling, and management simplification costs. The result was a nearly 80% TCO savings for a 5-year period. This scale of TCO savings is why so many new IBM mainframe processor MIPS worldwide are now being used for Linux.

Implementation of NPIV requires connectivity from the FICON (FCP) channel to a switching device (director or smaller port-count switch) that supports NPIV. A special microcode load is installed on the FICON channel to enable it to function as an FCP channel. NPIV allows the theoretical consolidation of up to 255 IBM Z Linux images (“servers”) behind each FCP channel, using one port on a channel card and one port on the attached switching device for connecting these virtual servers. This enables massive consolidation of many HBAs, each attached to its own switch port in the SAN.

What IBM currently supports is up to 64 Linux images per FCP channel and 9,000 total images per mainframe host. Although this level of I/O consolidation was possible prior to NPIV support on IBM Z, implementing LUN masking and zoning in the same manner as with open systems servers/SAN/storage was not possible prior to the support for NPIV with Linux on IBM Z.
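
To make the consolidation arithmetic concrete, the sketch below estimates how many FCP channels (and therefore switch ports) a given number of Linux images would need at the per-channel limits quoted above. The function name and the `redundant_paths` parameter are illustrative assumptions, and the 130-image example is simply the Project Big Green average of 3900 servers across 30 mainframes.

```python
import math

SUPPORTED_IMAGES_PER_CHANNEL = 64     # supported limit quoted in the text
THEORETICAL_IMAGES_PER_CHANNEL = 255  # theoretical NPIV limit quoted in the text


def fcp_channels_needed(linux_images: int,
                        images_per_channel: int = SUPPORTED_IMAGES_PER_CHANNEL,
                        redundant_paths: int = 1) -> int:
    """FCP channels needed when images share channels through NPIV.

    redundant_paths is a hypothetical knob for multipathing: each image
    is assumed to need that many independent channels.
    """
    per_path = math.ceil(linux_images / images_per_channel)
    return per_path * redundant_paths


# Average from the example above: 3900 servers consolidated onto 30 mainframes.
images_per_mainframe = 3900 // 30
single_path = fcp_channels_needed(images_per_mainframe)
dual_path = fcp_channels_needed(images_per_mainframe, redundant_paths=2)
print(images_per_mainframe, single_path, dual_path)
```

Under these assumptions, 130 images fit behind three FCP channels (six with dual paths), compared with at least 130 HBAs and switch ports if each server were attached individually.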

NPIV implementation on IBM Z has also resulted in consolidation and adoption of a common SAN for distributed or open systems (FCP) and mainframe (FICON), commonly known as Protocol Intermix Mode (PIM). While IBM has supported PIM in IBM Z environments since 2003, adoption rates were low until NPIV implementations for Linux on IBM Z picked up with the introduction of the IBM z10 in 2008. With the IBM z10, enhanced segregation and security beyond simple zoning were possible through switch partitioning or virtual fabrics and logical switches.

Leveraging enhancements in switching technology, performance, and management, PIM users can now fully populate the latest high-density directors with no bandwidth oversubscription. They can use management capabilities such as virtual fabrics or logical switches to fully isolate open systems ports and FICON ports in the same physical director chassis. Rather than having more partially populated switching platforms that are dedicated to either mainframe (FICON) or open systems (FCP), PIM allows for consolidation onto fewer physical switching devices, reducing management complexity and improving resource utilization. This in turn leads to lower operating costs and a lower TCO for the storage network. It also allows for a consolidated, simplified cabling infrastructure.

Figure 6: Organizations implement NPIV to consolidate I/O in z Linux environments.

Application Performance over Distance

As previously discussed, the number of buffer credits per port on a FICON Express16S+ and/or FICON Express16SA channel is set to 90, supporting up to 10 km of distance, without performance degradation, if all of the frames are full size. What happens if organizations need to go beyond 10 km for a direct-attached storage configuration? They will likely see performance degradation due to insufficient buffer credits. Without a sufficient number of buffer credits, the “data pipe” cannot be kept full with streaming frames of data.

Switched-FICON avoids this problem as seen in Figure 7. FICON directors and switches have a sufficient number of buffer credits available on ports to allow them to stream frames at full-line performance rates with no bandwidth degradation. Each Gen 6 switch port provides more than 10,000 buffer credits that can be utilized for frame flow on that link. IT organizations that implement a cascaded FICON configuration between sites can, with the latest FICON director platforms, stream frames at 32Gb/s rates with no performance degradation for sites that are as far as 100 km apart.

Figure 7: Limited numbers of buffer credits on the host and storage can be overcome by the many buffer credits that are available on each port of a Gen 6 switching device.
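
The relationship between buffer credits, link speed, and distance can be approximated from first principles: to keep the link streaming, the sender needs enough credits to cover every full-size frame in flight during one round trip. The sketch below is a rough estimate, not a vendor sizing tool. It assumes about 5 microseconds of one-way delay per kilometer of fiber, a full-size frame of 2148 bytes including overhead, and nominal one-way data rates of 1600MB/s for 16Gb/s links and 3200MB/s for 32Gb/s links. All function names are illustrative.

```python
import math

FIBER_DELAY_US_PER_KM = 5.0  # assumption: one-way propagation delay in fiber
FULL_FRAME_BYTES = 2148      # assumption: full-size frame including overhead


def frame_time_us(throughput_mb_s: float,
                  frame_bytes: int = FULL_FRAME_BYTES) -> float:
    """Time to serialize one frame; 1 MB/s equals 1 byte per microsecond."""
    return frame_bytes / throughput_mb_s


def credits_needed(distance_km: float, throughput_mb_s: float,
                   frame_bytes: int = FULL_FRAME_BYTES) -> int:
    """Credits required to keep the link full over one round trip."""
    round_trip_us = 2 * distance_km * FIBER_DELAY_US_PER_KM
    return math.ceil(round_trip_us / frame_time_us(throughput_mb_s, frame_bytes))


def max_distance_km(credits: int, throughput_mb_s: float,
                    frame_bytes: int = FULL_FRAME_BYTES) -> float:
    """Longest distance a given credit count can keep streaming."""
    return credits * frame_time_us(throughput_mb_s, frame_bytes) / (
        2 * FIBER_DELAY_US_PER_KM)


print(credits_needed(10, 1600))    # 16Gb/s channel at 10 km
print(max_distance_km(90, 1600))   # reach of 90 credits at 16Gb/s
print(credits_needed(100, 3200))   # 32Gb/s cascaded link at 100 km
```

With these assumptions, a 16Gb/s link needs about 75 credits at 10 km, so 90 credits cover roughly 12 km of full-size frames, while a 32Gb/s link at 100 km needs about 1,490 credits, well within the more than 10,000 available per Gen 6 switch port. Smaller average frame sizes raise the credit requirement proportionally.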

Switched-FICON technology also allows organizations to take advantage of hardware-based Fibre Channel over IP (FCIP) protocol acceleration or emulation techniques for tape (reads and writes), as well as with zGM (z/OS Global Mirror, formerly known as XRC or Extended Remote Copy). This emulation technology, as seen in Figure 8, is available on standalone FCIP extension switches or on an FCIP blade in FICON directors. It allows the z/OS-initiated channel programs to be acknowledged locally at each site and avoids the back-and-forth protocol handshakes that normally travel between remote sites. It also reduces the impact of latency on application performance and delivers local-like performance over unlimited distances. In addition, this acceleration or emulation technology optimizes bandwidth utilization.

Why is bandwidth efficiency so important? It is typically the most expensive budget component in an organization’s multisite disaster recovery or business continuity architecture. Anything that can be done to improve the bandwidth utilization and/or reduce the bandwidth requirements between sites would likely lead to significant TCO savings.

Figure 8: Switched-FICON with emulation allows optimized performance and bandwidth utilization over extended distance.
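
The benefit of acknowledging channel programs locally can be illustrated with simple propagation-delay arithmetic. The sketch below is a simplified model: it assumes about 5 microseconds of one-way delay per kilometer of fiber, ignores equipment and IP network latency, and treats the number of protocol round trips per I/O as a hypothetical input rather than a measured property of FICON or FCIP.

```python
FIBER_DELAY_US_PER_KM = 5.0  # assumption: one-way propagation delay in fiber


def round_trip_ms(distance_km: float) -> float:
    """Propagation-only round-trip time between two sites."""
    return 2 * distance_km * FIBER_DELAY_US_PER_KM / 1000.0


def wan_wait_ms(distance_km: float, round_trips_per_io: int) -> float:
    """Time one I/O spends waiting on cross-site handshakes."""
    return round_trips_per_io * round_trip_ms(distance_km)


# Hypothetical example: sites 1000 km apart, four handshakes without
# emulation versus one cross-site exchange when the rest are handled locally.
without_emulation = wan_wait_ms(1000, 4)
with_emulation = wan_wait_ms(1000, 1)
print(without_emulation, with_emulation)
```

In this hypothetical case, the wait per I/O falls from 40 ms to 10 ms. The actual reduction depends on the channel program and on how many exchanges the emulation can satisfy locally.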

Enabling Growth and Resource Sharing

Direct-attached storage forces a 1:1 relationship between host connectivity and storage connectivity. In other words, each storage port on a storage system host adapter requires its own physical port connection on a FICON Express channel card. These channel cards are typically very expensive on a per-port basis—typically 4 to 6 times the cost of a FICON director port. Also, there is a finite number of FICON Express16SA channels available on a mainframe (a maximum of 320 on z15), as well as a finite number of host adapter ports in the storage system. If an organization has a large configuration and a direct-attached FICON storage architecture, how does it plan to scale its environment? What happens if an organization acquires a company, or experiences unanticipated growth, and needs additional channel ports? A switched-FICON infrastructure allows cost-effective, timely, seamless expansion, with minimal downtime, to meet growth requirements.

Direct-attached FICON storage also typically results in underutilized host channel card ports and storage adapter ports in storage systems. FICON Express16S+ and FICON Express16SA channels can comfortably perform at high channel utilization rates, while a direct-attached storage architecture typically sees channel utilization rates of 10% or less. As illustrated in Figure 9, leveraging FICON directors or switches allows organizations to maximize channel utilization.

Figure 9: Switched-FICON drives improved channel utilization, while preserving CHPIDs for growth.
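
The effect of fan-out on channel count and cost can be sketched with the figures quoted above: a 1:1 channel-to-storage-port ratio for direct attachment, typical direct-attached utilization of about 10%, and a channel port that costs 4 to 6 times as much as a director port. The model below is a simplified illustration under stated assumptions. The 50% target utilization, the cost ratio of 5, and the function names are hypothetical, and the model ignores redundancy, chassis costs, and differing port speeds.

```python
def channels_direct_attach(storage_ports: int) -> int:
    """Direct attachment forces one host channel per storage port."""
    return storage_ports


def channels_switched(storage_ports: int, load_per_port_pct: int,
                      target_channel_util_pct: int) -> int:
    """Channels needed when a switch lets storage ports share channels."""
    total_load = storage_ports * load_per_port_pct
    # Integer ceiling division avoids floating-point rounding surprises.
    return max(1, -(-total_load // target_channel_util_pct))


def relative_cost(channel_ports: int, director_ports: int,
                  channel_to_director_ratio: int = 5) -> int:
    """Cost in units of one director port (ratio of 5 is an assumption)."""
    return channel_ports * channel_to_director_ratio + director_ports


storage_ports = 40
direct = channels_direct_attach(storage_ports)
switched = channels_switched(storage_ports, load_per_port_pct=10,
                             target_channel_util_pct=50)
direct_cost = relative_cost(direct, 0)
# The switched design needs a director port for every channel and storage port.
switched_cost = relative_cost(switched, switched + storage_ports)
print(direct, switched, direct_cost, switched_cost)
```

In this example, 40 storage ports need 40 channels when direct-attached but only 8 channels behind a director, and the relative port cost drops from 200 to 88 units while 32 CHPIDs remain free for growth.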

It is also very important to keep traffic for tape drives streaming and avoid stopping and starting the tape drives, as this leads to unwanted wear and tear of tape heads, cartridges, and the tape media itself. Switched-FICON can be utilized to optimize tape drive performance.

Finally, switches facilitate fan-in, which allows different hosts and logical partitions (LPARs) whose I/O subsystems are not shared to share the same assets. While some benefits may be realized immediately, the potential for value in future equipment planning can be even greater. With the ability to share assets, equipment that would be too expensive for a single environment can be deployed in a cost-saving manner. The most common example is to replace tape farms with virtual tape systems. By reducing the number of individual tape drives, maintenance (service contracts), floor space, power, tape handling, and cooling costs are reduced. Virtual tape also improves reliable data recovery, allows for significantly shorter recovery time objectives (RTO) and nearer recovery point objectives (RPO), and offers features such as peer-to-peer copies. However, without the ability to share these systems, it may be difficult to amass sufficient cost savings to justify the initial cost of virtual tape. And the only practical way to share these standalone tape systems or tape libraries is through a switch.

With disk storage systems, in addition to sharing the asset, it is sometimes desirable to share the data across multiple systems. The port limitations on a storage system may prohibit or limit this capability using direct-attached (point-to-point) FICON channels. Again, the switch can provide a solution to this issue.

Even when there is no need to share devices during normal production, this capability can be very valuable in the event of a failure. Data sets stored on tape can quickly be read by hosts that are picking up workload from a failed host and that are already attached to the same switch as the tape drives. Similarly, data stored on a storage system can be available as soon as a fault is determined.

Switched-FICON Is a Best Practice

The raw bandwidth of FICON Express16SA (3200MB/s full-duplex) running on a current IBM z15 is 32 times greater than the capabilities of the original IBM FICON Express (100MB/s). The raw I/O per second (IOPS) capacity of FICON Express16SA channels is even more impressive, particularly when a channel program utilizes the z High Performance FICON (zHPF) protocol. To utilize these tremendous improvements, the FICON protocol is packet-switched and capable of having hundreds of I/Os occupy the same channel simultaneously.

IBM Z’s most current FICON Express16SA channels using native command mode FICON can have up to 64 concurrent I/Os (open exchanges) to different devices. FICON Express16SA channels running zHPF can have up to 750 concurrent I/Os on the IBM Z processor family. Only when a director or switch is used between the host and the storage device can the true performance potential inherent in these zHPF channel-bandwidth and I/O-processing gains be fully exploited.

Direct-attached FICON might appear to be a great way to take advantage of FICON technology’s advances. However, a closer examination shows that switched-FICON offers the user a better, more robust architecture for enterprise data centers. Switched-FICON offers the following:
• Better utilization of host channels and their performance capabilities
• Scalability to meet growth requirements
• Improved reliability, problem isolation, and availability
• Flexible connectivity to support evolving infrastructures
• Much more effective and efficient disaster recovery capabilities
• More robust business continuity implementations via cascaded FICON
• Improved distance connectivity, with improved performance over extended distances
• New mainframe I/O technology enhancements such as NPIV, FICON DCM, zDAC, and zHPF

Switched-FICON also provides many business advantages and potential cost savings, including the following:
• The ability to perform massive server, I/O, and SAN consolidation, dramatically reducing capital and operating expenses
• Local-like application performance over any distance, allowing host and storage resources to reside wherever business dictates
• More effective resource sharing, improved utilization, reduced costs, and improved recovery time

With the growing trend toward increased usage of Linux on IBM Z and the cost advantages of NPIV implementations and PIM SAN architectures, direct-attached storage in a mainframe environment is becoming a thing of the past. Investments made in switches for disaster recovery and business continuance are likely to pay the largest dividends. Having access to alternative resources and multiple paths to those resources can result in significant savings in the event of a failure. The advantages of a switched-FICON infrastructure are simply too great to ignore.

Brocade in Mainframe Environments

Now on its sixth generation (1Gb/s, 2Gb/s, 4Gb/s, 8Gb/s, 16Gb/s, and 32Gb/s) of switching technology (Gen 6), Brocade has the experience you can rely on, having been in the mainframe storage networking business for more than 30 years, as far back as the parallel channel extension technology of the late 1980s. Brocade has a history of thought leadership, having four of its own FICON hardware patents, as well as five FICON joint patents with IBM on technologies such as the FICON bridge card and control unit port (CUP). Brocade helped IBM develop Fibre Connection (FICON), and in 2000 the first IBM certified FICON network infrastructure, using 1Gb/s ED5000 Directors, was deployed. Brocade has continued its heritage of mainframe storage networking thought leadership with six generations of FICON directors, including the current industry-leading FICON directors, such as the X6, and FICON channel extension, such as the Brocade 7840 and SX6 extension blade.

Reliability, Availability, and Serviceability

The largest corporations in the world literally run their business on mainframes. Government institutions in many countries worldwide also rely on the mainframe for their critical computing needs. Reliability, availability, and serviceability (RAS) for these mission-critical environments are of the utmost importance. Mainframe practitioners in these organizations avoid risk at all costs. They never want to suffer an unscheduled outage, and they want to minimize if not outright eliminate scheduled or planned outages. IBM Z mainframes have historically been the rock-solid pillar in terms of computing RAS. Mainframe practitioners have a history of creating I/O infrastructures that have “five nines” availability (99.999%). For FICON channel connectivity to mainframe-attached storage, these same organizations have a requirement for a FICON director platform that offers the same levels of RAS as the mainframe itself. The Brocade Gen 6 is the ideal FICON director for these RAS requirements.

The Brocade Gen 6 FICON Director features a modular, high-availability architecture that supports these mission-critical mainframe environments. The Brocade Gen 6 chassis has been engineered from inception for “five nines” of availability by providing multiple fans (supporting hot aisle–cool aisle), multiple fan connectors, dual core blade internal connectivity, dual control processors, dual power supplies, a passive backplane, and dual I/O timing clocks. These features and the switching design of the Brocade Gen 6 result in leading mean time between failure (MTBF) and mean time to recovery or repair (MTTR) numbers. In a study performed with a sample size of 26,593 Brocade products, the average downtime was 53 minutes per year, for an availability rate of 99.99984%. It is this kind of availability that consistently leads our OEM partners to praise Brocade products for their quality.
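
Availability percentages and annual downtime are two views of the same number, and the conversion is simple arithmetic. The sketch below assumes a 365-day year and uses illustrative function names.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year


def downtime_minutes_per_year(availability_pct: float) -> float:
    """Annual downtime implied by an availability percentage."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


def availability_pct(downtime_minutes: float) -> float:
    """Availability percentage implied by annual downtime in minutes."""
    return (1 - downtime_minutes / MINUTES_PER_YEAR) * 100


print(downtime_minutes_per_year(99.999))    # "five nines"
print(downtime_minutes_per_year(99.99984))  # the study's availability rate
print(availability_pct(53))                 # 53 minutes of downtime per year
```

By this arithmetic, “five nines” allows about 5.3 minutes of downtime per year, and 99.99984% corresponds to roughly 50 seconds per year. A downtime of 53 minutes per year would correspond to about 99.99% availability, so the two study figures quoted above may be expressed in different units or measured on different bases.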

Scalability

Private cloud computing centered on IBM Z has emerged as a “hot topic.” Cloud computing requires a highly scalable (hyper-scale) storage networking architecture to support it. Hyper-Scale Inter-Chassis Link (ICL) is a unique Brocade Gen 6 feature that provides connectivity among two or more Brocade X6-4 or X6-8 chassis. This is the 2nd generation of ICL technology from Brocade with an optical QSFP (quad small form-factor pluggable). The 1st generation used a copper connector. Each ICL connects the core routing blades of two X6 chassis and provides up to 64Gb/s of throughput over a single cable, and multiple cables can be deployed. The Brocade X6-8 allows up to 32 QSFP ports, and the X6-4 allows up to 16 QSFP ports to help preserve switch ports for end devices.

This 2nd generation of Brocade optical ICL technology, based on QSFP technology, provides a number of benefits to an organization. Brocade has improved ICL connectivity over the use of copper connectors by upgrading to an optical form factor. By doing this, Brocade has also increased the distance of the connection from 2 meters to as much as 100 meters. QSFP combines four cables into one cable per port. This significantly reduces the number of ISL cables that the customer needs to deploy. Since the QSFP connections reside on the core blades within each Gen 6 director, they do not use up connections on the slot line cards. This frees up to 33% of the available ports for additional server and storage connectivity.

Dual-chassis backbone topologies connected through low-latency ICL connections are ideal in a FICON environment. The majority of FICON installations have switches that are connected in dual or triangular topologies, using ISLs to meet the FICON requirement for low latency between switches. New 64Gb/s QSFP-based ICLs enable simpler, flatter, low-latency chassis topologies spanning a distance of up to 100 meters with off-the-shelf MPO cables. They reduce inter-switch cables by 75% and preserve 33% of front-end ports for servers and storage, leading to fewer cables and more usable ports in a smaller footprint, thus reducing capital and operating expenses for the enterprise.
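
The 75% and 33% figures can be reproduced with simple arithmetic. The sketch below takes the four-links-per-QSFP ratio and the QSFP port counts from the text. The front-end port counts used in the example (384 for the X6-8 and 192 for the X6-4) are assumptions not stated in this paper, and the function names are illustrative.

```python
LINKS_PER_QSFP = 4  # from the text: QSFP combines four cables into one


def cable_reduction_pct(links_per_qsfp: int = LINKS_PER_QSFP) -> float:
    """Reduction in cable count when each QSFP cable replaces several ISLs."""
    return (1 - 1 / links_per_qsfp) * 100


def equivalent_isl_ports(qsfp_ports: int,
                         links_per_qsfp: int = LINKS_PER_QSFP) -> int:
    """Front-end ports that equivalent ISLs would otherwise consume."""
    return qsfp_ports * links_per_qsfp


def ports_preserved_pct(qsfp_ports: int, front_end_ports: int) -> float:
    """Share of front-end ports freed by moving links onto ICLs."""
    return equivalent_isl_ports(qsfp_ports) / front_end_ports * 100


print(cable_reduction_pct())         # 75.0
print(ports_preserved_pct(32, 384))  # X6-8, assuming 384 front-end ports
print(ports_preserved_pct(16, 192))  # X6-4, assuming 192 front-end ports
```

Under these assumptions, both chassis sizes preserve about one third of their front-end ports, which matches the 33% figure quoted above.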

Traditional (z/OS) Mainframe Environments

In a “traditional” z/OS mainframe environment, reliability, availability, and serviceability (RAS), as well as performance, are the key concerns to most organizations. These characteristics provide the stability for the mainframe-based applications on which the largest companies in the world run their businesses. Dr. Thomas E. Bell, winner of the Computer Measurement Group (CMG) Michelson Award for lifetime achievement in the computer performance field, once famously commented that “all CPUs wait at the same speed.” Likewise, Dr. Steve Guendert, a CMG board member, has commented in his blog that “The IBM Z is a hungry machine, and its users need to feed the I/O beast.” Response time means money in these environments. The ability to process transactions more rapidly provides companies a competitive advantage in today’s fast-paced industry. The Brocade Gen 6 FICON switch technology makes sure that the “I/O beast is fed.”

Linux on the Mainframe

The Arcati 2019 Mainframe User Survey found that 52 percent of respondents said that they run Linux on IBM Z, with another 8 percent at the planning stage. Regardless of whether Linux is running as a guest under z/VM or natively in an LPAR, it is an important trend that cannot be ignored. This trend has been growing since the 2005 introduction of support for NPIV on IBM Z. IT organizations are realizing that there are significant cost savings to be realized by moving to Linux on IBM Z, and these cost savings are in terms of hardware acquisition, software licensing, and operational costs, such as power and cooling. A Brocade Gen 6 FICON platform is the ideal choice for these Linux environments as it offers full support for NPIV, and its Virtual Fabrics functionality allows for highly secure separation of the z/OS data traffic from the Linux traffic on the FICON director.

Intermixing FICON and Fibre Channel

Intermixing FICON and Fibre Channel, also known as Protocol Intermix Mode (PIM), is another growing trend in mainframe environments. Linux on IBM Z has been the major driver of this trend, as its very nature often leads to mainframe end users using FCP channels and FICON channels on the mainframe. The performance and virtual fabrics capabilities, coupled with the immense amount of FICON and open systems SAN experience at Brocade, make the Gen 6 the ideal director platform to deploy in a PIM architecture.

Private Cloud

The ideas behind cloud computing are well known to experienced mainframe administrators who remember “service bureau computing.” Cloud computing is seeing a lot of adoption, and the concept of IBM Z Systems at the center of a private cloud is gaining a lot of traction. Private cloud computing relies on extensive virtualization. This virtualization is not just at the server and application; it is everywhere in the data center, most notably at the storage devices and in the network. Modern storage arrays, paired with Brocade Gen 6 switching, create the ideal architecture for a mainframe-centric private cloud.

Appendix: An Analogy

Just for a moment, consider the parallels of telephone evolution with I/O evolution. In the beginning, the rare phone call was not controlled by the user but rather had to be made through a telephone operator, and many people shared a single connection. Privacy was impossible, and the system was unreliable. In the beginning of IBM Z, every I/O was directed specifically to a small set of devices with little control by the user. There was very little in the way of performance, and system reliability was just okay.

With its second generation, telephones gained rotary dial so that the user could be more specific with their calling through larger telephone networks, although many still shared a single connection with other users. Privacy was always suspect, but reliability was better. Second generation I/O also progressed to the point that there were actual control units that provided some basic user-controlled services to their I/O requirements. Few people were really concerned with data privacy, but reliability was getting better and I/O infrastructure was getting larger. Long-distance data sharing was done with baud-rated modems connected to a phone, connected to a computer.

Push-button phones came into use, along with substantial computer-controlled telephone networks, which dramatically improved telephone calling. Privacy was expected but not guaranteed, and phone systems were often disrupted. ESCON I/O was invented, which dramatically improved data center I/O. Some basic privacy and security capabilities could be deployed, but often were not. Software tools began to be deployed to help manage these larger I/O configurations.

Cellular phones were invented, which gave rise to incredible growth, scalability, and mobility. But phone call privacy also peaked at this point in its evolution. FICON was developed and became deployed, which gave rise to incredible I/O scalability and performance. Switched-FICON vastly improved the scalability, reliability, robustness, and security of a customer’s I/O. Enhanced software management tools made using FICON I/O less complex.

Fourth-generation cellular phones began allowing apps, cameras, gadgets, and connection to WiFi and the Internet to become part of the phone. Privacy is almost never assured on a cellular phone, and often apps and browsers crash with regularity. Phone and app quality are variable and often not customer friendly, while service and maintenance can be a nightmare. Switched-FICON also developed apps to provide the user with a wealth of useful performance and health information and to be customer friendly in order to ease their burden of deployment and management of switched-FICON I/O. SAN capabilities became tightly integrated with IBM Z initiatives so that together they work seamlessly with other equipment in the data center. Switched-FICON also provides reliable options for long-distance backup and recovery solutions that are available to every IBM Z data center.

In summary, a vast number of people in this world use cellular phones today, and they change those phones out every two or three years as technology improves. A vast number of IBM Z data centers deploy switched-FICON today, and generally those devices are changed out every four to five years or so as technology improves. Shouldn’t you have and use both of these technologies to your advantage?

Copyright © 2020 Broadcom. All Rights Reserved. Broadcom, the pulse logo, Brocade, the stylized B logo, and Fabric OS are among the trademarks of Broadcom in the United States, the EU, and/or other countries. The term “Broadcom” refers to Broadcom Inc. and/or its subsidiaries.

Broadcom reserves the right to make changes without further notice to any products or data herein to improve reliability, function, or design. Information furnished by Broadcom is believed to be accurate and reliable. However, Broadcom does not assume any liability arising out of the application or use of this information, nor the application or use of any product or circuit described herein, neither does it convey any license under its patent rights nor the rights of others.

The product described by this document may contain open source software covered by the GNU General Public License or other open source license agreements. To find out which open source software is included in Brocade products, to view the licensing terms applicable to the open source software, and to obtain a copy of the programming source code, please download the open source disclosure documents in the Broadcom Customer Support Portal (CSP). If you do not have a CSP account or are unable to log in, please contact your support provider for this information.