Solution Brief

Enterprise RAID Leaders Offer Scalable, Enterprise-Hardened SAS Storage for Supercomputing Environments

LSI and DataON deliver scalable, high-performance storage using Enterprise RAID technology in SAS switched fabric environments for the Supercomputing market.

Building and delivering solutions that can scale to meet the constantly growing demands of high performance computing (HPC) environments can be challenging at best. Fusion energy experiments, for example, can generate up to 100 terabytes of data in a single day. Delivering storage for this type of environment requires the integration of top quality components that provide industry leading performance and that can grow dynamically without interrupting the users.

This is why LSI, an industry leader in SAS, Enterprise RAID and application acceleration solutions, and DataON, a leading developer of high-performance storage enclosures for the high-performance and enterprise computing markets, are delivering storage solutions designed to provide massively scalable storage that meets the extreme performance, capacity and data reliability needs of high performance computing environments.

Supercomputing environments require uninterrupted access to vast amounts of data that they must often deliver to a broadly distributed user base. For example, the European Grid Infrastructure, part of the CERN grid project, involves over 240 institutions in 45 countries supporting science in more than 20 disciplines including bioinformatics, climate change, education, energy and more. These leading scientific, research and engineering efforts are driving society forward and helping to find solutions to the medical, environmental, and other challenges that humanity will face in the 21st century. This work cannot afford to be constrained by data storage limitations.

The Need for

21st century supercomputing environments make the technological and scientific advances of the modern world possible. They enable scientists to do what was considered science fiction just a few years ago. By using supercomputers we are able to peer into the vastness of space or to explore the unimaginably small quantum world. Whether it is the Phoenix and Jaguar systems at the Oak Ridge National Laboratory (ORNL) studying climate change or the National Center for Supercomputing Applications (NCSA) at the University of Illinois developing new methods of drug delivery, these systems enable us to cure disease, to feed the hungry and to seek answers to the mysteries of the universe.

As more and more barriers to adding computing power are torn down, the real challenge has become providing the vast amounts of high performance data storage that these systems need. This is why, in the July 12, 2011 issue of HPCwire, Galen Shipman, head of the Oak Ridge Leadership Computing Facility, said that what keeps him up at night is “the challenge of providing high-performance, reliable, and scalable I/O systems to meet the needs of a growing number of users from broadening domains of science”.

Meeting this challenge requires the use of high performance storage that can deliver data at an unprecedented rate and that can scale to house tens of petabytes of data. For example, the Jaguar supercomputer has 10,000 terabytes of disk space and the High Performance Storage System (HPSS) at the Oak Ridge National Laboratory currently stores more than 7 petabytes of data with up to an additional 40 terabytes being added daily. In true supercomputing environments it is not uncommon for enough new data to be generated each day to fill the entire capacity of the average storage subsystem.
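To put that growth rate in perspective, the short Python sketch below works out how long it would take to add a given amount of data at a fixed daily rate. It is only an illustration: the function name is hypothetical, decimal units (1 petabyte = 1,000 terabytes) are assumed, and the 7 petabyte and 40 terabyte figures are simply the ones quoted above, treated as a constant rate.

```python
# Illustrative arithmetic only. Assumes decimal units (1 PB = 1000 TB)
# and a constant daily ingest rate; real ingest rates vary.
TB_PER_PB = 1000


def days_to_add(petabytes: float, daily_tb: float) -> float:
    """Days needed to accumulate `petabytes` of new data at `daily_tb` TB/day."""
    if daily_tb <= 0:
        raise ValueError("daily_tb must be positive")
    return petabytes * TB_PER_PB / daily_tb


if __name__ == "__main__":
    # At up to 40 TB/day, an archive of 7 PB could be matched in new data
    # in about 175 days.
    print(days_to_add(7, 40))  # 175.0
```

At that pace, storage that cannot grow without downtime would need frequent interruptions, which is the planning problem the rest of this brief addresses.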

The success or failure of supercomputing projects often has less to do with the actual supercomputer and more to do with the storage that supports it. That is why the storage deployment must not only be fast, secure and reliable; it must also be able to grow dynamically and to do so without interrupting user access.

It All Comes Down to Planning

When planning a storage deployment for a supercomputing environment, one of the first choices to be made is whether to go with a direct-attached storage (DAS) solution or to put in a storage area network (SAN). The advantages of a SAN are well known: it gives multiple servers the ability to access shared storage, and incrementally adding storage is easy to do, as storage enclosures can simply be added to the fabric as needed. However, when compared to DAS, SANs tend to be more complicated and difficult to administer, more expensive and, perhaps most importantly, can offer significantly lower data throughput than DAS implementations.

Figure 1: DAS and SAN

DAS has many advantages. DAS is the most commonly used type of storage deployment, which has the natural consequence of being well understood, secure, reliable and easy to manage. DAS offers higher performance and is easier to use than a SAN.

SAS storage can offer higher per port throughput than Fibre Channel or iSCSI. SAS-based storage solutions connect through multilane 6Gb/s SAS “wideports”. Each SAS multilane wideport is 4 lanes wide and supports data transmission rates of up to 6Gb/s per lane. This means that each physical SAS wideport connector can support up to 24Gb/s of data throughput. When compared to the current maximum per port data rate of 10Gb/s for iSCSI or 8Gb/s for Fibre Channel, SAS clearly offers the highest available throughput per port.
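The per port comparison above is simple arithmetic, sketched below in Python. The constant and function names are illustrative rather than part of any LSI tool, and the figures are the raw line rates quoted in this brief; encoding and protocol overhead are ignored.

```python
# Raw per-port line rates quoted in the text, in Gb/s.
# Encoding and protocol overhead are deliberately ignored in this sketch.
SAS_LANE_GBPS = 6.0
SAS_WIDEPORT_LANES = 4
ISCSI_PORT_GBPS = 10.0
FC_PORT_GBPS = 8.0


def wideport_gbps(lanes: int = SAS_WIDEPORT_LANES,
                  lane_gbps: float = SAS_LANE_GBPS) -> float:
    """Aggregate raw throughput of one SAS wideport."""
    return lanes * lane_gbps


if __name__ == "__main__":
    sas = wideport_gbps()
    print(f"SAS wideport: {sas:.0f} Gb/s")
    print(f"vs iSCSI:         {sas / ISCSI_PORT_GBPS:.1f}x")
    print(f"vs Fibre Channel: {sas / FC_PORT_GBPS:.1f}x")
```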

Figure 2: SAS Switched Fabric (servers connected to shared storage through an LSI SAS6160 switch)

There are two main challenges with traditional DAS environments. The first is that it can be difficult to add storage to an existing system without interrupting availability or increasing latency. Often, the system must be taken off-line to have storage added. The other challenge is that data is only available through the server that it is directly attached to. If the server fails then the storage is rendered inaccessible, even if it is perfectly functional.

That is why Serial Attached SCSI (SAS) and SAS switch technology is the right choice for supercomputing storage environments. SAS and SAS switching combine the high throughput and low latency of DAS with a switched fabric to provide the same access to shared storage as a SAN.

The use of a SAS switch, such as the LSI SAS6160, can also improve overall performance. Since most target devices are either SAS or SATA based, it allows a single I/O technology to be used throughout the storage topology, which can result in lower latency than a Fibre Channel or iSCSI SAN.

In any storage environment, data is sent from a host-based controller card, such as an LSI 6Gb/s MegaRAID SAS+SATA RAID controller, using a disk I/O protocol, such as SAS. When the data is sent out over a SAN, the SAS-formatted data is encapsulated within a SAN protocol, such as Fibre Channel. This is what allows the data to be sent over the SAN and to arrive safely at the correct location. Upon arriving at the physical storage, the Fibre Channel encapsulation is removed from the data packets, returning the data to its original SAS format. Once the data is back in SAS format it can be written to the disk drives. The same basic process occurs in an iSCSI SAN. Each step that either encapsulates the data or removes it from the SAN protocol is an extra step that adds latency. In a switched SAS fabric, latency is reduced because only the SAS protocol is used throughout the entire storage deployment, which eliminates the extra steps.
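The difference between the two write paths can be made concrete by listing the steps each one takes. The Python sketch below is a simplified model of the description above, not a protocol implementation: the step names paraphrase the text, and the helper function is hypothetical. It only counts the protocol conversion steps and makes no claim about how much latency each one adds.

```python
# Simplified model of the two write paths described in the text.
# Step names are paraphrased; no latency figures are assumed.
SWITCHED_SAS_PATH = [
    "write command is issued",
    "block-level data is sent as SAS",
    "data passes through the SAS switch",
    "block-level data is received as SAS",
    "data is written to disk",
]

FIBRE_CHANNEL_SAN_PATH = [
    "write command is issued",
    "block-level data is sent as SAS",
    "data is encapsulated in Fibre Channel",
    "data is sent over the Fibre Channel SAN",
    "Fibre Channel encapsulation is removed",
    "block-level data is received as SAS",
    "data is written to disk",
]


def conversion_steps(path: list[str]) -> list[str]:
    """Return the steps that wrap data in, or unwrap it from, a SAN protocol."""
    return [step for step in path if "encapsulat" in step]


if __name__ == "__main__":
    for name, path in [("Switched SAS", SWITCHED_SAS_PATH),
                       ("Fibre Channel SAN", FIBRE_CHANNEL_SAN_PATH)]:
        extra = conversion_steps(path)
        print(f"{name}: {len(path)} steps, {len(extra)} protocol conversions")
```

In this model the Fibre Channel path carries two conversion steps that the switched SAS path does not, which is the source of the extra latency described above.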

Figure 3: Data Sent Over a Switched SAS Network and Data Sent Over a Fibre Channel SAN

When it comes to scaling storage, LSI SAS6160 switches can offer the same level of flexibility as iSCSI or Fibre Channel SANs. Just as in a SAN, storage in a SAS switched fabric can be increased by simply connecting additional storage enclosures to an available port on the SAS switch. The new storage can then be added to SAS zones without any interruption to existing systems. With support for up to 1000 devices, a switched SAS environment can scale to meet even the largest storage needs.
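As a rough planning aid, the device limit quoted above can be turned into an enclosure count. The Python sketch below is a back-of-the-envelope estimate only. It assumes that every drive bay and every host initiator counts as one device and that nothing else does, which may not match how a real fabric counts devices; the function name and the 24-bay and 16-initiator values in the example are hypothetical.

```python
# Back-of-the-envelope sizing against the device limit quoted in the text.
# Assumption: each drive bay and each host initiator counts as one device.
DEVICE_LIMIT = 1000


def enclosures_that_fit(bays_per_enclosure: int,
                        initiators: int = 0,
                        limit: int = DEVICE_LIMIT) -> int:
    """Whole enclosures that fit under `limit` after reserving initiators."""
    if bays_per_enclosure <= 0:
        raise ValueError("bays_per_enclosure must be positive")
    return max(limit - initiators, 0) // bays_per_enclosure


if __name__ == "__main__":
    # Hypothetical example: 24-bay enclosures and 16 host initiators.
    print(enclosures_that_fit(24, initiators=16))  # 41
```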

When connecting additional storage to the fabric it is critical to use high-performance, enterprise class JBODs, such as the DataON DNS-1640, that support multiple, redundant SAS paths to storage. The storage enclosure should also be able to survive the failure and replacement of any single component within the enclosure without impacting access to data. To do this, the drive enclosures should have a modular design that supports N+1 redundancy with hot-swappable modules and should support dual-port SAS drives. The enclosures must also support industry standard monitoring and reporting features, such as SES and SMART, to ensure that administrators are always aware of the status of the storage.

Figure 4: DataON JBOD

SAS switches also address the availability issue of DAS. Using dual port MegaRAID controllers in conjunction with a dual switch fabric allows for the redundant connection of multiple servers to multiple storage solutions. Unlike with a DAS solution, the storage is not tied to a single server. Should a server either fail or be taken offline for maintenance, then the storage will remain available to the other servers in the fabric.
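The availability argument can be checked with a small model of a dual switch fabric. The Python sketch below is a toy illustration, not a description of how MegaRAID controllers or SAS6160 switches manage paths: all node names are hypothetical, and it assumes every server and every enclosure has one link to each of two switches.

```python
# Toy model of a dual-switch SAS fabric. All names are hypothetical.
# Assumption: every server and every enclosure has one link to each switch.
SWITCHES = ["switch-a", "switch-b"]
SERVERS = ["server-1", "server-2"]
ENCLOSURES = ["jbod-1", "jbod-2"]

# Each link is a (node, switch) pair.
LINKS = {(node, sw) for sw in SWITCHES for node in SERVERS + ENCLOSURES}


def can_reach(server: str, enclosure: str, failed: set[str]) -> bool:
    """True if a working switch still links `server` to `enclosure`."""
    if server in failed or enclosure in failed:
        return False
    return any(
        sw not in failed and (server, sw) in LINKS and (enclosure, sw) in LINKS
        for sw in SWITCHES
    )


if __name__ == "__main__":
    # One switch fails: the second path keeps storage reachable.
    print(can_reach("server-2", "jbod-1", failed={"switch-a"}))
    # One server fails: the other server can still reach the same storage.
    print(can_reach("server-2", "jbod-1", failed={"server-1"}))
    # Both switches fail: no path remains.
    print(can_reach("server-2", "jbod-1", failed={"switch-a", "switch-b"}))
```

In a plain DAS layout the same model would have a single link from one server to each enclosure, so the loss of that server would leave the storage unreachable.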

Finally, SAS switches resolve the cable length limitations of DAS. Since the SAS switches are able to act as repeaters, they can be used to achieve an end-to-end storage connection of up to 80 meters.

Simply put, switched SAS provides SAN multipath redundancy for high availability without sacrificing any of the performance or ease of use of a DAS solution.

Conclusion

To meet the unique scalability needs of supercomputing environments, any storage solution should include a high-performance storage system utilizing LSI Enterprise MegaRAID controllers and SAS6160 switches. The LSI SAS storage solutions offer 6Gb/s data transfer rates, support for HDDs and SSDs, and advanced options including CacheCade Pro 2.0 and FastPath software for application acceleration. Connecting the SAS MegaRAID controllers to SAS6160 switches can provide SAN scalability and multi-path availability combined with the high-performance, low latency data throughput of DAS.

The storage can be connected through enterprise class JBOD drive enclosures from DataON. The DataON enclosures provide a modular design, N+1 redundancy for no single point of failure, and support enclosure monitoring and reporting features.

There are many companies that offer products built to perform a specific function in data center environments. However, finding real, complete solutions that address the specific needs of true HPC environments is no easy task. This is why LSI has compiled its unique offering of enterprise-hardened SAS storage solutions, built upon proven technology and designed to deliver unparalleled performance, data protection, cost management and scalability.

Build Today

You can use the same LSI Enterprise RAID technology for your High Performance Computing requirements today. Contact David Graas at [email protected] for more information.

About DataON

DataON Storage delivers innovative storage solutions designed to snap into existing IT environments. DataON Storage’s solutions are industry compliant, featuring a variety of high performance network connectivity, I/O and disk options. DataON Storage offers customers superior performance through balanced architecture, high-speed interconnect options and industry leading management. From basic monitoring and alerting to en masse software provisioning, DataON Storage provides customers with a single point of accountability while ensuring successful planning, installation, post-installation support and onsite maintenance options. For more information visit…

For more information and sales office locations, please visit the LSI web sites at: lsi.com lsi.com/contacts

North American Headquarters: Milpitas, CA. T: +1.866.574.5741 (within U.S.), T: +1.408.954.3108 (outside U.S.)
LSI Europe Ltd., European Headquarters: United Kingdom. T: [+44] 1344.413200
LSI KK Headquarters: Tokyo, Japan. Tel: [+81] 3.5463.7165

LSI, LSI and Design logo, MegaRAID and 3ware are trademarks or registered trademarks of LSI Corporation. All other brand and product names may be trademarks of their respective companies. LSI Corporation reserves the right to make changes to any products and services herein at any time without notice. LSI does not assume any responsibility or liability arising out of the application or use of any product or service described herein, except as expressly agreed to in writing by LSI; nor does the purchase, lease, or use of a product or service from LSI convey a license under any patent rights, copyrights, trademark rights, or any other of the intellectual property rights of LSI or of third parties. Copyright ©2012 by LSI Corporation. All rights reserved. > 0112