NEWSFEED: IDC HPC SERVER MARKET OVERVIEW
LAB REVIEW: THINKPAD W550S
HOW TO: BUILD A 1PFLOPS SUPERCOMPUTER
VIEWPOINT: HOW LUSTRE ADDS VALUE TO HPC

The reference media in high-performance IT solutions by HPC Today
www.hpcreview.com

Hyperconverged Servers
Compute + networking + storage appliances
BENEFITS AND REALITY CHECK

GLOBAL edition N°7

editor's note
Shall hyperconvergence rule?

Our cover story this month concerns a fundamental aspect of business and data center infrastructure. Following a period during which convergence has made significant strides and the software-defined X approach has made its mark, the industry is slowly but steadily turning to hyperconverged servers and their many benefits: performance, simplified administration, consistent solutions, reduced energy consumption... But do these benefits justify breaking the bank and throwing away all or part of existing infrastructures?

Not necessarily, according to some industry insiders who advocate a reasoned approach based on progressive renewal phases. At the end of the day, such a significant shift is not only a purely technological choice but also a means to conduct better and smarter business. In an increasingly fast-paced environment, business productivity has become critical, so it's only logical that the underlying infrastructure remains consistent with strategic needs and ambitions in a competitive market. Don't hesitate to let us know your thoughts at [email protected]. Happy reading!

contents

Hyperconverged Servers

Compute + networking + storage appliances BENEFITS AND REALITY CHECK

newsfeed
IDC view on worldwide HPC servers
Barack Obama's NSCI initiative
New OCZ NVMe SSDs
Sources
The HPC Observatory

viewpoint
How Lustre adds value for commercial HPC

lab review
ThinkPad W550s

tech zone
SSDs for big data

how to
Building a supercomputer with 1PFLOPS of peak computing performance using many-core processors

editor's note

opinion
When Massive Data Never Becomes Big Data

STEVE CONWAY RESEARCH VP, IDC HIGH PERFORMANCE COMPUTING GROUP

The PRACE Days conference in Barcelona last year provided powerful reminders that massive data doesn't always become big data — mainly because moving and storing massive data can cost massive money. PRACE is the Partnership for Advanced Computing in Europe, and the 2014 conference was the first to bring together scientific and industrial users of PRACE supercomputers located in major European nations.

Tom Lange of Procter & Gamble, which has used high performance computer (HPC) systems to help design consumer products including Pampers diapers and Pringles potato chips, said that although P&G manufactures billions of Pampers a year and they all have data sensors affixed to them, all this data is deleted once the diapers exit the manufacturing line. «P&G doesn't have any big data,» he stated, explaining that «businesses typically won't pay for storing data that doesn't make them money.» Someone has to make the economic argument for storing data, especially when it comes in petabyte-size chunks.

This revelation sparked a lively discussion on who gets to decide which data gets saved when the future value of the data isn't known. For businesses like P&G, the decision process is clear: if no one steps forward to pay for storage, the data probably won't be saved. For massive data collected by governments, decisions can be more serendipitous, and comprehensive policies are often lacking today.

Because the HPC community confronts very massive data, much of it meeting the «four Vs» definitional criteria for big data (volume, variety, velocity, and value), leading HPC users are already wrestling with issues that may not affect mainstream IT markets for a while.

Storage is not the only daunting expense associated with massive data. Moving data can also be very costly. Technical experts say that a single computation typically requires 1 picojoule of energy, but moving the results of the computation may cost as much as 100 picojoules.

In addition:

— The Large Hadron Collider at CERN generates 1 PB of data per second when it's operating, and the upcoming Square Kilometer Array telescope will disgorge 1 EB of data (1,000 PB) per day. In both cases, only a small fraction of the data will be stored. Storing it all would break the budget.

— China's Internet-of-Things initiative is targeting a 10,000-fold data reduction to avoid having to confront 100 ZB of data from home sensors alone by 2030. A major strategy of the effort, spearheaded by the Chinese Academy of Sciences, is to create a single mega-sensor that will do the work of the 40-50 device sensors that would otherwise be expected to inhabit the average Chinese home at that time.

The HPC community has already figured out how to exploit the benefits of Hadoop while evading Hadoop's requirement to store three copies of data. The workaround is to decouple Hadoop from the Hadoop Distributed File System (HDFS) and attach it instead to a fully POSIX-compliant parallel file system.

Storage is not the only daunting expense associated with massive data. Moving data can also be very costly. Technical experts say that a single computation typically requires 1 picojoule of energy, but moving the results of the computation may cost as much as 100 picojoules. At an average $1 million per megawatt, skyrocketing energy costs have become the number two issue in the worldwide HPC community, right after the perennial desire for bigger budgets. Hence, curtailing data movement is a bulls-eye target for cost-conscious HPC users in government, academia, and industry.

The HPC community was arguably the original home of big data and still operates at the leading edge of many big data developments, but costs for data movement and storage will continue to act as a brake on the growth of HPC big data. Even with this partial brake, IDC forecasts that the server market for high performance data analysis (big data using HPC) will grow rapidly (23.5% CAGR) to reach $2.7 billion in 2018, and the related storage market will expand to about $1.6 billion in the same year.
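To make the 1 picojoule versus 100 picojoule comparison concrete, here is a back-of-the-envelope sketch in C++; the sustained 1 PFLOPS rate and the per-megawatt-year reading of the $1 million figure are illustrative assumptions, not IDC data.

// Back-of-the-envelope energy estimate based on the 1 pJ (compute) and
// 100 pJ (data movement) figures quoted above. The sustained 1 PFLOPS
// rate and the $1M-per-MW-year cost reading are assumptions.
#include <cstdio>

int main() {
    const double ops_per_second = 1.0e15;    // sustained 1 PFLOPS (assumed)
    const double joule_per_op   = 1.0e-12;   // ~1 pJ per computation
    const double joule_per_move = 100.0e-12; // ~100 pJ to move each result

    const double compute_watts = ops_per_second * joule_per_op;   // ~1 kW
    const double move_watts    = ops_per_second * joule_per_move; // ~100 kW

    const double dollars_per_megawatt_year = 1.0e6;               // assumed
    const double move_cost = move_watts / 1.0e6 * dollars_per_megawatt_year;

    std::printf("Arithmetic power draw: %.0f W; data-movement power draw: %.0f W\n",
                compute_watts, move_watts);
    std::printf("At $1M per MW-year, moving every result costs about $%.0f per year\n",
                move_cost);
    return 0;
}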

newsfeed

IDC view on worldwide HPC servers

Source: IDC’s Worldwide High-Performance Technical Server QView, 2015

Industry average for system price per processor for 1Q13–1Q15 is $3,200

IDC's High Performance Technical Computing group has released its Worldwide High-Performance Technical Server QView with technical and market data up to the first quarter of 2015. Overall, the technical server market grew by more than 10% on a year-over-year basis to over $2.5 billion for 1Q15.

IDC ANALYSTS NOTE IT IS WORTH SHARING THE FOLLOWING INSIGHTS

The IBM/Lenovo split is displaying some degree of conservation of mass. For the first quarter of 2015, the combined technical server revenue of IBM and Lenovo (the combination of IBM Power-based servers and Lenovo's newly acquired x86 servers) was only about 6% less than the total IBM server revenue (Power plus x86) for the first quarter of 2014. At least in this initial phase, Lenovo has done a good job of holding onto much of the transferred IBM x86 server business.

Market share leadership has greatly shifted. Lenovo has moved up to the major leagues, while IBM moves down to triple A. Prior to this quarter, Lenovo sales were too low to move it out of the «other» category in the QView vendor listing. In the first quarter of 2015, Lenovo had a solid hold on third place in the technical server market, with revenue behind only HP and Dell. For its part, within the past six months, IBM fell from being the world's second-largest technical server supplier to a distant fourth, well behind Lenovo, which in the first quarter of 2015 had more than three times the technical server revenue of IBM. The IBM Power business is showing some strong momentum for the future, such as the large CORAL procurement wins and OpenPOWER Foundation gains, but it will take some time to reach full speed.

Chinese suppliers are making inroads. Chinese technical server vendors appear to be making a move against their competitors in the United States and Japan. In addition to Lenovo's explosive growth — essentially a strong reflection of the firm's ability to retain IBM x86 customers — another Chinese technical server supplier, Sugon, saw its first-quarter 2015 year-over-year revenue increase by over 60%. Although Sugon is only about one-seventh the size of Lenovo — and is currently making sales primarily in the Chinese market — its overall growth rate is outpacing most competitors. Lenovo has said publicly that it expects to lose some IBM x86 server business worldwide but make up for this with increased business in China. And Lenovo is pushing hard into the European high-performance computing (HPC) market by joining the European Technology Platform for HPC (ETP4HPC) and establishing a European HPC innovation center in Stuttgart, Germany.

The high end is decidedly not for the faint of heart. The highest end of the technical server sector continues to display a significant degree of choppiness, with huge variations in revenue from one quarter to the next. For example, Cray saw its 4Q14–1Q15 revenue decline by more than 80% (4Q14 was typically very strong for Cray) while realizing a 1Q14–1Q15 annual revenue increase of over 30%. Likewise, Atos-Bull of France saw its 1Q14–1Q15 revenue increase by over 40%, while its 4Q14–1Q15 revenue declined by almost the same percentage.

FUTURE OUTLOOK
Data contained in the Worldwide High-Performance Technical Server QView and discussed here indicates that there are a number of emerging trends that bear close observation in the near term, as they can have a significant impact on both suppliers and buyers within this sector. Trends to watch include:
• IBM's drive to rely exclusively on the company's more powerful but relatively expensive Power-based systems to drive future sales
• Lenovo's continued retention of IBM customers — and perhaps even expansion into a new base of customers — for the company's x86 servers, particularly within China and Europe and, to a lesser extent, in the U.S. market; likewise with Sugon as it attempts to expand its reach in China and move out into foreign markets, starting in Asia
• The continued growth of smaller, more differentiated technical server suppliers — like NEC, Cray, SGI, Fujitsu, Hitachi, T-Platforms, and Atos-Bull — to develop and compete effectively with relatively higher-cost and higher-performance systems using more proprietary and custom technology
• The growth of white-box suppliers — which saw a 26% first-quarter 2015 year-on-year growth rate — that appear to be targeting the major brand-name technical server vendors with more than just lower prices.

Figure 1 shows the ratio of total technical systems revenue to processor counts in those systems for each of the major technical server suppliers across the past nine quarters, and reveals that:
• Post the sale of x86 servers to Lenovo, IBM technical server systems' price per processor has increased significantly (from about $3,800 per processor to well over $5,400 per processor), reflecting the increased complexity — and associated cost — inherent in IBM's remaining Power-based server lines.
• Not surprisingly, Lenovo x86 servers heretofore marketed by IBM have a much lower average system price per processor than IBM Power servers, and they are much more in line with other technical server suppliers that rely primarily on x86 components and related technology. Lenovo's economies of scale and singular focus on x86 servers could make the company a much stronger competitor.
• NEC, […], and SGI technical servers — which use more nonstandard, in-house, and proprietary hardware (in addition to their use […])
• Despite concerns that some white-box suppliers — many of which are currently counted in the «other» category — may be driving the server sector into a race to the bottom, Dell continues to offer the lowest overall system price per processor. Indeed, the average system price per processor for the «other» category has generally increased over the past nine quarters, and vendors are currently realizing almost $500 more per processor for their systems on average than Dell.
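Figure 1's metric is simply total technical systems revenue divided by the processor count of those systems. The short sketch below recomputes it against the $3,200 industry average; the vendor labels and input numbers are hypothetical placeholders, not QView data.

// Price-per-processor metric used in Figure 1: revenue / processor count.
// The revenue and processor figures below are invented placeholders.
#include <cstdio>

struct Vendor {
    const char* name;
    double revenue_usd; // quarterly technical server revenue (assumed)
    long processors;    // processors shipped in those systems (assumed)
};

int main() {
    const Vendor vendors[] = {
        {"Vendor A (Power-based)", 250.0e6, 45000},
        {"Vendor B (x86-based)",   400.0e6, 120000},
        {"Vendor C (white box)",   150.0e6, 55000},
    };
    const double industry_avg = 3200.0; // 1Q13-1Q15 average from the chart

    for (const Vendor& v : vendors) {
        const double per_proc = v.revenue_usd / v.processors;
        std::printf("%-24s $%7.0f per processor (%+.0f vs. industry average)\n",
                    v.name, per_proc, per_proc - industry_avg);
    }
    return 0;
}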

newsfeed

US government launches an HPC Strategic Computing Initiative

Over the past six decades, U.S. computing capabilities have been maintained through continuous research and the development and deployment of new computing systems with rapidly increasing performance on applications of major significance to government, industry, and academia.

In order to maximize the benefits of HPC for economic competitiveness and scientific discovery, the United States Government will create a coordinated Federal strategy in HPC research, development, and deployment. Investment in HPC has contributed substantially to national economic prosperity and rapidly accelerated scientific discovery. Creating and deploying technology at the leading edge is vital to advancing my Administration's priorities and spurring innovation. Accordingly, this order establishes the National Strategic Computing Initiative (NSCI). The NSCI is a whole-of-government effort designed to create a cohesive, multi-agency strategic vision and Federal investment strategy, executed in collaboration with industry and academia, to maximize the benefits of HPC for the United States.

Over the past six decades, U.S. computing capabilities have been maintained through continuous research and the development and deployment of new computing systems with rapidly increasing performance on applications of major significance to government, industry, and academia. Maximizing the benefits of HPC in the coming decades will require an effective national response to increasing demands for computing power, emerging technological challenges and opportunities, and growing economic dependency on and competition with other nations. This national response will require a cohesive, strategic effort within the Federal Government and a close collaboration between the public and private sectors.

With this initiative, the US government wants to ensure, sustain, and enhance its scientific, technological, and economic leadership position in HPC research, development, and deployment through a coordinated Federal strategy guided by four principles:
• deploy and apply new HPC technologies broadly for economic competitiveness and scientific discovery.

To achieve the five strategic objectives, this order identifies lead agencies, foundational research and development agencies, and deployment agencies.

• foster public-private collaboration, relying on the respective strengths of government, industry, and academia to maximize the benefits of HPC.
• adopt a whole-of-government approach that draws upon the strengths of and seeks cooperation among all executive departments and agencies with significant expertise or equities in HPC, while also collaborating with industry and academia.
• develop a comprehensive technical and scientific approach to transition HPC research on hardware, system software, development tools, and applications efficiently into development and, ultimately, operations.

Executive departments, agencies, and offices (agencies) participating in the NSCI will pursue five strategic objectives:
• Accelerating delivery of a capable exascale computing system that integrates hardware and software capability to deliver approximately 100 times the performance of current 10-petaflop systems across a range of applications representing government needs.
• Increasing coherence between the technology base used for modeling and simulation and that used for data analytic computing.
• Establishing, over the next 15 years, a viable path forward for future HPC systems even after the limits of current semiconductor technology are reached (the «post-Moore's Law era»).
• Increasing the capacity and capability of an enduring national HPC ecosystem by employing a holistic approach that addresses relevant factors such as networking technology, workflow, downward scaling, foundational algorithms and software, accessibility, and workforce development.
• Developing an enduring public-private collaboration to ensure that the benefits of the research and development advances are, to the greatest extent, shared between the United States Government and industrial and academic sectors.

The assignment of these responsibilities reflect the historical roles that each of the lead agencies have played in pushing the frontiers of HPC.

ROLES AND RESPONSIBILITIES
To achieve the five strategic objectives, this order identifies lead agencies, foundational research and development agencies, and deployment agencies. Lead agencies are charged with developing and delivering the next generation of integrated HPC capability and will engage in mutually supportive research and development in hardware and software, as well as in developing the workforce to support the objectives of the NSCI. Foundational research and development agencies are charged with fundamental scientific discovery work and associated advances in engineering necessary to support the NSCI objectives. Deployment agencies will develop mission-based HPC requirements to influence the early stages of the design of new HPC systems and will seek viewpoints from the private sector and academia on target HPC requirements. These groups may expand to include other government entities as HPC-related mission needs emerge.

LEAD AGENCIES
There are three lead agencies for the NSCI: the Department of Energy (DOE), the Department of Defense (DOD), and the National Science Foundation (NSF). The DOE Office of Science and the DOE National Nuclear Security Administration will execute a joint program focused on advanced simulation through a capable exascale computing program emphasizing sustained performance on relevant applications and analytic computing to support their missions. NSF will play a central role in scientific discovery advances, the broader HPC ecosystem for scientific discovery, and workforce development. DOD will focus on data analytic computing to support its mission. The assignment of these responsibilities reflects the historical roles that each of the lead agencies have played in pushing the frontiers of HPC, and will keep the Nation on the forefront of this strategically important field. The lead agencies will also work with the foundational research and development agencies and the deployment agencies to support the objectives of the NSCI and address the wide variety of needs across the Federal Government.

FOUNDATIONAL RESEARCH AND DEVELOPMENT AGENCIES
There are two foundational research and development agencies for the NSCI: the Intelligence Advanced Research Projects Activity (IARPA) and the National Institute of Standards and Technology (NIST). IARPA will focus on future computing paradigms offering an alternative to standard semiconductor computing technologies. NIST will focus on measurement science to support future computing technologies. The foundational research and development agencies will coordinate with deployment agencies to enable effective transition of research and development efforts that support the wide variety of requirements across the Federal Government.

DEPLOYMENT AGENCIES
There are five deployment agencies for the NSCI: the National Aeronautics and Space Administration, the Federal Bureau of Investigation, the National Institutes of Health, the Department of Homeland Security, and the National Oceanic and Atmospheric Administration. These agencies may participate in the co-design process to integrate the special requirements of their respective missions and influence the early stages of design of new HPC systems, software, and applications. Agencies will also have the opportunity to participate in testing, supporting workforce development activities, and ensuring effective deployment within their mission contexts.

newsfeed

New OCZ NVMe SSDs Win Enterprise SSD Category
Delivering Superior Random I/O Performance, Low Latencies and Robust Features via Multiple Configurations

OCZ Storage Solutions, a leading provider of high-performance solid-state drives (SSDs) for computing devices and systems, today announced that its new Z-Drive 6000 SSD Series has earned the Best of Computex award in the Enterprise SSD category from industry-leading tech publication Tom's Hardware Guide. This prestigious award recognizing OCZ's innovation and technological advancements was announced June 6th at Computex 2015.

"We are very pleased to win the Best of Computex award in the Enterprise SSD category for our NVMe-compliant Z-Drive 6000 SSD Series," said Ralph Schmitt, CEO of OCZ Storage Solutions. "This award not only represents a validation of our product strategy, but highlights a drive that delivers leading performance and features, and is a compelling solution for our enterprise customers."

The NVMe-compliant Z-Drive 6000 Series is available in multiple configurations as follows:
• The Z-Drive 6000 SFF Series (read-intensive applications): supports the 2.5-inch small form factor (SFF) and usable capacities of 800GB, 1.6TB and 3.2TB.
• The Z-Drive 6300 SFF Series (mixed workload applications): supports the 2.5-inch SFF and usable capacities of 800GB, 1.6TB, 3.2TB and 6.4TB.
• The Z-Drive 6300 AIC Series (mixed workload applications): supports the Half-Height/Half-Length (HHHL) Add-in Card (AIC) form factor and usable capacities of 800GB, 1.6TB, 3.2TB and 6.4TB.

The Best of Computex Judging Committee was comprised of editors from the Tom's Hardware Guide and Tom's IT Pro publications, choosing the most innovative products and components covering graphics technology, motherboards, cooling systems, power supplies, peripherals, storage technology, and general computing. The award showcases vendors pushing the boundaries of technology with a compelling product worthy of the 'Best of Show' recognition.

OCZ's new Z-Drive 6000 SSD Series utilizes NVMe to streamline the storage stack and reduce protocol latency to dramatically boost performance and efficiency over previous Z-Drive generations. PCIe Gen 3 bandwidth allows the Z-Drive 6000 family to achieve leading sustained transfer speeds up to 2.9 GB/s, while NVMe efficiency enables the Z-Drive 6000 SSDs to read up to 700,000 I/O requests per second and write up to 160,000 operations per second. The Z-Drive 6000 also delivers consistent and predictable low-latency I/O responses of only 25 microseconds for a 4KB write and 80 microseconds for a 4KB read.
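As a quick consistency check on those figures (illustrative arithmetic, not an OCZ benchmark), 700,000 4 KB reads per second work out to roughly 2.9 GB/s, and Little's law gives the number of outstanding requests needed to sustain that rate at the quoted latency.

// Convert the quoted 4KB random-read IOPS and latency figures into
// bandwidth and required queue depth. Illustrative arithmetic only.
#include <cstdio>

int main() {
    const double read_iops  = 700000.0; // 4KB read requests per second
    const double io_size    = 4096.0;   // bytes per request
    const double read_lat_s = 80e-6;    // 80 microseconds per 4KB read

    const double bandwidth_gbs = read_iops * io_size / 1e9; // ~2.87 GB/s
    // Little's law: average outstanding requests = arrival rate x latency.
    const double queue_depth   = read_iops * read_lat_s;    // ~56

    std::printf("4KB read throughput: %.2f GB/s\n", bandwidth_gbs);
    std::printf("Outstanding requests needed (Little's law): %.0f\n", queue_depth);
    return 0;
}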

newsfeed sources

books

INTRODUCTION TO COMPUTATIONAL MODELS WITH PYTHON
Jose M. Garrido
Chapman and Hall/CRC, 466 pages, 49.29£

Introduction to Computational Models with Python explains how to implement computational models using the Python programming language. The book's five sections present an overview of problem solving and simple Python programs, introducing the basic models and techniques for designing and implementing problem solutions; programming principles with Python, covering basic programming concepts, data definitions, programming structures with flowcharts and pseudo-code, solving problems, and algorithms; Python lists, arrays, basic data structures, object orientation, linked lists, recursion, and running programs under Linux; implementation of computational models with Python using Numpy, with examples and case studies; and the modeling of linear optimization problems, from problem formulation to implementation of computational models. This book provides the foundation for more advanced studies in scientific computing, including parallel computing using MPI, grid computing, and other methods and techniques used in high-performance computing.

INTRODUCTION TO QUANTUM PHYSICS AND INFORMATION PROCESSING
Radhika Vathsan
Chapman and Hall/CRC, 270 pages, 49.29£

Introduction to Quantum Physics and Information Processing guides beginners in understanding the current state of research in the novel, interdisciplinary area of quantum information. Suitable for undergraduate and beginning graduate students in physics, mathematics, or engineering, the book goes deep into issues of quantum theory without raising the technical level too much.

The text begins with the basics of quantum mechanics required to understand how two-level systems are used as qubits. It goes on to show how quantum properties are exploited in devising algorithms for problems that are more efficient than their classical counterparts. It then explores more sophisticated notions that form the backbone of quantum information theory. Requiring no background in quantum physics, this text prepares readers to follow more advanced books and research material in this rapidly growing field. Examples, detailed discussions, exercises, and problems facilitate a thorough, real-world understanding of quantum information.

HIGH PERFORMANCE VISUALIZATION: ENABLING EXTREME-SCALE SCIENTIFIC INSIGHT E. Wes Bethel, Hank Childs, Charles Hansen Chapman and Hall/CRC, 520 pages, 59.49£

Visualization and analysis tools, techniques, and algorithms have undergone a rapid evolution in recent decades to accommodate explosive growth in data size and complexity and to exploit emerging multi- and many-core computational platforms. This book focuses on the subset of scientific visualization concerned with algorithm design, implementation, and optimization for use on today's largest computational platforms. After introducing the fundamental concepts of parallel visualization, the book explores approaches to accelerate visualization and analysis operations on high performance computing platforms. Looking to the future and anticipating changes to computational platforms in the transition from the petascale to the exascale regime, it presents the main research challenges and describes several contemporary, high performance visualization implementations. Reflecting major concepts in high performance visualization, this book unifies a large and diverse body of computer science research, development, and practical applications at the intersection of scientific visualization, large data, and high performance computing trends.

moocs

INTRODUCTION TO C++
Get a brief introduction to the C++ language from the experts at Microsoft. C++ is a general purpose programming language that supports various computer programming models such as object-oriented programming and generic programming. It was created by Bjarne Stroustrup and, "Its main purpose was to make writing good programs easier and more pleasant for the individual programmer." By learning C++, you can create applications that will run on a wide variety of hardware platforms such as personal computers running Windows, Linux, UNIX, and Mac OS X, as well as small form factor hardware such as IoT devices like the Raspberry Pi and Arduino-based boards. What you'll learn: C++ syntax, C++ language fundamentals, and how to create functions in C++. Prepare yourself for intermediate and advanced C++ topics in follow-up courses taught by Microsoft.

Length: 4 weeks. Effort: 3-5 hours/week.
Price: FREE / Add a Verified Certificate for $90
Institution: Microsoft. Subject: Computer Science. Level: Introductory.
Languages: English. Video Transcripts: English.
Link: https://www.edx.org/course/introduction-c-microsoft-dev210x
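By way of illustration only, and not taken from the course materials, the kind of small self-contained function covered under "how to create functions in C++" looks like this:

// A minimal C++ example in the spirit of the course topics:
// basic syntax, a user-defined function, and console output.
#include <iostream>

// Returns the larger of two integers.
int maxOfTwo(int a, int b) {
    return (a > b) ? a : b;
}

int main() {
    std::cout << "max(3, 7) = " << maxOfTwo(3, 7) << std::endl;
    return 0;
}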

ENGINEERING SOFTWARE AS A SERVICE (SAAS), PART 1
UC BerkeleyX
Learn software engineering fundamentals using Agile techniques to develop Software as a Service (SaaS) using Ruby on Rails. This intermediate SaaS course uncovers how to code long-lasting software using highly productive Agile techniques to develop Software as a Service (SaaS) using Ruby on Rails. Learners will understand the new challenges and opportunities of SaaS versus shrink-wrapped software. They will understand and apply fundamental programming techniques to the design, development, testing, and public cloud deployment of a simple SaaS application. Using best-of-breed tools that support modern development techniques including behavior-driven design, user stories, test-driven development, velocity, and pair programming, learners will see how modern programming language features like metaprogramming and reflection can improve productivity and code maintainability. Weekly coding projects and quizzes will be part of the learning experience in this SaaS course. Those who successfully complete the assignments and earn a passing grade can get an honor code certificate or verified certificate from BerkeleyX. The videos and homework assignments used in this offering of the course were revised in October 2013. The new class also includes embedded live chat with Teaching Assistants and other students and opportunities for remote pair programming with other students.

Length: 9 weeks. Effort: 12 hours/week.
Price: FREE / Add a Verified Certificate for $49
Subject: Computer Science. Level: Intermediate.
Languages: English. Video Transcripts: English.
Link: https://www.edx.org/course/engineering-software-service-saas-part-1-uc-berkeleyx-cs169-1x#!

Starts on November 2, 2015

PARADIGMS OF COMPUTER PROGRAMMING – ABSTRACTION AND CONCURRENCY
LouvainX
This course covers data abstraction, state, and deterministic dataflow in a unified framework with practical code exercises. Louv1.2x and its predecessor Louv1.1x together give an introduction to all major programming concepts, techniques, and paradigms in a unified framework. We cover the three main programming paradigms: functional, object-oriented, and declarative dataflow. You'll use simple formal semantics for all concepts, and see those concepts illustrated with practical code that runs on the accompanying open-source platform, the Mozart Programming System. Louv1.2x (Abstraction and Concurrency) covers data abstraction, state, and concurrency. You'll learn the four ways to do data abstraction and discuss the trade-offs between objects and abstract data types. You'll be exposed to deterministic dataflow, the most useful paradigm for concurrent programming, and learn how it avoids race conditions. The two courses are targeted toward people with a basic knowledge of programming. They will be most useful to beginning programming students, but the unconventional approach should be insightful even to seasoned professionals. Louv1.1x (Fundamentals) covers functional programming, its techniques and its data structures.

Length: 6 weeks. Effort: 4 hours/week.
Price: FREE / Add a Verified Certificate for $100
Subject: Computer Science. Level: Advanced.
Languages: English. Video Transcripts: English.
Link: https://www.edx.org/course/paradigms-computer-programming-louvainx-louv1-2x-0#!

newsfeed

the HPC observatory

KEY FIGURES

$44 billion: Worldwide projected HPC market value by 2020
8.3%: Yearly growth of the HPC market
$220 billion: Compound market value over the 2015-2020 period
Source: Market Research Media

TOP 500: TOP 3

1. Tianhe-2, National Supercomputing Center, Guangzhou, China: 33,863 / 54,902 TFlops. Manufacturer: NUDT. Architecture: Xeon E5-2692 + Xeon Phi 31S1P, TH Express-2.
2. Titan, Oak Ridge National Laboratory, USA: 17,590 / 27,113 TFlops. Manufacturer: Cray (XK7). Architecture: Opteron 6274 + Nvidia Tesla K20X, Cray Gemini interconnect.
3. Sequoia, Lawrence Livermore National Laboratory, USA: 17,173 / 20,133 TFlops. Manufacturer: IBM (Blue Gene/Q). Architecture: PowerPC A2.

The TOP500 ranks, every six months, the 500 most powerful supercomputers in the world. The retained values, RMAX and RPEAK, represent the maximum and theoretical Linpack computing power.

GREEN 500: TOP 3

1. Shoubu, RIKEN (Japan): 7,031.6 MFlops/W
2. Suiren Blue, High Energy Accelerator Research Organization / KEK (Japan): 6,952.2 MFlops/W
3. Suiren, High Energy Accelerator Research Organization / KEK (Japan): 6,217 MFlops/W

The Green 500 list ranks the most energy-efficient supercomputers in the world. Energy efficiency is assessed by measuring performance per watt; the unit here is MFLOPS/watt.
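To see what such an efficiency figure means in practice (an illustrative conversion, not a Green500 measurement), the sketch below estimates the electrical power a sustained petaflop of Linpack would draw at Shoubu's 7,031.6 MFlops/W.

// Convert a Green500 efficiency figure (MFLOPS per watt) into the power
// needed for a given sustained Linpack rate. Illustrative only; the
// 1 PFLOPS target is an assumption chosen for the example.
#include <cstdio>

int main() {
    const double efficiency_mflops_per_w = 7031.6; // Shoubu, Green500 #1
    const double target_pflops           = 1.0;    // assumed sustained rate

    const double target_mflops = target_pflops * 1.0e9; // 1 PFLOPS = 1e9 MFLOPS
    const double watts = target_mflops / efficiency_mflops_per_w;

    std::printf("A %.1f PFLOPS Linpack run at %.1f MFlops/W draws about %.0f kW\n",
                target_pflops, efficiency_mflops_per_w, watts / 1000.0);
    return 0;
}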

cover story

Hyperconverged Servers
Compute + networking + storage appliances
BENEFITS AND REALITY CHECK

Hyperconverged servers are modular systems designed to evolve over time depending on load and needs. Discover the advantages and disadvantages of these systems.

Hyperconverged systems are a natural evolution of the traditional IT infrastructure, which is usually made up of silos according to business and operational needs.

If you are in the acquisition phase of storage or server platforms, it's impossible not to trip over the recent offers for hyperconverged systems. Your first questions might be: What are the advantages of hyperconverged systems, and how are they different from converged systems? The best questions to ask are: What are converged systems, and how do they differ from traditional infrastructure systems?

HOW DO HYPERCONVERGED SYSTEMS DIFFER FROM TRADITIONAL INFRASTRUCTURE SYSTEMS?
Hyperconverged systems are a natural evolution of the traditional IT infrastructure, which is usually made up of silos according to business and operational needs. This results in groups and separate administrative systems for storage, the servers and the network. The group responsible for storage, for example, manages the purchase, supply, and storage support infrastructure, but is not necessarily concerned with network or enterprise server infrastructure issues. The same situation exists for the servers and the network. The relatively recent concept of hyperconverged systems combines two or more of these infrastructure components as a solution.

AT WHAT POINT SHOULD YOU BECOME INTERESTED IN A HYPERCONVERGED SYSTEM?
During the evolution of a business' infrastructure, such as that of a data center, there are several pivotal moments that require consideration about its evolution and composition.

According to recognized needs, the identified uses and their alignment with the business' strategy, the path to take will not be the same in all cases. To summarize, there are essentially three options available to a company: expand, consolidate or renew the existing infrastructure with one or more hyperconverged systems in order to, and this is essential, support and serve the business and its evolution.

FIRST OPTION: EXPAND THE EXISTING
The first option is the expansion of existing infrastructure by adding one or more hyperconverged systems, which means retaining all or part of the existing infrastructure. This approach offers the advantage of not questioning the existing infrastructure and simply adding new network and storage resources. However, keep the consistency of the infrastructure in mind. The risk is an unbalanced architecture with two performance levels far apart from each other, making load balancing tricky. There are two aspects to take into consideration. First, the network of a hyperconverged system is often at 10 Gbit/s, whereas the majority of firms still operate at 1 Gbit/s. The bottleneck is sudden, as the entire network operates at only a tenth of the optimal performance of a new-generation appliance. On the storage side it's not much better, although storage being contained within the same server, hybrid or full flash, can significantly improve flow rates and response times. Sometimes there are unwelcome surprises due to the heterogeneity and age of storage devices, as demonstrated by the findings of the DataCore SDS survey (see inset). Conversely, the addition of more powerful recent equipment can slow down applications and critical business data requiring faster access and processing, such as VDI.

Advantages
• No need to interrupt activity
• Performance gain for storage and applications

Disadvantages
• Risk of imbalanced network flows
• Software licenses to be considered (VMware)

SECOND OPTION: CONSOLIDATE THE EXISTING INFRASTRUCTURE
When it comes to replacing all or part of a company's servers for obsolescence or performance reasons, one of the possibilities offered by hyperconverged appliances is allowing their virtualization. The advantages are many: replacing a physical server with a hosted virtual machine provides a new way of administration, more flexible and unified through a hypervisor. The benefits are also physical (space gained) and economic (reduced energy consumption). Taking into account savings on amortization periods facilitates calculations and comparisons against a hyperconverged server. The process of backup and disaster recovery is thereby also simplified: virtual machines replace physical servers by being centralized in a hyperconverged appliance. In this case, it enforces regular backups or replication in a disaster recovery (DRP) or business continuity (BCP) scenario.

Advantages
• Centralised administration
• Saving floor space
• Energy savings (performance, cooling)

Disadvantages
• The hyperconverged server represents a single point of failure
• Implementation of emergency backup and/or replication

THIRD OPTION: RENEW THE EXISTING
A hyperconverged approach can selectively replace certain application servers depending on whether they are job or business oriented.

It's not uncommon for a server to be traditionally dedicated to each application, such as a Microsoft Exchange mail server, for example. P2V (Physical to Virtual) migration of such a server within a hyperconverged appliance ensures its continuity without requiring reinstallation (the included migration tools abound) while benefiting from the hardware gain of the appliance. It is therefore possible to gradually replace physical servers by their equivalent virtual machines. With one caveat, however: while this applies to the vast majority of x86 servers and applications, it does not hold true for «Legacy» servers based on AS/400 hardware, for example.

Advantages
• Selective P2V migration spread over time
• Progressive decommissioning of physical servers
• Intrinsic performance gain

Disadvantages
• Some «Legacy» applications will not tolerate virtualized operation (AS/400, etc.)

HOW HYPERCONVERGED SERVERS DIFFER FROM CONVERGED SYSTEMS
While converged systems are separate components designed to work well together, hyperconverged servers are modular systems designed to evolve by adding additional modules. These systems are designed primarily around storage and computation in a single chassis of x86 servers interconnected by high-speed 10 Gbit Ethernet network interfaces. It's not just a stack of hardware in a single chassis. The two main differences are 1) the software-defined storage (SDS), which benefits from the flexibility and performance of the underlying hardware, and 2) the management layer, which controls all the storage, compute and networking resources from a single point of administration. A third factor relates to scalability or, according to long-term planning, scale-out of capacity: that is to say, evolving either by adding modules to the existing ones, or by interconnecting servers to form a logical aggregation of multiple hyperconverged servers, each of which can contain multiple server blades with their own processor, memory, storage, and network interfaces, as HP offers with its HyperConverged Systems HC-200 range.

IMPLEMENTATION: EXPRESS!
Another aspect is often overlooked, but it is important: all manufacturers offer hyperconverged deals whose components have been carefully selected and their operation validated. This is done not only to propose a single support service offer, but also to simplify deployment. All hyperconverged servers are preset and pre-installed so they can be deployed within fifteen minutes! An IP address to start with, followed by some identifiers, is generally enough at a minimal level to be operational at the business level and in the current Active Directory. The next step consists of installing or migrating application software and adjusting, where necessary, its levels of tolerance/redundancy and load balancing. With a single dashboard to administer all of it.

On paper, hyperconverged servers have a lot to offer. They are not, however, exempt from inconveniences.

TWO WAYS OF UNDERSTANDING HYPERCONVERGENCE
The principal path to getting equipped with a hyperconverged server consists of outfitting yourself with a prefabricated system such as those offered by HP, Nutanix, SimpliVity or Scale Computing. It's the simplest and least risky way, and the one advised if your activity depends on it. In case you wish to create a custom solution, you can take an existing server hardware base (available from Supermicro or CARRI Systems) and combine software approaches with it, and thus create your own hyperconverged server. Software offerings such as VMware's VSAN and HP's StoreVirtual are intended for customers wanting more control in the design of their hyperconverged system. With this self-service approach, you will be able to select your server's manufacturer and your preferred software setup, as long as the hardware is supported by the software vendor.

DISADVANTAGES OF HYPERCONVERGED SYSTEMS
On paper, hyperconverged servers have a lot to offer. They are not, however, exempt from inconveniences. Their «black box» aspect, combining hardware and software, prevents bringing substantial modifications to the latter. While the manufacturers are eager to propose the most balanced appliances possible, the possibilities for configuring them to gain performance are limited. Similarly, if the appliance has a functional imbalance, in computing power or regarding storage, it is difficult to correct it other than by adding another appliance to compensate for the defect. A defect which may not be initially evident, but becomes apparent in the course of operation, once application migrations and virtualization of physical servers have already been made.

CONCLUSION
Despite these reservations, hyperconverged servers have valuable qualities vis-à-vis the modular approach which has prevailed up to now. Among their decisive advantages are the simplification of the infrastructure, the coherence favoring performance and the single administration interface, as well as lower general and administrative expenses and simplified vendor management for highly virtualized environments. Before taking the plunge, it is advised to verify possible gaps in relation to your existing operating environment in order to avoid unpleasant surprises. Joscelyn Flores

lab review

HOW WE TEST

HPC LABS
HPC Labs is the technical unit of the HPC Media group, totally independent of the manufacturers. HPC Labs' mission is to develop testing methodologies and hardware and software metrics for the high-performance IT world. Capitalizing on best practices in the field, these tools are based on the several decades of combined experience of the laboratory's management.

HPCBENCH SOLUTIONS
Specifically designed for HPC Review, the HPCBench Solutions assess not only performance but also other equally important aspects in use, such as energy efficiency, sound volume, etc. To differentiate them from synthetic protocols like Linpack, these protocols allow direct comparison of solutions pertaining to the same segment, resulting in a single index taking into account the specifics of the hardware or software tested. For example, an SSD will be tested with the HPCBench Solutions > Storage, while a GPU accelerator will be tested with the HPCBench Solutions > Accels. Rigorous and exhaustive, these protocols allow you to choose what will be, for you, objectively, the best solution.

HPC BENCH GLOBAL INDEX
A single synthetic index to help you compare our test results.

A technical recognition award

lab review

ThinkPad W550s

The Lenovo ThinkPad W550s is a mobile workstation for those who need high-end performance in a package that's easy to carry and travel with. Featuring a Core i7 processor, Nvidia graphics, and an impressive 3K display, the ThinkPad W550s is a good blend of mobility and performance. The balance between the two, however, clearly favors portability, with excellent battery life and a slim, semi-rugged design.

Weighing just under 2.5 kg, the ThinkPad W550s is Lenovo's lightest and thinnest workstation to date, but it's also built to stand up to the rigors of regular business travel. It has a magnesium-alloy roll cage, a carbon-fiber-reinforced chassis, and stainless steel hinges, and is built to meet MIL-STD-810G standards, meaning it is designed to survive drops, shocks, vibration, and extreme temperatures. The W550s features an impressive 15-inch display with 3K (2,880-by-1,620 pixels) resolution. With an In-Plane Switching (IPS) panel, the display looks sharp and vibrant from any angle, and it's a 10-point touch screen, so it supports all of the gesture controls found in Windows 8.

FEATURES
The ThinkPad W550s is well equipped. On the right of the system are two USB 3.0 ports and a mini DisplayPort. The left side sports a third USB 3.0 port, VGA output, a Gigabit Ethernet port, a headset jack, and an SD card slot. There is also an assortment of security and management features, ranging from a case-lock slot for physically securing the system to a built-in fingerprint reader for simple, secure login.

A docking connector on the underside of the system lets you dock it to the ThinkPad Ultra Dock, which provides additional connections for power, monitor, and peripherals. The system also boasts both 802.11ac dual-band Wi-Fi and Bluetooth 4.0, and supports Intel WiDi for wirelessly streaming HD video and audio.

The 512GB solid-state drive (SSD) is double the capacity of the drives found in the HP ZBook 14 or the Lenovo W540. There also aren't too many preinstalled programs taking up that space. The system comes with a free 30-day trial of Microsoft Office 365 for new customers and a 30-day trial of Norton Internet Security, but otherwise you have one or two apps, like Evernote Touch, Skype, and The Weather Channel. But while there aren't a lot of apps preinstalled, the system is made for broad compatibility, with independent software vendor (ISV) certification for a wide assortment of apps, with uses in engineering, design, and finance.

PERFORMANCE
The ThinkPad W550s is equipped with a dual-core Intel Core i7-5600U processor, an ultrabook-class CPU used here for its blend of processing power and efficiency. Paired with 16GB of RAM, it does help the system offer excellent battery life, but it also means that the raw performance may not be up to your expectations for a workstation. In PCMark 8 Work Conventional, for example, the ThinkPad W550s scored 2,736 points, lagging behind last year's Lenovo W540 (3,105 points) and the HP ZBook 15u G2 (3,124 points), though it does outperform the Dell M3800 (2,664 points). In Cinebench R15, which takes advantage of multiple processing cores, the system lagged behind, scoring 274 points, while quad-core-based systems like the Dell M3800 (599 points) and the Lenovo W540 (637 points) scored drastically higher.

In 3D rendering and other graphics-intensive tests, the laptop was good, but not great. Lenovo outfitted the system with an Nvidia Quadro K620M graphics card, which offers the ISV certifications and capabilities necessary for a workstation, but with significantly less muscle than the Nvidia Quadro K2100M used in other systems, like the Lenovo W540 and the HP ZBook 15. Instead, it's closer in capability to the HP ZBook 15u G2, which is better suited to tasks like digital content creation and basic financial number crunching than traditional workstation uses like engineering and design.

The ThinkPad W550s has two batteries, an internal battery and a removable secondary one. In our rundown test, the system lasted 6 hours 44 minutes using the internal battery alone, outlasting both the HP ZBook 15u G2 and the Lenovo W540 (both 6:13) by a half-hour. With the addition of the second battery, the time is extended by more than 10 hours (17:21).

CONCLUSION
The Lenovo ThinkPad W550s is a solid laptop for the user who needs basic workstation capabilities but puts portability first. If you are frequently on the road, the ThinkPad W550s is a great choice, thanks to the thin and light design combined with the extremely impressive 17+ hours of battery life. In terms of performance, however, the Lenovo ThinkPad W540 is still ahead of the game, offering more processing power and better graphics capability at the expense of portability.

HOW TO

Building a supercomputer with 1 PFLOPS of peak computing performance using many-core processors

The Center for Computational Sciences, University of Tsukuba (hereinafter, Center for Computational Sciences), which promotes "Interdisciplinary Computational Science", a fusion of science and computer science, has started using the second domestic supercomputer with petaflops-class peak computing performance. It introduced "COMA (PACS-IX)" (COMA, PACS-Nine), a supercomputer with a peak computing performance of 1.001 PFLOPS, which has been in use since April 2014. Each of its 393 compute nodes combines two CPUs from the Intel Xeon processor E5 family with two Intel Xeon Phi coprocessors, the arithmetic accelerators which assist the CPUs, forming an ultra-parallel PC cluster. It is a matter of pride to have achieved the highest performance (as of June 2014) among supercomputers with Intel Xeon Phi coprocessors. At the Center for Computational Sciences, in addition to promoting state-of-the-art computational science research, the two supercomputers, 'HA-PACS', the supercomputer that came into use in 2012 with 1.116 PFLOPS and GPUs as computing accelerators, and the new 'COMA', are used for the professional development of high-performance computing in the education of students and graduate students.

SELF-PROMOTION OF SCIENTIFIC RESEARCH BY USING A UNIQUE COMPUTER AT THE NATIONAL UNIVERSITY INFORMATION CENTRE
With the Centre for Computational Physics, founded in 1992, as its predecessor, the current Center for Computational Sciences had its restructuring and expansion done in 2004.

“A large number of CPU cores are mounted on a single chip, and are attached to the CPU via the PCI Express bus in the Intel Xeon Phi coprocessor to achieve high cost performance, power performance and space performance on the super computer” Assistant Center Head Professor Taisuke Boku

In 2010, it was certified by the Ministry of Education as a Joint Usage/Research Center, 'AISCI (Advanced Interdisciplinary Computational Science Collaboration Initiative)'. Currently, under the 'Multidisciplinary Joint-use Program' that strengthens the collaboration of computer science and science, it provides the computer center to research institutes across the country, supports the holding of workshops for the promotion of interdisciplinary computational science, supports invitations to researchers, supports exchanges of students and researchers, and conducts research aimed at the development of exascale supercomputers.

Deputy Director and Chairman of Computer System Operation Mr. Taisuke Boku explains: "Its role is not just that of an information center where researchers from the field of science can use supercomputers. Being a unique national university with a large number of teachers and researchers from various fields, it also pursues application development and scientific research using this computer, and promotes the development of the computer itself, a feature that is not present in the computer centers of other universities. These initiatives are defined as 'Interdisciplinary Computational Sciences', and as a major feature of the Center for Computational Sciences it is a place of exploration in computational sciences with a wide scope."

Challenges
• Expanding needs of the high-speed system to support 'Interdisciplinary Computational Science', a fusion of science and computer science
• Accumulation of computational science application development technology for many-core processors

Solution
• Intel Xeon processor E5 family
• Intel Xeon Phi coprocessor
• 'Intel Cluster Studio XE', a comprehensive tool for MPI developers and C++/Fortran programmers

Benefits
• Introduction of the supercomputer 'COMA (PACS-IX)' with a peak computing performance of 1.001 PFLOPS
• Research and development of tuning and coding for next-generation supercomputers that employ many-core processors

THE JOINT CENTER FOR ADVANCED HIGH PERFORMANCE COMPUTING

The Center for Computational Sciences, University of Tsukuba and the Information Technology Center, University of Tokyo have set up a supercomputer system, to be designed primarily by the teaching staff of both institutions, in the University of Tokyo Information Technology Center facility located on the Kashiwa Campus of the University of Tokyo and in use since April 2015, and have commenced activities for the launch of a new supercomputer system. This is the first time such an experiment has been conducted within Japan, setting up such a facility and jointly operating and managing a supercomputer.

3 CENTERS FOR HPC ACADEMIC RESEARCH
The Center for Computational Sciences, University of Tsukuba and the Information Technology Center, University of Tokyo have implemented projects in the past to set joint specifications and procure supercomputers for each university through the "T2K Open Super Computer Alliance", consisting of 3 centers including the Academic Information Media Center of the University of Tokyo. The specifications of the T2K Open Super Computer Alliance are intended to reduce procurement cost and improve system performance through freedom from vendor dependence, and they are formulated based on the doctrine of "Openness", which represents openness of hardware and openness from user dependence. The establishment of JCAHPC is one step in this direction, and the design and development as well as the operation and management of the high-performance supercomputers are jointly carried out by the information centers of both universities with the objective of promoting state-of-the-art computational science.

NEW MANY-CORE POLICY
The policy framework which has been established for the design and development of supercomputing systems does not follow existing supercomputing products but instead uses many-core processors to design a state-of-the-art system, and also links the system to the OS, programming language, numerical computing library and other systems which form the core technology for the software during the design and development process.

'Interdisciplinary Computational Science' uses a high-speed network and ultra-high-speed computers as its main research tools. Researchers from three fields, viz. science researchers who use the computer, information science researchers who research data and media processing, and computer science researchers who research hardware, software, algorithms, and programming, use these computers for their integrated but innovative applied research. The Computational Science Research Center, along with building a joint research system over the years, has also been promoting basic science, high-speed simulation, large-scale data analysis, and applied research in information technology. Currently, using 'HA-PACS' and 'COMA', which brought petaflops performance into reality, we aim for a breakthrough in the research and development of computational sciences in the fields of particle physics, space, nuclear physics, materials science, life, and the global environment.

EVALUATING THE HIGH COST ted ’JCAHPC (Joint Center for Advanced High PERFORMANCE AND THE POWER Performance Computing)’ an organization for PERFORMANCE AND ADOPTING OF INTEL co-operation and management. As the goal for XEON PHI COPROCESSOR FY 2015, JCAHPC is planning to introduce very In the recent years, speed and performance large supercomputers with scores of PFLOPS of CPU have helped performance of massi- class, at the Kashiwa campus of the Informa- vely parallel PC clusters to increase signifi- tion Technology Center, University of Tokyo. cantly. However, the conventional way of just COMA, which is provided with many-core pro- increasing the number of nodes by stacking cessors, also plays a role as an experimental CPU to increase computing performance has system for carrying out the research and deve- limitations due to limited space and power. At lopment of tuning and coding of the next gene- the Computational Science Research Center, ration supercomputer. a course concerning the study of low power high performance supercomputers accelera- ENSURING HIGH-SPEED PERFORMANCE ted by arithmetic devices that assist CPU fo- WITH BIASED MIC BOARD OF 2 UNITS cuses on many-core processors (MIC) in which WITH ONE CPUMIC AND CPU multiple cores are integrated in a single chip. COMA, introduced by the Center for Compu- Intel has been participating in the ’Intel MIC tational Sciences is an abbreviation of ’Clus- Beta Program’ initiated for evaluation of the ter of Many-core Architecture processor’ The architecture of MIC and continuously resear- term is also derived from the English name of ching on tuning and performance evaluation. ’Coma Berenices’, one of the typical clusters of The research center has decided to introduce galaxies. Collection of stars is a Galaxy (Many COMA instantaneously after the Intel Xeon Core), and a collection of Galaxies (Cluster) has Phi coprocessor is launched. the image of a Galaxy cluster. At the same time, The Intel Xeon Phi coprocessor has 61 CPU from the fact that it is a machine of the 9th ge- cores on a single chip. Just attaching this co- neration Supercomputer PACS Series of the Cen- processor to a CPU through PCI Express (gene- ter of Computational Sciences, the code name of ric bus) will achieve a high cost performance PACS-IX has been used in combination. COMA as well as bring out the performance of tera- is a parallel system equipped with 393 compute flops. This coprocessor gives an excellent space nodes where one compute node has two Intel performance and high power performance for Xeon processors E5-2670v2 (2.50GHz) having 10 every 1 watt. (Mr. Boku) Another benefit of the cores per CPU and two Intel Xeon Phi™ copro- Intel Xeon Phi coprocessor, which shares same cessors 7110P with 61 CPU cores. Peak compu- architecture as that of the Intel Xeon processor, ting performance of a single node is 0.4TFLOPS is its high level of programmability. . Mr. Boku for CPU, 2.147TFLOPS for MIC, and 2.547TFLOPS says, “There comes a sense of security when together. Peak computing performance of the en- programs can be developed without wasting tire system is 157.2TFLOPS for CPU, 843.8TFLOPS the existing programs and without having to for MIC, and 1.001PFLOPS together. In addition, remember new things. 
In 2013, the Computational Science Research Center designed the next supercomputer system in collaboration with the Information Technology Center of the University of Tokyo and created JCAHPC (the Joint Center for Advanced High Performance Computing), an organization for joint operation and management. As its goal for FY 2015, JCAHPC is planning to introduce a very large supercomputer in the tens-of-PFLOPS class at the Kashiwa campus of the Information Technology Center, University of Tokyo. COMA, which is equipped with many-core processors, also plays a role as an experimental system for research and development on tuning and coding for this next-generation supercomputer.

ENSURING HIGH-SPEED PERFORMANCE BETWEEN MIC AND CPU WITH TWO MIC BOARDS BIASED TOWARD ONE CPU

COMA, introduced by the Center for Computational Sciences, is an abbreviation of 'Cluster of Many-core Architecture processors'. The name is also derived from the English name of 'Coma Berenices', one of the typical clusters of galaxies: a collection of stars forms a galaxy (the many cores), and a collection of galaxies (the cluster) gives the image of a galaxy cluster. At the same time, because it is the ninth-generation machine in the Center's PACS series of supercomputers, the code name PACS-IX is used in combination. COMA is a parallel system with 393 compute nodes, where each compute node has two Intel Xeon processors E5-2670v2 (2.50 GHz) with 10 cores per CPU and two Intel Xeon Phi coprocessors 7110P with 61 cores each. Peak computing performance of a single node is 0.4 TFLOPS for the CPUs, 2.147 TFLOPS for the MICs, and 2.547 TFLOPS combined; peak computing performance of the entire system is 157.2 TFLOPS for the CPUs, 843.8 TFLOPS for the MICs, and 1.001 PFLOPS combined. In addition, all compute nodes are interconnected through a high-speed network (InfiniBand FDR) with a communication performance ten times that of Ethernet, running uniform parallel processing across the compute nodes. A Lustre file server with a total capacity of 1.5 PB on RAID-6 is used for storage and, through InfiniBand FDR, can be freely accessed from all compute nodes.

SYSTEM BLOCK DIAGRAM

"Compute nodes consist of two MIC boards equipped with the Intel Xeon Phi coprocessor 7110P and an InfiniBand FDR network board, placed next to one Intel Xeon processor E5-2670v2, ensuring high-speed communication between the coprocessors and the network board. The merit is that just one Intel Xeon processor E5-2670v2 alone can maintain high-speed performance." (Mr. Boku)

USING THREE OPERATING MODES TO SUPPORT PARALLEL PROCESSING ON THE COMPUTE NODES

COMA handles three kinds of jobs, namely CPU only, MIC only, and a combination of CPU and MIC, which are intended to be used in the following ways.
• CPU only: Supports the multi-core oriented programs (+MPI) of the previous-generation supercomputer T2K-Tsukuba, which was decommissioned in February 2014. Uses 16 of the 20 cores of the Intel Xeon processors E5-2670v2.
• MIC only: Uses four cores of the Intel Xeon processor E5-2670v2 and the two Intel Xeon Phi coprocessors 7110P (61 cores × 2), relying on the offload functionality provided by the Intel compiler to execute some operations on the MICs.
• CPU+MIC: Monopolizes all computational resources (CPU+MIC) at the node level, for hybrid programs in MIC native mode or symmetric mode.

The Linux version of the Intel development tool "Intel Cluster Studio XE" is used as the programming environment, and researchers use the Intel Fortran compiler, the Intel C++ compiler, and the Intel MPI library as their needs dictate to create their programs. Mr. Boku says, "Since free compilers can also be effective for some programs, we have set up an application system in which we do not specify a particular product for the compiler or library, but instead allow the researchers to freely choose and use a compiler or library of their choice."
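As a sketch of the hybrid style mentioned above, the following hypothetical MPI + OpenMP fragment shows the kind of program that could run in the CPU+MIC (symmetric) mode, with MPI ranks placed on both the host CPUs and the coprocessors and OpenMP threads inside each rank; the work split and numbers are illustrative.

```cpp
// Hypothetical MPI + OpenMP hybrid sketch of the kind run in the CPU+MIC
// (symmetric) mode: one MPI rank per CPU or coprocessor, OpenMP threads inside.
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Each rank sums its own slice of a long series in parallel with OpenMP.
    const long n = 100000000L;
    const long chunk = n / size;
    const long begin = rank * chunk;
    const long end = (rank == size - 1) ? n : begin + chunk;

    double local = 0.0;
    #pragma omp parallel for reduction(+ : local)
    for (long i = begin; i < end; ++i)
        local += 1.0 / static_cast<double>(i + 1);

    // Combine the per-rank results on rank 0.
    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) std::printf("sum of 1/i for i=1..%ld: %.6f\n", n, total);

    MPI_Finalize();
    return 0;
}
```

In a symmetric-mode launch, the same source is built once for the host and once for the coprocessor, and the Intel MPI library then starts ranks across both kinds of devices.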

COMA is used for the development of computational science applications in fields such as particle theory and life science. It can be used by researchers throughout the country free of charge under the "Interdisciplinary joint utilization program", which supports advanced scientific research in the fields of computational science and computational engineering. Arrangements have also been made for its free use under the "HPCI program" promoted by the Ministry of Education, Culture, Sports, Science and Technology, and under the "Large scale general use program", which is likewise open to researchers throughout the country at no cost. Research scholars belonging to the Center for Computational Sciences are carrying out research using COMA's high-speed parallel computing to develop computational science applications in various fields: particle theory; astrophysics, life sciences, and physical sciences; and materials science, global environment, and computational informatics. Professor Boku has this to say about when to use the large-scale parallel GPU cluster "HA-PACS", which first began operation in 2012, and when to use COMA:

"There is no classification based on the subject matter of the research. The researchers can choose as they like, and after careful examination of the subject matter of their research, they can make use of the interdisciplinary joint utilization program. But HA-PACS requires programs to be coded for GPUs, and as a result a large number of the programs that run on it have been optimized for GPUs. In contrast, COMA, which uses the Intel architecture, is highly generic and can run many more programs."

By delivering lectures on high-performance computing to graduate school students, the Center develops their competence to carry out computations using the techniques of computational science. COMA is presently also active in the education of students at the Center for Computational Sciences, which promotes interdisciplinary computational science. In the words of Professor Boku, "Today, the reality is that Japan is running overwhelmingly short of researchers in the scientific fields who can write their own programs, compared with the situation abroad. Therefore, in order to develop the competence of many of our researchers so that they can write their own computational programs in their various fields of research and perform computations according to the techniques of computational science, we are conducting classes in high-performance computing literacy for graduate school students. Besides this, we are also running intensive introductory courses on high performance computing for 40 to 50 graduate school students every year, and continuing our initiatives to send out qualified people with knowledge of high-performance computing."

And even in the "Global 30 (G30)" study program for overseas students, who attend classes and obtain credits entirely in English, the Center is planning to conduct classes on supercomputers such as COMA and HA-PACS directed at these students in order to offer advanced educational services. The Center for Computational Sciences plans to continue using COMA in various fields of research and to study the performance of many-core processors in order to gain the know-how and knowledge needed to achieve high performance. In addition, it plans to continue research and performance evaluation toward the planned launch in 2015 of the supercomputer based on many-core processors that will be jointly operated by the Center for Computational Sciences and the Information Technology Center, University of Tokyo.

Finally, Professor Boku says of Intel, "Toward achieving EFLOPS, they are waiting for feedback based on verification and evaluation of supercomputing technology in various fields." Intel will continue to support the interdisciplinary computational science being promoted by the Center for Computational Sciences by improving the performance of the Intel Xeon processor and the Intel Xeon Phi coprocessor, and by making advancements in the MIC architecture.

viewpoint Ken Strandberg INSTITUTE FOR DIGITAL RESEARCH AND EDUCATION, UCLA

How Lustre Adds Value for Commercial HPC

A growing number of industries require high-performance computing (HPC) clusters to enable discoveries hidden in the massive amounts of data they collect. In the field of genomics, for example, researchers at Iowa State University work with hundreds of gigabytes for studies of a single genome. These scientists, working on assembling sequences of Zea mays, are dealing with multi-gigabyte genomes, but they must sequence a single genome 150X to get reliable data from it for their research. In another example, for climate models used in weather prediction, the high resolutions of today's simulations can generate as much as 100 terabytes of data, according to a "Journal of Advances in Modeling Earth Systems" report by Michael F. Wehner and others. Analyses of high-resolution seismic data similarly can rely on incredibly large data sets of over 100 terabytes in a single file.

To reliably process this magnitude of data in a reasonable amount of time using HPC, very few storage architectures can keep up with the computing systems enterprises and academia use. The solution is usually to deploy a parallel file system (PFS) with I/O capable of hundreds of gigabytes to several terabytes per second. The majority of parallel file systems being deployed today are IBM's Spectrum Scale* (formerly called General Parallel File System*, or GPFS*) and open source Lustre*. These two solutions present completely different purchasing models for organizations looking to deploy a high-performance data platform for massively large data sets. Lustre, being community-developed and open source licensed, can create significant value for high-performance data solutions.

A LITTLE PFS HISTORY
Spectrum Scale (or GPFS) was developed at IBM in the early 1990s and finally commercialized at the end of that decade. It can be deployed in different distributed parallel modes and is used in large commercial and supercomputing applications. It began its life on IBM's AIX systems, but has since appeared on Linux* and Windows Server* systems.

Lustre's architecture began as an academic project at Carnegie Mellon University in the late 1990s. The file system was developed under the Accelerated Strategic Computing Initiative (ASCI) as part of a project funded by the U.S. Department of Energy that included Hewlett-Packard and Intel. (ASCI drove the creation of the first teraflop supercomputer, the Intel-based ASCI Red installed at Sandia National Labs in 1996.)

Today, Lustre development continues under an open source model, with releases managed by OpenSFS*. Intel Corporation contributes the majority of code enhancements to Lustre, while also adding its own set of unique features in several Intel-branded versions that make it more desirable for enterprise and cloud applications and deployments.

PARTNERSHIPS FOR SUCCESS
High-performance storage solutions capable of serving today's petascale and tomorrow's exascale computing are complex, large systems with tens of thousands of spinning or solid-state drives. These are not typically something an in-house IT department will tackle alone. Organizations usually seek out a partner with expertise in the software, hardware, and overall requirements for scalability, optimization, configuration, networking, backup, and disaster recovery. And therein lies one of the critical decisions companies face in finding their high-performance data solution: go with a closed-source, fee-based licensing solution and its potential for vendor lock-in, or expand their choices and flexibility with a system makeup that incorporates both hardware neutrality, so purchasers can choose from a wide variety of manufacturers and technologies, and an open source licensing model with Lustre.

Many customers choose a proprietary solution from enterprises that have strong reputations for expertise, reliability, and trust. But in today's marketplace, dollars are short and competition is fierce. And choice is a key ingredient that makes the economics of technology efficient, successful, and cost-effective. New entrants in high-performance data solutions designed around Lustre have been making serious inroads into the enterprise storage market for several years.

There is a wide selection of smaller yet equally competent technology integrators around the world who have developed storage solution expertise with Lustre, while delivering a level of personal attention and instant response to customers that rivals the big and expensive enterprises. Most of these integrators are part of Intel Corporation's Lustre reseller program, which numbers more than 20 companies. They can offer highly competitive solutions around Lustre.

ADDING VALUE THROUGH OPEN SOURCE
Open source has been the mode of enterprises deploying cost-effective solutions built on reliable, performant software for decades. Linux is the open source poster child, and Red Hat is the standard bearer for building a profitable enterprise business around Linux by providing enterprise-grade services along with its own productized, enhanced offerings around the Linux kernel, while continuing to contribute to the community's development tree. Other companies, like Novell with their SUSE distribution, have followed the same model.

When Lustre entered the open source community, several organizations began to form that wanted to maintain Lustre's traction and momentum in HPC. They also grasped the incredible potential of Lustre for both academia and enterprise. So, they began building on the same open source model: contributing to the code while offering enhanced product and support offerings around the software. This was good for enterprise because, similar to the expansion of Linux into the enterprise, company IT managers wanted the assurance of support and longevity behind Lustre before committing critical applications to the file system.

"Intel is the clear leader in the Lustre Open Source project," according to Brent Gorda, Intel's General Manager of its High Performance Data Division (HPDD), the group responsible for Lustre at the company. "We are the chief contributors to the Lustre code and have the largest pool of Lustre experts in the community. Building on the open source version, we offer enterprise-class and cloud versions of Lustre with Intel software enhancements and enterprise-grade services to customers. In this way we are adding value to open source Lustre and helping commercial HPC easily take advantage of the fastest scalable parallel file system on the planet."

According to Gorda, Lustre eliminates vendor lock-in and enables more choices for customers. If they choose to use a technology partner for their high-performance data solution, they can issue RFPs across many vendors to obtain the optimum hardware technology at the best value, and use Lustre as the software. This way, they avoid the often steep and changing license fees that go along with proprietary solutions, while still getting the support expertise they need.

LUSTRE IN LEADING INSTALLATIONS
Lustre is the high-performance file system of choice for many large installations, including Lawrence Livermore National Laboratory's Sequoia supercomputer. Core to the HPC clusters at the San Diego Supercomputer Center (SDSC) at the University of California, San Diego is Data Oasis, a scalable Lustre parallel file system with up to 12 PB (petabytes) of capacity and sustained performance exceeding 200 gigabytes/second while supporting 10,000 simultaneous users. Designed and deployed by Aeon Computing, also of San Diego, the 72-node system is linked to SDSC's Trestles, Gordon, TSCC, and Comet clusters. Data Oasis' sustained speeds mean researchers can retrieve or store 240 TB of data in about 20 minutes. Early this year, Data Oasis began undergoing significant upgrades, including ZFS, a combined file system originally designed by Sun Microsystems, mated to a new hardware server configuration under a partnership between SDSC, Aeon Computing, and Intel.

GAME CHANGING OFFERINGS
As the core of today's open source, high-performance, scalable data storage solutions, Lustre offerings, with Intel enhancements and support, present a valuable alternative to closed-source, proprietary solutions. Thus, Lustre has been a game changer for high-performance data storage systems in academia and enterprise HPC. And Lustre continues to undergo the critical enhancements that enterprise wants, with releases managed through OpenSFS and the Lustre community.

Intel's Lustre offerings include Intel® Enterprise Edition for Lustre software, Intel® Cloud Edition for Lustre software, and Intel® Foundation Edition for Lustre software. Each is designed to meet specific needs in the marketplace. The Enterprise Edition brings together powerful tools and features in the latest Lustre release that enterprise IT has been requesting to make it easy to deploy, to significantly expand the number of metadata servers, increase reliability, include hierarchical storage management, and allow it to interface with Hadoop workloads. Support is available directly and through Intel resellers, giving companies greater choice without sacrificing the ease of manageability and support they want in a storage solution. The Cloud Edition, available through the Amazon Web Services Marketplace, allows customers to stand up a scalable parallel file system in minutes for any number of applications a customer wants to run on Amazon's Elastic Compute Cloud (EC2). This open source approach, from the Lustre community and from companies that are building operations around the software, adds considerable value for customers needing high-performance storage solutions today.

ABOUT THE AUTHOR
Ken Strandberg is a technical storyteller. He writes articles, white papers, seminars, web-based training, video and animation scripts, and technical marketing and interactive collateral for emerging technology companies, Fortune 100 enterprises, and multi-national corporations. Mr. Strandberg's technology areas include Software, HPC, Industrial Technologies, Design Automation, Networking, Medical Technologies, Semiconductor, and Telecom. Mr. Strandberg can be reached at [email protected].

tech zone

It is important to understand the workload performance and endurance requirements before making a decision.

Big data applications handle extremely large datasets that present challenges of scale. High-performance IT infrastructure is necessary to achieve very fast processing throughput for big data. Solid state drives (SSDs) based on NAND Flash memory are well suited for big data applications because they provide ultra-fast storage performance, quickly delivering an impressive return on investment. SSDs can be deployed as host cache, network cache, all-SSD storage arrays, or hybrid storage arrays with an SSD tier. Depending on the big data application, either enterprise-class or personal storage SSDs may be used. Enterprise SSDs are robust and durable, offering superior performance for mixed read/write workloads, while personal storage SSDs typically cost less and are suitable for read-centric workloads. It is important to understand the workload performance and endurance requirements before making a decision.

THE BIG PICTURE OF BIG DATA
In a scenario where data grows 40% year-over-year, where 90% of the world's data was created in the last two years, and where terabytes (TBs) and petabytes (PBs) are talked about as glibly as megabytes and gigabytes, it is easy to mistakenly think that all data is big data. In fact, big data refers to datasets so large they are beyond the ability of traditional database management systems and data processing applications to capture and process. While the exact amount of data that qualifies as big is debatable, it generally ranges from tens of TBs to multiple PBs. The Gartner Group further characterizes it in terms of volume of data, velocity into and out of a system (e.g., real-time processing of a data stream), and variety of data types and sources. Big data is typically unstructured, or free-floating, and not part of a relational database scheme.

Examples of big data applications include:
• Business analytics to drive insight, innovation, and predictions
• Scientific computing, such as seismic processing, genomics, and meteorology
• Real-time processing of data streams, such as sensor data or financial transactions

• Web 2.0 public cloud services, such as social networking sites, search engines, video sharing, and hosted services

The primary reason for implementing big data solutions is productivity and competitive advantage. If analyzing customer data opens up new, high-growth market segments; or if analyzing product data leads to valuable new features and innovations; or if analyzing seismic images pinpoints the most productive places to drill for oil and gas—then big data is ultimately about big success. Big data presents challenges of extreme scale. It pushes the limits of IT applications and infrastructure for processing large datasets quickly and cost-effectively. Many technologies and techniques have been developed to meet these challenges, such as distributed computing, massively parallel processing (e.g., Apache Hadoop), and data structures that limit the data required for queries (e.g., bitmaps and column-oriented databases). Underlying all of this is the constant need for faster hardware with greater capacity, because big data requires fast processing throughput, which means faster, multicore CPUs, greater memory performance and capacity, improved network bandwidth, and higher storage capacity and throughput.

SSDS FOR ULTRA-FAST STORAGE
SSDs have emerged as a popular choice for ultra-fast storage in enterprise environments, including big data applications. SSDs offer a level of price-to-performance somewhere between DRAM and hard disk drives (HDDs). SSDs are an order of magnitude denser and less expensive than DRAM, but DRAM has higher bandwidth and significantly faster access times. Compared to HDDs, SSDs offer orders of magnitude faster random I/O performance and lower cost per IOPS, but HDDs still offer the best price per gigabyte. With capacity pricing for Flash memory projected to fall faster than other media, the SSD value proposition will continue to strengthen in the future.

SSD Benefits
• Exceptional Storage Performance – Deliver good sequential I/O and outstanding random I/O performance. For many systems, storage I/O acts as a bottleneck while powerful, multicore CPUs sit idle waiting for data to process. SSDs remove the bottleneck and unleash application performance, enabling true processing throughput and user productivity.
• Nonvolatile – Retain data when power is removed; no destaging is required, unlike DRAM.

FIGURE 1. Decision Tree for Enterprise-Class vs. Personal Storage SSD

• Low Power – Consume less power per system than equivalent spinning disks, reducing data center power and cooling expenses.
• Flexible Deployment – Available in a uniquely wide variety of form factors and interfaces compared to other storage solutions:
• Form factors – Half-height, half-length (HHHL), 2.5-inch, 1.8-inch, mSATA, M.2, etc.
• Interfaces – PCIe, SAS, and SATA

Drive TCO = Cost of drives + Cost of downtime + Cost of slowdown + Cost of IT labor + Risk of data loss

SSD Deployment Options
• Host Cache – SSDs reside in the host server and act as a level-2 cache for data moved out of memory. Intelligent caching software determines which blocks of data to hold in cache. Typically, PCIe SSDs are used because they offer the lowest latency, since no host controllers or adapters are involved. Best results are achieved for heavy read workloads. Cache may be read-only or write-back. Redundant SSDs are recommended for write-back to ensure data is protected.
• Network Cache – Similar to host cache, except SSDs reside in a shared network appliance that accelerates all storage systems behind it. Out-of-band cache is read-only, while in-band is write-back. Network cache offers a better economic benefit because it is shared, but it can be slower than direct host cache.
• All-SSD Storage Array – An enterprise storage array that uses Flash for storage and DRAM for ultra-high throughput and low latency. All-SSD arrays offer features like built-in RAID, snapshots, and replication traditionally found in enterprise storage. They may include technologies like inline compression and deduplication to shrink the data footprint and maximize SSD efficiency. This option provides additional management of SSDs as it relates to wearout over the entire array.

Personal storage SSDs are designed for good read performance and tailored reliability and durability. They are optimized for workloads where reads are more frequent than writes.

• SSD Tier in a Hybrid Storage Array – A traditional enterprise storage array that includes SSDs as an ultra-fast tier in a hybrid storage environment. Automated storage management monitors data usage and places hot data in the SSD tier and cold, or less frequently accessed, data in high-capacity, slower HDD tiers to optimize storage performance and cost. This option works well for mixed data, some of which requires very high performance. A variation on hybrid storage is when an SSD is incorporated as secondary cache in the storage controller's read/write cache.

CHOOSING THE RIGHT SSD IN BIG DATA DEPLOYMENTS
SSDs in general are rated for 1 or 2 million device hours (MTTF), which translates to at least a century or two of operation. NAND Flash cells only wear out if they are being written to. Enterprise-class SSDs are designed for high reliability, maximum durability, and fast, consistent performance. Enterprise-class SSDs last 10 to 1000 times longer than personal storage SSDs under write workloads. While Flash memory performance tends to degrade with use, enterprise SSDs maintain performance over time. Write performance is 2 to 12 times better, and read performance is comparable to or better than that of personal storage SSDs. The price per gigabyte is 2 to 30 times more for enterprise-class SSDs. Big data applications in large corporate data centers, like scientific computing and business analytics applications, are often characterized by mixed read/write workloads that require very low latency and massive IOPS—a good match for durable, robust enterprise-class SSDs.

Personal storage SSDs are designed for good read performance and tailored reliability and durability. They are optimized for workloads where reads are more frequent than writes. Personal storage SSDs offer high capacity and a lower price per gigabyte than enterprise-class SSDs. Web 2.0 public cloud applications like social networking sites are characterized by users uploading images, video, and audio files, which are subsequently downloaded or streamed by other users. This type of write-once, read-many-times workload is a good candidate for personal storage SSDs.

APPLICATION CONSIDERATIONS FOR ENTERPRISE VS. PERSONAL STORAGE SSDS
Not all big data deployments are the same, and not all SSDs are the same. The question is how to match the right SSD to the right big data deployment. Choosing an SSD solution is based primarily on the performance and availability requirements of the application. The decision tree in Figure 1 and the following Q&A will help you choose the optimal SSD solution for your application.

The first step is to quantify the workload that SSDs will support.

Q. #1: What is the IOPS performance requirement for your application, including the read/write mix?

The first step is to quantify the workload that the SSDs will support. An application workload can be measured using a variety of performance monitoring tools. Beyond workload, also consider the configuration of the system and the impact on the overall platform.

Q. #2: What are the endurance requirements for your application?

For mixed read/write workloads, it is important to look closely at SSD durability ratings. These are usually expressed in terms of total bytes written (TBW) or full drive writes per day over a 5-year period. By comparing an application's daily write total with the durability rating of an SSD, it is possible to estimate the drive's lifetime in your environment (assuming a constant workload; it might be wise to also estimate future growth). If the write workload is small enough that the estimated lifetime of a personal storage SSD will at least equal the 3 to 5 years typically expected of an IT system, and performance is sufficient, then personal storage SSDs can be a good choice. However, if personal storage SSDs will likely wear out and need to be replaced during the IT system's lifetime, then replacement costs should be considered.
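As a hedged illustration of the estimate described in Q#2, the short sketch below compares a TBW rating with a measured daily write volume to project drive lifetime and the number of replacements over a planning horizon; all figures are invented for the example, not vendor specifications.

```cpp
// Illustrative endurance estimate for Q#2: project SSD lifetime from a TBW
// rating and a measured daily write volume (all figures are hypothetical).
#include <cmath>
#include <cstdio>

int main() {
    const double tbw_rating_tb   = 150.0; // endurance rating of a hypothetical personal storage SSD (TB written)
    const double daily_writes_tb = 0.4;   // application writes measured per day (TB)
    const double horizon_years   = 5.0;   // expected life of the IT system

    const double lifetime_years = tbw_rating_tb / (daily_writes_tb * 365.0);
    const int drives_needed =
        static_cast<int>(std::ceil(horizon_years / lifetime_years));

    std::printf("Estimated drive lifetime: %.1f years\n", lifetime_years);
    std::printf("Drives consumed over %.0f years: %d (replacements: %d)\n",
                horizon_years, drives_needed, drives_needed - 1);
    return 0;
}
```

If the estimated lifetime comfortably exceeds the 3 to 5 years expected of the system, a personal storage SSD may be sufficient; otherwise the replacement count feeds directly into the TCO question that follows.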

Q. #3: What is the SSD total cost of ownership over 3 to 5 years?

The SSD TCO over the system lifetime includes:
• Cost of Drives – Determine how many drives will need to be purchased during a 3- to 5-year period, including replacements due to wearout, and multiply this figure by the acquisition cost.
• Cost of Application Downtime – If the application needs to be taken offline to replace an SSD, what is the cost of that lost productivity? Multiply this figure by the number of replacements.
• Cost of Slower Application Performance – If the application does not have to go offline for drive replacements, but system performance will slow during the replacement and the subsequent data replication or RAID rebuild, how will this affect user productivity? Multiply this cost by the number of replacements.
• Cost of Labor for Drive Replacement – Drive monitoring and replacement will be an additional management task for the IT staff, so the cost of labor should be included.
• Risk of Data Loss – For unprotected drives, there is a significant risk of data loss, and even for RAID-protected drives, there is a small risk during the RAID rebuild window. Though difficult to quantify, these risks should be factored into the cost.

CONCLUSION
SSDs are a popular solution for big data applications. Deciding between personal storage and enterprise-class SSDs will depend on performance and endurance requirements and TCO.

MICRON TECHNOLOGY, INC. is a global leader in advanced semiconductor systems. Micron's broad portfolio of high-performance memory technologies—including DRAM, NAND, and NOR Flash—is the basis for solid state drives, modules, multichip packages, and other system solutions.

next issue

#8, October 31st, 2015
cover story: Big Data Driving the World
tech zone: Why full flash storage is gaining momentum
lab review: Microsoft Active Directory

interview Sandeep Gupte, Senior Director, NVIDIA Professional Solutions Group

PUBLISHER Frédéric Milliot
EDITOR IN CHIEF Joscelyn Flores
EXECUTIVE EDITOR Ramon Lafleur
ART DIRECTOR Bertrand Grousset
CONTRIBUTORS TO THIS ISSUE Amaury de Cizancourt, Steve Conway, Roberto Donigni, Ramon Lafleur, Stacie Loving, Renuke Mendis, Marc Paris
HPC Media, 11, rue du Mont Valérien, F-92210 St-Cloud, France
CONTACTS [email protected] · [email protected] · [email protected]
SUBMISSIONS We welcome submissions. Articles must be original and are subject to editing for style and clarity.