Compute Canada — Calcul Canada
A proposal to the Canada Foundation for Innovation – National Platforms Fund

Hugh Couchman (McMaster University, SHARCNET), Robert Deupree (Saint Mary's University, ACEnet), Ken Edgecombe (Queen's University, HPCVL), Wagdi Habashi (McGill University, CLUMEQ), Richard Peltier (University of Toronto, SciNet), Jonathan Schaeffer (University of Alberta, WestGrid), David Sénéchal (Université de Sherbrooke, RQCHP)

Executive Summary

The Compute/Calcul Canada (CC) initiative unites the academic high-performance computing (HPC) organizations in Canada. The seven regional HPC consortia in Canada (ACEnet, CLUMEQ, RQCHP, HPCVL, SciNet, SHARCNET and WestGrid) represent over 50 institutions and over one thousand university faculty members doing computationally-based research. The Compute Canada initiative is a coherent and comprehensive proposal to build a shared, distributed HPC infrastructure across Canada to best meet the needs of the research community and enable leading-edge, world-competitive research. This proposal requests an investment of 60 M$ from CFI (150 M$ with matching money) to put the necessary infrastructure in place for four of the consortia for the 2007-2010 period. It also requests operating funds from Canada's research councils for all seven consortia. Compute Canada has developed a consensus on national governance, resource planning, and resource sharing models, allowing for effective usage and management of the proposed facilities. Compute Canada represents a major step forward in moving from a regional to a national HPC collaboration. Our vision is the result of extensive consultations with the Canadian research community.

1 Introduction

High Performance Computing (HPC) is transforming research in Canadian universities and industry. Computer simulations and models now supplement or even supplant traditional field or laboratory experiments in many disciplines. Massive data-sets from large-scale field experiments are being manipulated, stored and shared. Numerical laboratories open up otherwise inaccessible realms and enable insights that were inconceivable a few years ago. Research worldwide has seen a dramatic increase in the demand for HPC in the traditional areas of science and engineering, as well as in medicine and the social sciences and humanities. In 1999, Canada occupied an inconsequential position in HPC-based research, but that year saw the first funding of HPC by the Canada Foundation for Innovation. The subsequent combination of federal, provincial and industrial funding is enabling Canada to develop a strong foundation in HPC research, train highly qualified personnel and attract international experts.

1.1 Background

In 1995, thirty Canadian researchers met in Ottawa to discuss the inadequate computing facilities available in Canada for academic researchers. The action plan arising from this meeting eventually led to the creation of C3.ca (www.c3.ca) in 1997, a national organization for advocating research in high-performance computing. C3.ca represents the imagination, good will and shared vision of more than 50 institutions and thousands of researchers, post-doctoral fellows, graduate students, and support personnel. C3.ca's vision is to create "a Canadian fabric of interwoven technologies, applications and skills based on advanced computation and communication systems applied to national needs and opportunities for research innovation in the sciences, engineering and the arts." This vision is still relevant today.

The creation of C3.ca fortuitously aligned with a new Government of Canada initiative: the Canada Foundation for Innovation (CFI). Since the first CFI competition in 1998 (with results announced in 1999) and subsequent announcements in 2000, 2002 and 2004, Canadian researchers have actively pursued building competitive HPC infrastructure across the country. This has been facilitated by the creation of seven regional consortia of research universities with the mandate to apply for, acquire and operate HPC facilities that would be shared among researchers in their respective consortia. The consortia details are given in Table 1.1, while Table 1.2 lists the major research organizations that have partnered with the consortia. The number of faculty members using HPC has increased from a few hundred in 2000 to over one thousand in 2006 (and that number is growing).

Maintaining the HPC infrastructure required multiple CFI applications to be approved each funding cycle. This led to the desire for a more stable and coordinated source of funding. In response, a two-year-long effort culminated in the October 2005 publication of the C3.ca Long-Range Plan (LRP) for high performance computing in Canada (Engines of Discovery: The 21st Century Revolution¹), jointly funded by C3.ca, the National Research Council, CFI, NSERC, CIHR, SSHRC and CANARIE. CFI created the National Platforms Fund (NPF) program in part as a response to the Long Range Plan. It recognized the large funds invested by CFI in shared, consortia-based HPC infrastructure, as well as the significant investments made in research-group-specific, non-shared facilities.

1.2 Vision

Our vision is for a national collaboration to acquire and support world-class, fully-shared HPC infrastructure across the country, creating an environment that fosters and enables new research insights and advances.

¹ The plan is available at www.c3.ca/LRP

Consortium | Provinces | Members

ACEnet (www.ace-net.ca) | Newfoundland, Nova Scotia, New Brunswick, Prince Edward Island | Dalhousie U., Memorial U., Mount Allison U., St. Francis Xavier U., St. Mary's U., U. of New Brunswick, U. of Prince Edward Island. Soon to join: Acadia U., Cape Breton U.

CLUMEQ (www.clumeq.mcgill.ca) | Québec | McGill U., U. Laval, UQAM, and all other branches and institutes of l'Université du Québec: UQAC, UQTR, UQAR, UQO, UQAT, ETS, ENAP and INRS

RQCHP (www.rqchp.qc.ca) | Québec | Bishop's U., Concordia U., École Polytechnique, U. de Montréal, U. de Sherbrooke

HPCVL (www.hpcvl.org) | Ontario | Carleton U., Loyalist College, Queen's U., Royal Military College, Ryerson U., Seneca College, U. of Ottawa

SciNet (www.scinet.utoronto.ca) | Ontario | U. of Toronto

SHARCNET (www.sharcnet.ca) | Ontario | Brock U., U. of Guelph, Lakehead U., Laurentian U., Wilfrid Laurier U., McMaster U., Ontario College of Art and Design, U. of Ontario Institute of Technology, Trent U., U. of Waterloo, U. of Western Ontario, U. of Windsor, York U.

WestGrid (www.westgrid.ca) | Alberta, British Columbia, Manitoba, Saskatchewan | Athabasca U., Brandon U., Simon Fraser U., U. of Alberta, U. of British Columbia, U. of Calgary, U. of Lethbridge, U. of Manitoba, U. of Northern British Columbia, U. of Regina, U. of Saskatchewan, U. of Victoria, U. of Winnipeg

Table 1.1. Academic consortia membership (research hospitals affiliated with many institutions above are full partners in their respective consortia.)

National Institute for Nanotechnology; Sudbury Neutrino Observatory; Canada Light Source; Perimeter Institute for Theoretical Physics; TRIUMF; Fields Institute for Mathematical Sciences; NRC Herzberg Institute for Astrophysics; Robarts Research Institute; Ouranos; Centre de recherches mathématiques; Natural Resources Canada; Banff Centre; Institut de recherche d'Hydro-Québec

Table 1.2. Major research partners.

This proposal responds fully to CFI's integrated strategy for HPC investments. Our proposal calls for a strengthening of the Canadian HPC collaboration and a metamorphosis of C3.ca into Compute Canada (Calcul Canada in French), reflecting the priority investment recommended in the LRP. Compute Canada will serve the HPC needs of Canadian university researchers, regardless of their affiliation. It represents a major (and evolutionary) leap forward in two significant ways. First, instead of thinking regionally we are thinking nationally; all seven consortia are full partners in this proposal to create a national initiative. Second, CFI funds have historically been targeted towards meeting the computational needs of individual consortia (with a CFI requirement for 20% sharing with the rest of Canada); now, all the consortia are working together to build a world-class, fully shared national infrastructure for computationally-based research in Canada. We anticipate that the HPC community and this proposal will continue to evolve to meet the needs of Canadian research in the future. Thus, the partners will continue working with the research community to refine and improve the vision.

1.3 Application Process

This proposal is based on an extensive consultation with the Canadian academic research community. The formal process began in October 2005 with a CFI-sponsored workshop, although consultations began much earlier in preparation for the next CFI application cycle. The proposal has been co-authored by the National Initiatives Committee (NIC), consisting of one representative from each consortium. High-level decisions were approved by the National Steering Committee (NSC), consisting of a Vice President (Research) from each consortium. Consultation with the research community included surveys, submissions, and interviews. Every attempt was made to engage as much of the research community as possible.

Although this is a national proposal, the requested infrastructure is targeted at giving four of the consortia a much-needed technology refresh. CLUMEQ was last funded by CFI in 2000, while RQCHP, SciNet and WestGrid were last funded in 2002. These four consortia represent 61% of the faculty members and 69% of the research funding in Canada.² CLUMEQ, SciNet, and WestGrid have exhausted their funding; RQCHP will complete its acquisitions in 2006. SHARCNET, HPCVL and ACEnet were funded in 2004 and have not yet finished deploying all of their infrastructure.

1.4 Outline

This document describes this vision and, with CFI's help, plans for its realization. This proposal includes the following:
• an outline of the history of HPC efforts and investments in Canada (Section 2),
• a discussion of past successes, i.e., the impact of HPC on Canadian research and development (Section 3), and the potential for HPC to drive innovation (Case Studies),
• a national HPC vision and strategy for HPC acquisition, coordination, management, and sustainability of the infrastructure (Section 4),
• plans for the efficient and effective operation and support of the infrastructure, so as to maximize the benefits to researchers (Section 5), and
• a detailed budget and justification for the proposed HPC acquisitions (Budget Justification).
A glossary is provided as an appendix to assist with the numerous acronyms used in this document. Finally, the list of CFI evaluation criteria for this program is provided on the last page. References to these criteria appear between brackets where appropriate, as a guide to the evaluation of this proposal.

2 Impact of Past Investments in HPC

The foundations of today's HPC resources have been built from strategic past investments in both computing infrastructure and support personnel. The result, seven regional consortia with an agreement to share, is extremely efficient and is strongly endorsed by the recent Long Range Plan for HPC in Canada. The LRP's executive summary cites Professor Martyn Guest of the UK's Central Laboratory of the Research Councils (CLRC) at the Daresbury Centre: "Canada has invested wisely in mid-range computing over the last five years and has created the best developed, most broadly accessible mid-range High Performance Computing facilities in the world."

² Based on 2003-2004 data, the most recent available at the time of writing.

2.1 Facilities and User Base [ 1a ]

Canada's HPC facilities support thousands of researchers, graduate students and research associates (see discussion below), attract bright young academics and students to their universities, and create a skill development environment critical to the research capability of the HPC user base. They enable universities to offer new, often interdisciplinary and/or multidisciplinary, programs in computational science, engineering, medicine, and the arts. These include computational health and tele-robotics, computational biology and drug design, computational nanotechnology, computational fluid dynamics, aerodynamics and combustion, computational forecasts for weather and the environment, computational oil and gas reservoir modeling, and disaster remediation.

Figure 2.1A shows the CFI investments made in the consortia from 1999 to 2006: 108 M$. Since CFI funds cover at most 40% of the cost, this investment has been leveraged to acquire at least 270 M$ of infrastructure. CFI has also invested in large non-shared computer facilities for specialized research that are not part of any consortium. Calculating this investment is difficult, but our best lower-bound estimate is 40 M$ (100 M$ including leverage). Figure 2.1B shows that this investment is benefitting a large and growing user community.
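The leverage arithmetic implied above follows directly from the 40% figure quoted in the text (a simple check, not an additional accounting exercise):

$$ \frac{108~\mathrm{M\$}}{0.40} = 270~\mathrm{M\$}, \qquad \frac{40~\mathrm{M\$}}{0.40} = 100~\mathrm{M\$}. $$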

Figure 2.1: (A) Left: CFI investments in the consortia, in millions of CDN$, broken down by consortium for the award dates of June 1999, July 2000, January 2002 and March 2004. (B) Right: Growth in the number of user accounts at the consortia, by consortium, 2001-2005.

Table 2.1 shows the CFI infrastructure of the four major consortia being funded in this proposal. The table reflects the fact that there has been no new funding grant to CLUMEQ since 2000, and none to RQCHP, SciNet, and WestGrid since 2002. All consortia have stretched their dollars out over many years to minimize the gap between the time the funding is exhausted and the next opportunity for a new CFI application. CLUMEQ, SciNet, and WestGrid are out of funds; RQCHP will finish spending in 2006.

The success of this consolidation of university-based resources into well-managed HPC consortia is reflected in the growth of the user base. Figure 2.1B shows the total number of user accounts at the consortia. Counting accounts is easy, but identifying "real" users is difficult. The existence of a user account may not translate into a "real" HPC user. Even the definition of a user is not obvious, since some researchers are actively involved in HPC-related work but never need to log in to an HPC facility (e.g., people who design parallel algorithms, versus those who implement them). Further, the philosophy of shared HPC access complicates the issue, since a user may have multiple accounts, one for each consortium. One way of identifying real users is by their usage. In 2005, across Canada there were 1,854 user accounts that used at least 10 CPU hours of consortia resources. Of these, 455 used more than 10,000 CPU hours and 96 used over 100,000 CPU hours in the past year (despite, for example, there being few facilities at ACEnet and old facilities at CLUMEQ and SciNet). These numbers are conservative lower bounds on usage, since they account for only one dimension of HPC, ignoring, for example, memory and disk needs.

Consortium | Affiliation | Install | Vendor (architecture) | CPUs | Peak (Gflops) | RAM (GB)
CLUMEQ | McGill | 2002 | AMD (capability) | 256 | 819 | 384
SciNet | CITA | 2002 | Self-assembled (capacity) | 538 | 2,500 | 264
SciNet | High Energy | 2002 | IBM (capacity) | 448 | 2,100 | 448
SciNet | Planetary Physics | 2002 | NEC (vector) | 16 | 128 | 128
WestGrid | U. Victoria | 2002 | IBM (capability) | 364 | 910 | 728
RQCHP | U. de Montréal | 2003 | SGI (SMP) | 128 | 768 | 512
WestGrid | U. of Alberta | 2003 | SGI (SMP) | 256 | 358 | 256
WestGrid | U. of Calgary | 2003 | HP (capability) | 128 | 256 | 128
WestGrid | U. of BC | 2003 | IBM (capacity) | 1,008 | 6,100 | 1,008
RQCHP | Sherbrooke | 2004 | Dell (capacity) | 872 | 5,580 | 1,744
SciNet | Aerospace | 2004 | HP (capability) | 140 | 840 | 360
RQCHP | Sherbrooke | 2005 | Dell (capability) | 1,152 | 8,294 | 4,608
WestGrid | U. of Alberta | 2005 | IBM (SMP) | 132 | 800 | 520
WestGrid | U. of Calgary | 2005 | HP (capability) | 260 | 1,040 | 520
WestGrid | U. of BC | 2005 | IBM (capacity) | 672 | 4,184 | 832

Table 2.1. Recent acquisitions by CLUMEQ, RQCHP, SciNet, and WestGrid (minimum of 128 CPUs – except for the vector architecture). For an explanation of architecture types, see p. 26.

The community size is much larger than these numbers portray; for example, an active graduate student account may reflect the work of a team including a professor, a postdoctoral fellow, and other students. All seven consortia have experienced enormous growth in their HPC user communities over the past five years (Figure 2.1B), and this growth is expected to continue. All facilities across the country are at full capacity, often with long waiting lines for access.

Even this understates the future demand. CFI's new policy is to not fund requests for computing infrastructure outside the NPF program if the computing needs can be met by NPF funds; applicants requesting non-NPF resources from CFI will need to make a compelling case in their application. In the past, CFI has funded dozens of clusters dedicated to specific research projects. Most of these users will not be able to get their equipment refreshed by CFI, and they will turn to the shared consortia resources to meet their future needs. At this point in time, it is hard to estimate the impact of this policy. Conservatively, we expect more than 100 additional new users (for a community growth of 400) in the next couple of years as a consequence of this new CFI policy. In addition, universities are increasing the number of computationally-based researchers that they hire, further increasing the expected size of the user community. Investments in HPC need to increase to accommodate this growing user community.

2.2 Competitive Advantage Nationally and Globally [ 1b ]

The nature of research today has demonstrated a compelling need for comprehensive, collaborative, and advanced computing resources. These tools provide the capability of solving large-scale scientific problems that were not even imaginable 10 years ago. Today, Canadian consortia are armed with computing resources that enable their member institutions and researchers to be competitive both nationally and internationally. As a result, this has attracted and retained excellent researchers, helped reverse the so-called "brain drain" (e.g., the world-class SciNet climate modelling group attracted two researchers from outside of Canada), strengthened partnerships among institutions (this proposal being a prime example), and contributed to the country in all sectors of society (see Section 3 and the Case Studies).

Figure 2.2: Canadian HPC systems on the Top 500 list, from the first ranking (November 1993) to November 2005.

The CFI investment in HPC has also resulted in several academic sites appearing in the Top 500 list (www.top500.org). The Top 500 list has been used since 1993 to gauge the performance capabilities of systems from around the world. The list is issued twice a year (June and November) and in recent years has seen huge changes, with the performance of the minimum entry position (#500 on the list) increasing at a rate of roughly 40% every six months. Canada has never placed a large number of systems on the Top 500 list. The current list has six Canadian systems: three academic, one in a government research organization, and two in the industrial sector. For comparison, the United States has placed 33 in the academic sector, 82 in research organizations, and 156 in the industrial sector. Over the past five years, systems from the different consortia have (often briefly) entered the list. Currently, two systems from RQCHP and one from WestGrid are on the list. Figure 2.2 shows the history of Canadian academic systems on the list. Of note is how quickly they disappear from the list, illustrating how important stable and continued funding is to being competitive in HPC-supported disciplines and sectors; the arithmetic below makes the point explicit.

Computationally-based research has become increasingly competitive. Access to more parallel processing capacity, faster machines, larger memories, and bigger data repositories allows one to address leading-edge, so-called "grand challenge" problems. Internationally, it is a race, and those with access to the best resources often win. There are no prizes or patents awarded for being second to solve a research problem. Pre-CFI, most Canadian research was not competitive in these areas because of a lack of access to competitive resources. The advent of CFI has changed this, but Canada still lags behind international efforts. For example, numerous countries have facilities beyond anything in Canada, including Japan (Top 500 list entries #7, #12, #21, #38, #39), Spain (#8), The Netherlands (#9), Switzerland (#13, #31), South Korea (#16), China (#26, #42), the United Kingdom (#33, #34, #46), Australia (#36), and Germany (#37). The United States has the most powerful HPC resource in the world, roughly 45 times more powerful than Canada's highest entry in the Top 500 list. Spain's top entry is roughly five times more powerful than the best such facility in Canada, a massive computational gap that Canada must close if it wants its researchers to be internationally competitive.
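Compounding the growth figure quoted above (roughly 40% every six months for the #500 entry threshold) shows why static systems fall off the list so quickly; the factors below follow from that single number and are approximate:

$$ 1.4^{2} \approx 2.0 \ \text{(per year)}, \qquad 1.4^{6} \approx 7.5 \ \text{(over three years)}. $$

A system that enters near the bottom of the list is therefore overtaken within about a year, and even one that enters partway up the list drops off within a few years unless it is upgraded.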

2.3 Attracting and Retaining Excellent Researchers [ 1c ]

One of the reasons for Canada's HPC research success is our ability to attract and retain world-class researchers. The combination of federal, provincial and industrial funding has enabled Canada not only to encourage Canadians to return home and international experts to relocate here, but also to keep the country's best minds living and researching here.

Since 1999, the investment in computational infrastructure by CFI and its provincial, industrial and university partners has rejuvenated computational research across the country and stopped the drain of faculty, graduate students and skilled technology personnel in HPC-related areas of study. The number of researchers working on HPC-related research projects has increased from a few hundred in 2000 to roughly 6,000 today. All institutions in Canada use access to HPC facilities as an important recruiting tool. There are 163 Canada Research Chairs (CRC) currently benefitting from the consortia HPC resources. User surveys identified an additional 13 CRCs who plan to use the facilities in the near future.

Some consortia have access to funds to accelerate the recruitment process and to strengthen their retention efforts. For example, SHARCNET has fostered and grown its research community through its Chairs and Fellowships program, specifically in the key research areas of computational materials, computational finance, bio-computing, and HPC tools. Its Chairs program has recruited 13 world-class researchers into tenure-track faculty positions at its partner institutions. Of this number, six were recruited from outside Canada, one came from industry, one came from another province, and five were retention cases. SHARCNET has also awarded Fellowships to 123 provincial researchers, including 24 postdoctoral fellows, 10 international visiting fellows, 43 graduate fellows, and 40 undergraduate fellows. HPCVL has a similar program and has awarded 52 HPCVL-Sun scholarships. Both of these initiatives were partly funded through a Province of Ontario program.

HPC infrastructure has acted as a magnet for attracting HQP (highly qualified people), including postdoctoral fellows, graduate students, and programmer/analysts. As one data point, in the November 2005 WestGrid survey of its user community, 72 postdoctoral fellows and 101 graduate students indicated that HPC access was a major factor in deciding which university to attend; of these, one third came from outside Canada. The survey also provided some illustrative quotes on the importance of HPC as an attraction and retention tool: "Our involvement with WestGrid has without a doubt been instrumental in attracting programmer-analysts wanting to acquire high-performance computing skills as well as senior research assistants." (Gordon Broderick, University of Alberta, Project CyberCell). "Without (WestGrid) facilities, I would not be able to conduct the research we are currently doing and I would not be able to attract the people I have." (Kenneth Vos, University of Lethbridge). "While WestGrid wasn't the main reason for their choice to come here, it did play an important role in their decisions to come, and more so in their decisions to stay." (David Wishart, University of Alberta).

2.4 Enhanced Training of Highly Qualified Personnel [ 1d ]

Maintaining a leadership position in the global research community means enhancing not only the infrastructure in place, but also the skills and training of the personnel who use, operate, and enhance the HPC infrastructure. The potential for training highly qualified personnel is immense. There are over 1,000 investigators and many more graduate students and post-doctoral fellows with access to Canada's HPC facilities at any one time. The shared facilities create an environment for skill development that is critical to Canada's ongoing research capability. Support teams made up of these personnel are essential in helping to minimize the challenging startup period for researchers learning to work with HPC resources. Training sessions, "How-To" series, and skill development workshops connect researchers with highly trained technical support staff whose knowledge, experience, and guidance result in more effective use of the HPC infrastructure (there were over 2,000 registrants in consortia-sponsored HPC courses offered in 2005). This skill set is also imparted to students and postdoctoral fellows, giving them both the scientific knowledge and the programming experience necessary to create new computational methods and applications in their various fields, eventually leading to dramatic new insights.

Interactions of HPC support staff with graduate students and PDFs provide a fertile training ground for developing the next generation of researchers. The highly collaborative environment that has emerged from HPC research in Canada has produced a web of HPC facilities and technical analysts that constitutes an effective and pivotal support network. For example, the NSERC-funded Technical Analysts Support Program (TASP; see Sect. 5.6 below) assists in the training of highly qualified personnel through research, including programmer/analysts, visualization experts, and network engineers. These individuals have the opportunity to be trained on a variety of systems and to develop computational and visualization skills on a wide spectrum of HPC resources. The TASP program is well suited to joint industrial/academic research projects and allows students to develop contacts across the country. By sharing solutions across scientific fields, analysts have direct and frequent access to professionals in other fields. Moreover, the development of highly qualified personnel, and the subsequent movement of these people among organizations and sectors, constitutes the most effective form of technology transfer. In addition, through the use of Access Grid,³ technical and research presentations are being broadcast across the country (to roughly a dozen sites). The current proposal will expand the usage of Access Grid by ensuring that all participating institutions have an appropriate Access Grid meeting room (see p. 30 below).

2.5 Strengthened Partnerships Among Institutions [ 1e ]

Another benefit of past HPC investments has been the pan-Canadian collaborations they have sparked. The model of geographical cooperation that has evolved over the last four years has proven to be an extremely cost-effective and efficient way of kick-starting Canadian expertise in HPC. Each of the seven consortia leverages the value of regional cooperation. Getting institutions to work together towards common goals has been a major success of past HPC initiatives. Over time, most of the consortia have grown as institutions see the benefit of working together. For example, MACI was the province of Alberta's HPC initiative in 1998. With British Columbia universities joining MACI in 2001, WestGrid was created. Following the addition in 2005 of the University of Victoria and of the provinces of Manitoba and Saskatchewan, WestGrid now encompasses all academic research institutions in four provinces.

The seven consortia represent over 50 institutions, plus numerous industrial and research-institute partnerships. The HPC facilities are already shared across the country, with many consortia reporting external usage in excess of the 20% CFI target. This sharing has fostered national cooperation and good will. Further, the national pool of applications analysts (see Sect. 5.6) has created a distributed but shared resource of HPC expertise, allowing, for example, a researcher in Vancouver to get HPC assistance from an analyst in St. John's. TASP is run by C3.ca, which represents the HPC interests of Canadian researchers. C3.ca is itself the best example of strengthened partnerships: this researcher-driven initiative led to the Long Range Plan, the TASP program, and a national strategy for HPC advocacy, and underlies the focus and scope of this proposal.

In the summer of 2005, two of Canada's largest distributed computing environments were connected over a dedicated high-speed optical link. The new bridge between SHARCNET and WestGrid represents the first step towards a pan-Canadian network of HPC facilities. "CA*net 4 was built with applications like this in mind," said CANARIE President and CEO Andrew K. Bjerring. While the move does not yet fully integrate WestGrid's and SHARCNET's facilities into a unified computing "grid", the dedicated high-bandwidth connection means researchers working at member institutions can share and transmit massive amounts of data, with virtually no constraints on bandwidth.

³ Access Grid is an open-source suite that supports large-scale distributed meetings, collaborative work sessions, seminars, lectures, tutorials and training. See www.accessgrid.org.

Discussions are under way with CANARIE to extend this arrangement across the country to include all seven consortia.

Efforts to maximize the use and power of existing HPC resources in Canada are leading to innovative collaborative approaches. One interesting example of creating a world-class competitive facility in Canada is the CISS project (Canadian Inter-networked Scientific Supercomputer), led by Paul Lu (University of Alberta). Dr. Lu and his team have developed the Trellis software package, which allows computational resources to be shared using a minimal software infrastructure on a host system. To date, four CISS experiments have taken place. The most recent one, in 2005, had all the consortia contributing resources to the project. Over two days, more than 4,000 processors were used, allowing two researchers to get a total of 20 years of computing done. These kinds of initiatives illustrate how the consortia can work together and share resources to support research excellence.
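As a rough consistency check on the figure above (an approximation based only on the numbers quoted in the text, not on CISS accounting data):

$$ 4{,}000~\text{processors} \times 2~\text{days} = 8{,}000~\text{CPU-days} \approx 22~\text{CPU-years}, $$

in line with the roughly 20 years of computing reported, allowing for processors that were not busy the entire time.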

2.6 Resources Used To Their Full Potential [ 1f ]

Canada's HPC resources are currently over-subscribed and constantly being pushed, by the very research they support, to be faster, more powerful and more accessible. The CISS experiments illustrate this point, and demonstrate the need for access to computing resources well in excess of what any single consortium can obtain. Indeed, all the computational resources across the country are fully used, sometimes with long queues. For example, WestGrid's shared-memory computers have had periods of eight-day waits for access to the machine. Similarly, the WestGrid capacity cluster (1,680 processors) is 100% busy, with an average of 2,000 jobs waiting in the queue to run. The advent of regional consortia saw not only a giant leap forward in the power of available HPC resources, but at the same time an increase in inter-institutional migration of users seeking the architecture most appropriate to their needs. For instance, users needing a capacity cluster could use such a facility at a different institution, instead of running serial jobs on a relatively more expensive SMP located at their own institution.

Critical to the success of the HPC investment is the work of application analysts (see Sect. 5.6 below). These are trained scientists, many holding a PhD, whose role is to provide specialized technical assistance to researchers. They may spend a few days to a few weeks working on a particular code, looking for ways to optimize its performance, often by instrumenting the code and parallelizing it if needed. They also provide general training on HPC, bring new users up to speed, and so on. For instance, an RQCHP analyst at U. de Montréal parallelized a user's code with OpenMP, obtaining 90% of peak performance on 128 CPUs. Another RQCHP analyst at Sherbrooke parallelized and ran a quantum dynamics code on up to 800 CPUs. A CLUMEQ analyst at McGill has parallelized several computational fluid dynamics and computational aero-acoustics codes to nearly 90% efficiency on a 256-processor machine. A WestGrid analyst reorganized an application's data structures to increase its overall parallel efficiency by a factor of 10. An analyst at ACEnet increased the use of a 128-node cluster at Dalhousie: his proactive initiatives took the system from six users and roughly 60% CPU utilization to 23 active users and virtually 100% utilization. A group of analysts at RQCHP designed a set of high-level scripts, called bqTools, to allow users to submit hundreds or thousands of jobs on a cluster with a single command. bqTools is an interface between the user and the PBS queueing system that submits multiple copies of the same code with different input files, generated from a common template and the user's specification of the range of parameters to explore. This set of tools is crucial to making efficient use of the RQCHP clusters; a sketch of the general idea follows.
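The bqTools scripts themselves are not reproduced in this proposal; the minimal Python sketch below only illustrates the general pattern described above (template-driven generation of input files plus one qsub submission per parameter set). The file names, templates and parameter grid are hypothetical, and a real tool would add error handling, job arrays and richer resource options.

#!/usr/bin/env python
"""Minimal sketch of template-driven batch submission to a PBS queue.

This is NOT the RQCHP bqTools code; it only illustrates the idea described
in the text: expand a common input template over a user-specified range of
parameters and submit one job per parameter set with qsub.
"""
import itertools
import subprocess
from pathlib import Path
from string import Template

# Hypothetical input-file template: placeholders are filled for each job.
INPUT_TEMPLATE = Template("temperature = $temp\nfield = $field\n")

# Hypothetical PBS script template wrapping a serial executable.
PBS_TEMPLATE = Template(
    "#!/bin/bash\n"
    "#PBS -N sweep_$tag\n"
    "#PBS -l nodes=1:ppn=1,walltime=24:00:00\n"
    "cd $$PBS_O_WORKDIR\n"
    "./my_code input_$tag.dat > output_$tag.dat\n"
)

def submit_sweep(temps, fields, workdir="sweep"):
    """Generate one input file and one PBS script per (temp, field) pair,
    then submit each script with qsub (requires a PBS/Torque system)."""
    out = Path(workdir)
    out.mkdir(exist_ok=True)
    for temp, field in itertools.product(temps, fields):
        tag = f"T{temp}_B{field}"
        # Write the per-job input file from the common template.
        (out / f"input_{tag}.dat").write_text(
            INPUT_TEMPLATE.substitute(temp=temp, field=field))
        # Write the per-job PBS submission script.
        script = out / f"job_{tag}.pbs"
        script.write_text(PBS_TEMPLATE.substitute(tag=tag))
        # Hand the job to the scheduler.
        subprocess.run(["qsub", script.name], cwd=out, check=True)

if __name__ == "__main__":
    # Example: a 3 x 3 parameter grid yields nine independent jobs.
    submit_sweep(temps=[100, 200, 300], fields=[0, 1, 2])

In this pattern the user touches only the template and the parameter ranges; the scheduler sees ordinary independent serial jobs, which is exactly the capacity-cluster workload described above.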

2.7 Bringing Benefits to the Country [ 1g ]

The investments of CFI have resulted in world-competitive research that would not be possible without the requisite computing infrastructure. Section 3 and the Case Studies go into more detail about scientific outcomes past, present, and future.

In the June 1993 Top 500 list, academic systems comprised 29% of the entries (research systems 31% and industrial systems 31%). The current list (Nov. 2005) has academic systems comprising just 14% of the entries (research 24% and industry now 53%). More than anything, this reflects the increased role that HPC is playing in the commercial segment of the world's economy, and the increased importance that training in the use of HPC will have in preparing workers to contribute to that economy. Canada is far behind on the industrial HPC side. Whereas one can argue that having Canadian academic facilities on the Top 500 list at 10% of the United States count is appropriate (3 versus 33), the disparity on the industry side is astounding (1 versus 156 facilities). This shows that the benefits of HPC to Canadian industry are still in their early stages.⁴ There is huge potential for increased competitiveness as industry realizes the benefits of HPC, something that can only happen if the needed expertise, toolsets, experience, and HQP are available. The CFI investments in HPC have not yet had sufficient time to mature, that is, for students to graduate and use their skills to create new companies or enhance existing ones. The United States Top 500 numbers clearly show the significance of HPC to industry and, thus, the critical need for Canada to foster the development of HPC-related expertise.

This being said, there clearly is a significant HPC presence in the Canadian economy. Oil companies in Alberta use computing clusters to perform seismic analysis and determine the best drilling locations for oil. The quality of Environment Canada's daily forecasts has risen sharply, improving by roughly one day per decade thanks to research and advancements in HPC hardware and software (Environment Canada has a perennial Top 500 entry). Accurate forecasts translate into billions of dollars saved annually in agriculture and natural-disaster costs. In Québec, aerospace companies such as Bombardier and Pratt & Whitney are heavy users of HPC throughout their design processes. The rapid dissemination of the SARS genomic sequence through the Internet allowed groups all over the world to participate in the computational analysis of the genome and the 3D modeling of the protein sequences. Visualization systems allow medical doctors to see patient organ structures in 3D, foresters and geologists to visualize landscapes as if they were standing in them, and engineers to view and refine their designs before having to manufacture them. Many such examples are given in Section 3.

3 Success and Potential: HPC and Canadian Innovation

This section provides a sample of Canadian innovation using HPC, across a variety of scientific fields and in the context of past accomplishments, current needs and the benefits accrued to Canada. The discussion is supported by more specific case studies appearing later in the proposal. The following is not in any way an exhaustive description of HPC-based research in Canada, but rather a sampling. It is understood that the researchers named in this section will not have, on that basis, greater access to the infrastructure than the many more researchers who are not cited.

3.1 Elementary Particle Physics

This field seeks to unveil the fundamental constituents of matter and their interactions. The international ATLAS collaboration is undertaking one of the largest scientific experiments ever, by designing, building, and operating the giant ATLAS particle detector. ATLAS is located at the 27 km circumference Large Hadron Collider (LHC) at the European Centre for Particle Physics (CERN). This detector is expected to start acquiring data in 2007, after which a growing volume of data will fuel an unprecedented worldwide data analysis effort. At present, 33 Canadian particle physicists (PIs) at universities from BC to Québec are part of the ATLAS collaboration; R. Orr (SciNet) is the spokesman of the Canadian group, and M. Vetterli (WestGrid) is the co-ordinator of ATLAS computing in Canada.

⁴ Undoubtedly, the number of Canadian HPC sites is under-reported on the list. But then, that is true for all countries.

HPC needs for data storage and analysis will be satisfied by a worldwide infrastructure, the LHC Computing Grid (LCG), as explained in more detail in Case Study 7.1, p. 37. Canada will contribute a Tier-1 facility for large-scale, collaboration-wide computing tasks, made possible by a separate CFI grant. [ 2a,2b ] Tier-2 needs, i.e., continuing the analysis for the extraction of physics results by various analysis groups or individual researchers, will be met by the shared use of the National Platform. This makes the most efficient overall use of computing resources. The requirements for disk storage are substantial, and the analysis can be carried out on capacity clusters with standard interconnect. [ 1a ] Particle physicists have been running ATLAS simulation software, as well as developing grid computing tools, for a few years now, notably on equipment at WestGrid, SciNet, and RQCHP.

Many of the tools for distributed analysis have been tested and used in production for simulation and data analysis by the D0 and CDF experiments at Fermilab (Chicago), and by BaBar (Stanford). This work has used facilities in the existing consortia. Canada is also known for the Sudbury Neutrino Observatory (SNO), a large underground neutrino detector. The SNO collaboration is headed by A. McDonald (HPCVL). It uses facilities at HPCVL (data storage and detector simulation) and at WestGrid (simulation).

3.2 Astrophysics

Understanding the origin of the Universe, the formation of structures such as planets, stars and galaxies, and catastrophic events such as the formation of a black hole or the supernova explosion of a massive star are some of the key questions facing contemporary astrophysics. Canadian researchers are among the international leaders in all of these areas: astronomy and astrophysics in Canada ranks third in the world in terms of impact, behind only the US and the UK. HPC plays a pivotal role in all of these investigations. Broadly speaking, computational needs divide into data analysis on one hand, and modelling and simulation on the other.

The last decade has seen dramatic advances in the volume and quality of data from ground- and space-based observatories, and a huge rise in the effort to extract the key scientific results from these data. [ 1a,2a ] A prime example is the pioneering work of R. Bond (SciNet) to analyse data from satellites measuring the character of the Cosmic Microwave Background. Observations at other wavelengths are also increasingly demanding access to large-scale HPC for reduction and analysis; examples include the large-scale optical survey work of R. Carlberg (SciNet), D. Schade (WestGrid) and M. Hudson (SHARCNET).

The theoretical counterpart of this observational work is the numerical simulation of complex systems. The range of scales involved, several hundred billion in mass, drives a relentless requirement for the largest parallel systems available. [ 2a ] This is particularly true of simulations of large-scale cosmic structure, galaxies and star formation, as typified by the work of J. Navarro (WestGrid), U.-L. Pen (SciNet), H. Couchman (SHARCNET), J. Wadsley (SHARCNET) and H. Martel (CLUMEQ). [ 1a,b ] These researchers and other Canadian astrophysicists have also authored or co-authored several of the leading simulation codes now used worldwide, including Gasoline, HYDRA and Zeus. M. Choptuik (WestGrid) is one of the world's leading authorities on the numerical solution of the general relativistic problem of coalescing black holes. R. Deupree (ACEnet) has developed numerical methods for stellar hydrodynamics that can treat, for instance, close binary stars, also the object of L. Nelson's (RQCHP) research. P. Charbonneau (RQCHP) models solar activity through detailed magneto-hydrodynamic computations. [ 2a,2b ] Forthcoming experiments and observations drive an analysis requirement for serial farms with several thousand processors, whilst on the simulation side a great deal of effort has been invested over the last decade to develop and optimize parallel codes, and the necessity is for large capability clusters with high-performance (low latency and high bandwidth) interconnects. Case Study 7.2 (p. 38) provides more detail on specific needs and past accomplishments.

3.3 Chemistry and Biochemistry

Chemistry is one of the fields in which HPC is used by a substantial fraction of the community. This is due in part to the wide availability of high-performance electronic structure software, which allows one to predict the conformation and chemical function of molecules (so-called ab initio computations). [ 2a ] Canada counts many established groups whose expertise in this field is recognized worldwide (Boyd at ACEnet, Ziegler and Salahub at WestGrid). [ 4a ] Ziegler's work includes many practical applications to industrially important processes, in particular in the field of catalysis; his group is involved in collaborations with several industrial partners. Becke (HPCVL/ACEnet), Ernzerhof (RQCHP) and Ayers (SHARCNET) are internationally known for their development of new methods and theories in Density Functional Theory. Many other groups in Canada are well known for the application of these methods to molecules or solids (Côté (RQCHP); Stott, Zaremba, St-Amant, Woo (HPCVL); Tse, Wang (WestGrid)). J. Polanyi (Nobel Laureate in Chemistry, SciNet) applies ab initio calculations to the reactions of organic halides with surfaces that result in nano-patterning. [ 2a,2b ] Further understanding of this nano-patterning by geometrically-controlled chemical reactions will require large-scale parallel computing, since 50 to 100 atoms should be included in the simulation. Computational chemists using ab initio methods are, as a group, among the largest users of HPC resources in Canada. [ 1c ] Over the last few years, many new faculty members across Canada have been hired in this field (e.g. Bonev (ACEnet), Iftimie (RQCHP), Schreckenbach (WestGrid)), which reflects the strong activity in this area of research. [ 2b ] Ab initio calculations, depending on the size of the problem, may be handled on a single processor (though parametric studies require many instances, i.e., capacity clusters) or on distributed or shared memory machines. Case Study 7.3 (p. 40) reports on the specific needs associated with calculations of the electronic structures of complex molecules or solids. [ 2b ] Increasingly complex structures are being investigated (e.g. metal-organic frameworks) and HPC needs are evolving from capacity to capability computing.

A particularly HPC-intensive branch of physical chemistry is coherent control, which seeks ways to guide the quantum mechanical motion of electrons in molecules with the help of extremely short laser pulses (10^-15 seconds) in order to control chemical reactions at the molecular level. [ 1a ] Canadian chemists such as A. Bandrauk (RQCHP), P. Brumer (SciNet) and M. Shapiro (WestGrid) are internationally recognized leaders in this field and have been major HPC users on the Canadian scene. [ 4b ] As suggested by HPC simulations, extremely short and intense laser pulses may also be used to accelerate electrons to relativistic speeds, thus making possible the advent of table-top particle accelerators that could replace bulkier technologies in the context of nuclear medicine. [ 3f ] Coherent control is an excellent example of interdisciplinary research (chemistry, physics, photonics). [ 2a,2b ] The specific HPC needs of this field are cutting-edge: for instance, very large distributed memory with low-latency interconnect is used in order to solve the Schrödinger equation in the presence of a laser pulse in real time.
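Schematically, and in the dipole approximation (a standard textbook form, not a formula taken from the researchers' codes), the equation referred to above couples the field-free molecular Hamiltonian to the classical laser field E(t):

$$ i\hbar\,\frac{\partial}{\partial t}\,\Psi(\mathbf{r},t) \;=\; \Big[\hat{H}_0 \;-\; \boldsymbol{\mu}\cdot\mathbf{E}(t)\Big]\,\Psi(\mathbf{r},t), $$

where $\hat{H}_0$ is the field-free Hamiltonian and $\boldsymbol{\mu}$ the dipole operator; propagating $\Psi$ on a fine spatial grid at every time step is what drives the large distributed-memory requirement mentioned in the text.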
Even more powerful resources will be necessary to treat the Schrödinger and Maxwell equations on the same footing, i.e., to incorporate the quantum coherent nature of molecules and the classical coherent aspects of light. This increases the dimensionality of the system of partial differential equations to be solved, dramatically increasing the computational effort.

The problem of coherent control is a particular application of quantum dynamics. Another example is the study of quantum effects in the motion of nuclei and their implications for reaction rate constants, photo-dissociation cross sections and spectra, as studied by T. Carrington (RQCHP). These quantum effects must be treated in order to correctly understand many biological and combustion processes, and such calculations require solving huge systems of linear equations and computing eigenvalues of matrices whose size exceeds a million. [ 2b ] Large memory systems, such as SMP computers, are needed, and the required memory increases exponentially with the number of particles involved. For this reason, larger systems of molecules must be treated within the simpler, approximate framework of classical dynamics, sometimes with some quantum mechanical input (semi-classical dynamics).

R. Kapral (SciNet) has developed such methods to simulate proton/electron transfer reactions and to develop an understanding of how such reactions take place in complex chemical and biological environments. G. Peslherbe (RQCHP) develops and employs similar methods to investigate clusters of atoms and molecules, either as novel materials with tailored properties or as a tool to unveil the fundamental role of solvation in chemistry and biochemistry. The work of G. Patey (WestGrid) focuses on the theory and simulation of dense fluids and interfaces in the framework of statistical mechanics. The fundamental research conducted in Patey's group encompasses phenomena of immense practical importance for chemical as well as biological systems. [ 2b ] All of these problems involve the classical dynamics of thousands, or even millions, of particles (much like many problems in astrophysics) and, depending on scale, may be treated on capacity clusters with large memory or on capability clusters.

3.4 Nanoscience and Nanotechnology

Nanoscience deals with structures that can be fabricated, or have emerging properties, at the nanometer to micrometer scale. [ 3f ] It involves contributions from physics, chemistry and electrical engineering, and some efforts are models of multidisciplinary research. [ 4a ] Much activity in nanoscience and nanotechnology is motivated by the continuing drive towards miniaturization in the microelectronics industry, which needs new paradigms for nano-devices in order to continue: transistors based on small clusters of atoms or even single molecules (molecular electronics). At this level, an institution like the National Institute for Nanotechnology (NINT), based in Edmonton (WestGrid), plays an important leadership role. For instance, T. Chakraborty (U. Manitoba, WestGrid) is studying the interaction of electron spins in quantum dots measuring only a few nanometers in diameter, using very large matrices to calculate the probabilities of an electron spin jumping from one level to another. [ 1a,1b ] R. Wolkow (WestGrid) and collaborators have shown through experiment and simulation that the electrostatic field emanating from a fixed point charge regulates the conductivity of nearby molecules, thus showing the feasibility in principle of a single-molecule transistor, a discovery with huge potential impact in nano-electronics. Understanding how atomic-scale structure affects the electronic properties of materials on a larger scale is the general objective of L. Lewis (RQCHP), who leads a long-established group that resorts to a variety of computational schemes: ab initio calculations for small systems, and molecular dynamics for very large systems of several million atoms, simulated for very long times (billions of time-steps). [ 2b ] HPC needs are therefore varied, from capacity to capability clusters.

Research on quantum materials deals with more fundamental aspects that may have an important impact on nanotechnology. This field focuses on properties of materials that are essentially quantum mechanical (i.e. not accessible to classical approximations) but that go beyond the single molecule: for instance, understanding the mechanism of high-temperature superconductivity or the effect of impurities in metals (Case Study 7.5, p. 43). This class of problems (often referred to as strongly correlated electrons) has been the object of intense numerical effort in the last 15 years, in particular in the development of new algorithms.
It has become a major HPC consumer worldwide, both for capacity and capability architectures. Exotic superconductivity is the focus of the work of A.M. Tremblay and D. Sénéchal (RQCHP) with quantum cluster methods, of A. Paramekanti (SciNet) using variational wave-functions, and of G. Sawatzky (WestGrid) and Th. Devereaux (SHARCNET). E. Sorensen (SHARCNET) and I. Affleck (WestGrid) have made key contributions to the study of magnetic impurities, as have M. Gingras (SHARCNET) and B. Southern (WestGrid) on frustrated magnetism. [ 1a,1b ] Canadian researchers in this field have already performed world-class calculations (exact diagonalization of matrices occupying hundreds of GB of memory, capacity calculations with hundreds of CPUs over many days) that make them leaders in their field. [ 2a,2b ] Access to large capacity and capability resources is essential in order to maintain their competitive edge. Closely related to quantum materials are efforts at designing and simulating physical realizations of quantum computers in which quantum interference effects are controllable (A. Blais (RQCHP), F. Wilhelm (SHARCNET)).

3.5 Environmental Science

Climate modelling. [ 1g,4a,4b ] Future climate change as a result of human activities is an issue of great social, economic and political importance. Global climate modelling is key to both public policy and business strategy responses to global warming, the polar ice melt and long-term pollution trends. The same modelling methods are also being applied to better understand such high-profile environmental issues as water scarcity, water basin management, forest fires, ocean current and temperature changes that influence local climate and fisheries, and long-term trends in ozone levels. The policy debates around some of these issues are intense, driving strong demand for better scientific models and simulations.

Canada is a world leader in weather forecasting, climate modelling and prediction. This is driven both by the Meteorological Service of Canada (MSC) and by strong research programs in many universities. The work of R. Peltier (SciNet) on climate change is particularly relevant to the current debate on the impact of human activity on the global climate; [ 1a,2a ] Case Study 7.6 (p. 44) deals precisely with this question. Modelling of climate change at the regional (continental) scale is the focus of the Canadian Regional Climate Modelling and Diagnostics Network (CRCMD), led by CLUMEQ scientists such as R. Laprise and C. Jones. [ 1e,3f,4b ] High northern latitudes are the region of Earth expected to be most strongly affected by greenhouse-gas-induced global warming. Since the Canadian land-mass and adjacent shelves of the Arctic Ocean constitute a major portion of this region, the stability of northern ecosystems is an important national concern. The Polar Climate Stability Network (PCSN) brings together many Canadian researchers involved in both observation and computation, with the goal of assessing and predicting the effects of global warming on Canada's northern climate.

[ 1g,4a,4b ] Oceans play a vital role in climate change because of their ability to store and transport heat and carbon. The Labrador Current is responsible for the presence of large numbers of icebergs off the Atlantic Canadian coast that pose a hazard to shipping and the offshore oil and gas industry. The oceans also provide a significant fraction of the global food supply, an important sector of the Canadian economy. [ 1e,3f ] The Canadian effort in ocean modelling is spread over a large number of universities at WestGrid, SHARCNET, SciNet, CLUMEQ and ACEnet, as well as MSC and the Department of Fisheries and Oceans (DFO). Expertise exists over the whole range of space and time scales, from the role of the oceans in climate, to the role of mesoscale eddies in driving the ocean and tracer transport, to small-scale mixing processes. [ 2a,2b ] In particular, eddy currents, with length scales on the order of tens of kilometers, play a central role in the ocean circulation, transporting heat and carbon; yet ocean circulation models with sufficient resolution to adequately resolve eddies and important bathymetric features are computationally very demanding, and are currently beyond the reach of Canadian researchers. For instance, an eddy-resolving model of the North Atlantic with eight tracers, including biogeochemistry, and 45 vertical levels would require 84 days on a single SX-8 node (a vector computer) for one 20-year integration. [ 2a,2b,3f ] On a different level, Project Neptune
(based at U. Victoria, WestGrid) involves sea-floor-based interactive laboratories and remotely operated vehicles spread over a vast area. It will enable researchers to study processes previously beyond the capabilities of traditional oceanography. This project requires state-of-the-art communication and data storage infrastructure. Canadian researchers (WestGrid, HPCVL, CLUMEQ, ACEnet) also participate in the international ARGO project, which collects data from thousands of floats disseminated across the world's oceans and distributes it through a data grid centred in Toulouse.

[ 2a,2b ] The value of the models used in environmental science depends, of course, on their quality and accuracy, and hence on increasing HPC performance. One of the great difficulties is the existence of many characteristic time scales in the same problem. HPC needs in this field are met by fine-grained parallelism, which traditionally means parallel vector computers, although large capability clusters are increasingly used as well (by MSC, for instance).

Forestry. Canada has over 418 million hectares of forests, representing 10% of the forested land in the world. With this 10%, Canada is a global leader in several areas of forestry research, such as remote sensing in forestry, forest monitoring and carbon accounting. This research plays a pivotal role in the timely analysis of results for forest management, forest inventory, the forest industry, and public information. HPC helps manage and preserve Canadian forests in many ways, for instance through genomics research on forest pathologies: Laval University (CLUMEQ) researchers aim to develop methods and strategies for the biological control of diseases of trees and of microbial invaders of wood. The latter includes the use of genomic approaches for large-scale profiling of microbial populations that may harbour potential biocontrol agents. HPC needs in this sector are those of genomics in general: large capacity clusters. HPC resources are also crucial for monitoring the state of our forests. Forest researchers can now access an innovative, Canada-wide data-storage and management system for large-size digital images, a project initiated by the Canadian Forest Service and the University of Victoria (WestGrid). SAFORAH, the System of Agents for Forest Observation Research with Advanced Hierarchies, is a virtual networking infrastructure that catalogues, stores, distributes, manages, tracks, and protects the earth-observation images that the Canadian Forest Service retrieves from remote-sensing sources to measure such things as forest cover, forest health, and forest-carbon budgets. Beyond the essential storage and network resources, capacity clusters are also needed to analyse these digital images.
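As a concrete illustration of why capacity clusters fit this kind of workload, the sketch below processes many image tiles independently, the naturally parallel pattern that capacity clusters exploit. The data, the classification rule and the function names are hypothetical; SAFORAH's actual tools and formats are not described here.

```python
from multiprocessing import Pool
import random

def forest_cover(tile):
    """Fraction of pixel values flagged as forest in one image tile.
    Each tile is processed independently of all the others."""
    return sum(pixel > 0.5 for pixel in tile) / len(tile)

if __name__ == "__main__":
    # Stand-in for a batch of remote-sensing tiles (random data here).
    tiles = [[random.random() for _ in range(10_000)] for _ in range(64)]
    with Pool() as pool:                 # one worker per core
        cover = pool.map(forest_cover, tiles)
    print(f"mean forest cover: {sum(cover) / len(cover):.2%}")
```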

3.6 Energy [ 4a,4b ] The energy sector is arguably the most important component of the global economy from the strategic point of view, and Canada is a major energy producer and consumer. Offshore oil re- sources offer many benefits to the economy of Atlantic Canada, and there too HPC is playing an important role. For instance, T. Johansen (ACEnet) aims to develop new simulation methodologies and tools that will assist in marginal fields and deep-water applications. Enhancing the recovery rate of a single field like Hibernia by 2% could yield an additional 500 M$ in oil revenues. Be- yond this specific project, ACEnet institutions have formed the Pan-Atlantic Petroleum Systems Consortium (PPSC), to “develop world-class technology and skills, which not only meet the needs of the petroleum sector in Atlantic Canada but also position the region as a leading exporter of innovative products and services to marine and offshore markets worldwide”. Hydroelectricity also benefits from University-based HPC research. A. Soulaimani’s (CLUMEQ) finite element code, tailored for free surface flows, is used by Hydro-Qu´ebec for river engineering simulations. A. Skorek (CLUMEQ) is simulating electro-thermal constraints in power equipment (transformers, circuit breakers, cables etc.). The goal here is to minimize the thermal losses in such equipment, thus generating power savings. A. Gole (WestGrid) is developing parallel electromagnetic simulations for the design and operation of reliable power networks in co-operation with Manitoba Hydro. Canada also relies heavily on nuclear energy. S. Tavoularis (HPCVL) is conducting high resolution simulations of the thermal-hydraulic performance of the CANDU nuclear reactor core and pioneering simulations aimed at predicting the performance of Generation IV supercritical water nuclear reactors. Canada has made important contributions towards the use of hydrogen in the transportation sector, which may help reduce urban pollution and CO2 emissions. However, production and storage issues, as well as the absence of specific standards for hydrogen as a vehicle fuel, are widely regarded as obstacles to the introduction of hydrogen on the transportation energy market. Canadian researchers within the Auto21 project (CLUMEQ, SciNet, WestGrid and RQCHP) are de- veloping a scientific and engineering basis for determining standards and industry practices specific to hydrogen, focusing on refuelling stations and hydrogen-fuelled vehicles, taking into account the hydrogen storage technology (compression, sorption and liquefaction). Specifically they develop CFD models based on validated HPC simulations to estimate quantitative clearance distance cri- teria, emergency response and the location of leak detectors inside vehicles. Fuel cells are also a

promising technology involving hydrogen; A. Soldera (RQCHP) collaborates with General Motors, via an NSERC Collaborative Research and Development grant, on applying computational chemistry to the study of proton exchange membranes. In most of these examples, capability clusters are required.

3.7 Aerodynamics and Aerospace

The global aerospace industry is extremely competitive, with a constant push for leading-edge aircraft, jet engines, avionics, landing gear and flight simulators. The aerospace industry's challenge is not only to design the most efficient and safest product, but also to be on time, on budget, and within regulatory requirements. The Canadian aerospace industry, ranking fourth in the world, is a 23 B$/year industry of which two thirds are exports. It employs 80,000 people and involves all aspects of aircraft/rotorcraft/engine design, manufacture, simulation and operation. In Canada, the industry's strategy in accelerating its 1 B$/year R&D has quite visibly included a significant expansion of its computational resources. Through HPC, large-scale multi-disciplinary and multi-scale approaches are permitting the analysis and optimization of several limiting factors simultaneously, while new optimal design approaches are generating configurations that not only have significantly improved performance, but are indeed "optimal".

The Canadian aerospace industry has a tradition of strong university-industry interaction through large research grants and several NSERC Industrial and Canada Research Chairs. Many aerospace scientists and innovators in Canada continue to have a national and international scientific profile and impact, and to succeed in training a large number of highly qualified Masters and PhD graduates grounded in HPC who are quickly absorbed by Canadian industry. For example, W. Habashi, A. Fortin, S. Nadarajah, A. Soulaimani, J. Laforte (CLUMEQ), D. Zingg, J. Martins, C. Groth (SciNet), A. Pollard (HPCVL), J-Y. Trépanier, and M. Paraschivoiu (RQCHP) collectively develop models and mathematical algorithms to analyze and optimize large-scale multidisciplinary CFD problems that couple aerodynamics, conjugate heat transfer, ground and in-flight icing, acoustics, structures and combustion. A measure of their success is that many of their codes and by-products have been adopted and are in daily use by the major aerospace companies in Canada and, in the case of W. Habashi's codes, around the world.

In the last funding round, large-scale parallelism enabled Canadian aerospace researchers to tackle three-dimensional problems involving more than one discipline. In the current round, flow unsteadiness will become a major focus, given the significant increase in computing power. See Case Study 7.8 (p. 47) for more details.
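To illustrate the optimal-design idea in the simplest possible terms, the sketch below uses plain gradient descent to adjust two shape parameters of a drag-like objective, the basic pattern behind HPC-based aerodynamic shape optimization. The objective function and parameters are invented for illustration and bear no relation to the researchers' actual codes.

```python
def drag(x):
    """Toy 'drag' objective with a known minimum at (1.0, -0.5)."""
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2

def grad(x):
    """Analytic gradient of the toy objective."""
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)]

x = [0.0, 0.0]            # initial shape parameters
for step in range(200):   # plain gradient descent
    g = grad(x)
    x = [xi - 0.1 * gi for xi, gi in zip(x, g)]

print(f"optimized parameters: {x[0]:.3f}, {x[1]:.3f}, drag = {drag(x):.5f}")
```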

3.8 Industrial Processes [ 1g,4a,4b ] Canada is the world’s third most important aluminium producer (after China and Russia), with about 10% of the global output. Much of the Aluminium industry related research in Canada is conducted at the Aluminium Research Centre – REGAL (CLUMEQ and RQCHP). HPC plays an important role, for instance, in simulating electrolysis cells. A virtual cell is operating at [ 1d ] REGAL, for research and training. REGAL studies all limiting aspects of aluminium electrolysis reduction cells. This requires the solution – either sequentially or in a coupled manner – of problems involving magnetohydrodynamics, structure and conjugate heat transfer that necessitate large HPC resources. ALCAN and ALCOA are two important industrial partners of REGAL. K. Neale (RQCHP), from REGAL, uses HPC for in-depth modelling of metal alloy microstructure with application to key industries involved in metal fabrication and forming processes. His program is to establish a rigorous framework, consisting of original numerical tools and innovative experimental techniques that will become the platform to engineer new material systems for specific ranges of practical applications, particularly in the areas of civil engineering infrastructure and metal forming. The finite element code developed in his group requires a large distributed memory [ 2a,2b ] facility, particularly because of the move from two-dimensional to three-dimensional modelling.
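A rough calculation illustrates why the move from two- to three-dimensional finite element modelling pushes a problem out of a single machine's memory and onto a large distributed-memory facility. The grid sizes and number of stored fields below are invented for illustration and are not based on the group's actual code.

```python
def grid_memory_gb(points_per_side, dims, fields=8, bytes_per_value=8):
    """Memory needed to store `fields` double-precision values at every
    point of a uniform grid with `points_per_side` points per dimension."""
    n_points = points_per_side ** dims
    return n_points * fields * bytes_per_value / 1e9

for dims in (2, 3):
    gb = grid_memory_gb(2000, dims)
    print(f"{dims}-D grid, 2000 points per side: {gb:,.1f} GB")
```

The same resolution that fits comfortably in a desktop's memory in two dimensions requires hundreds of gigabytes in three, which is precisely the regime of the distributed-memory clusters requested here.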

Canada is also one of the world's leading pulp and paper producers. This industry has long-established roots in the country but needs a growing research sector (much of it based on HPC) to face international competition. F. Bertrand (RQCHP) uses HPC to simulate the consolidation of paper coating structures. More generally, his work on numerical models for the mixing of granular materials has applications in many sectors of the chemical industry. Again, capability clusters are needed. The Centre de recherche en pâtes et papiers of UQTR (CLUMEQ) deals, as part of its mission, with HPC simulations of pulp and paper processes.

3.9 Genomics and Proteomics

Sequencing the genome of living organisms (humans, plants, bacteria and viruses) has become a major industry that pervades all the life sciences. It helps scientists understand mechanisms for, or resistance to, disease at the DNA level, thereby opening the way to the conception of new vaccines, antibiotics, genetic tests, etc. The Canadian genomics community is large, influential, and well supported by Genome Canada and corresponding provincial/regional agencies. The HapMap project (200+ researchers worldwide, Nature cover story, 27 Oct. 2005) is a major Canadian initiative put forward by T. Hudson (CLUMEQ). Its objective is to chart the patterns of genetic variation that are common in the world's population. The results obtained so far provide convincing evidence that variation in the human genome is organized into local neighbourhoods, called haplotypes. Genomics research relies heavily on HPC as it involves matching together a very large number of overlapping puzzle pieces (parts of the genome) into a coherent whole. This requires computerized search and pattern matching, as well as access to large databases. Such databases must be explored with machine learning algorithms, such as the ones developed by T. Hughes and collaborators (SciNet). In general, research in genomics requires large data storage and access to capacity computing. See Case Study 7.9 (p. 49) for more detail.

Genomics is an area where technology transfer is accelerated by the use of HPC, as is the case with the Montreal-based company Genizon. Previous attempts at disease gene identification using family linkage analysis, while very successful at identifying genes for simple monogenic human diseases, have generally failed when applied to complex diseases characterized by the interaction of multiple genes and the environment. However, a more powerful gene-mapping approach, called Whole Genome Association Studies (WGAS), has recently become possible. The systematic examination of hundreds of thousands of markers for statistical association with disease requires massive computational resources, and the analysis workload and HPC requirements increase as researchers zero in on specific genes and attempt to replicate results in different populations. Genizon, in association with CLUMEQ, has recently analysed data from five WGAS and will complete an additional three studies in the near future. Already, Genizon has pinpointed up to 12 of the genes that cause Crohn's disease, an affliction of the bowel, compared with the two genes that were previously known. Many such studies are in the works within this effort.
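The "overlapping puzzle pieces" remark above can be made concrete with a small sketch that finds the longest suffix-prefix overlap between two sequence reads, the elementary operation repeated enormous numbers of times in genome assembly. It is purely illustrative; real assemblers use far more sophisticated indexing and error handling.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that matches a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

# Three toy reads that chain together via 4-base overlaps.
reads = ["ACGTTGCA", "TGCAGGTA", "GGTACCCT"]
for left, right in zip(reads, reads[1:]):
    print(left, right, "overlap =", overlap(left, right))
```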
As another example, the Canadian potato genome project (led by ACEnet) aims at relating genomics information to the potato's practical traits, especially those related to disease resistance and suitability for value-added processing. The potato crop is one of the world's four most important, and the tuber is eaten by over a billion people daily. In Atlantic Canada alone, more than 1,000 potato farms employ more than 10,000 people.

Proteomics is a logical follow-up of genomics; it studies how amino acid sequences, generated from a piece of genetic code, fold into a three-dimensional structure by the simple action of inter-atomic forces and, occasionally, the help of other proteins. This is just one step away from predicting the chemical and biological activity of the protein, which is in turn one step from understanding the results of a genetic modification. There are about 100,000 proteins encoded in the human genome, and structural information about these proteins is critical for understanding how they work, with applications in drug development, cell differentiation, and protein production and regulation. HPC plays a dual role in protein science: (1) it is essential in analysing the data from X-ray diffraction experiments that

allow scientists to observe the 3D structure of proteins, such as those conducted at the Canadian Light Source facility in Saskatchewan, and (2) it is used to predict the shape of proteins from the amino acid sequence, which is already a major field of research in HPC. Many Canadian researchers study the structure and interaction of proteins using molecular dynamics simulations. For instance, J.Z.Y. Chen (SHARCNET) investigates the folding properties of prions, such as those causing BSE (Mad Cow Disease), at the molecular level and attempts to reconstruct the process by which prions mysteriously convert themselves from a normal form into a disease-causing structure. P. Lagüe (CLUMEQ) carries out simulations of cell membrane fusion and fission. The HPC needs in proteomics are enormous: folding a 60-residue protein using the ART algorithm developed by N. Mousseau (RQCHP) takes 4 weeks on a single-processor machine, yet tens of folding trajectories must be generated to describe the statistics of the folding mechanism. Again, Case Study 7.9 refers to these needs.

3.10 Medical research

Cardiac Research. Heart disease is the most important cause of death in the western world. With major improvements in the treatment of acute cardiac infarction in place, further reduction of mortality depends more and more on our understanding of the amazingly complex cellular processes that govern the electrical activation of the heart, and their relation to clinically measurable signals such as the electrocardiogram. HPC is crucial because macroscopic tissue modelling requires simulation at several millions of points.

Canada has a strong position, with several researchers who are recognized worldwide for their work in large-scale simulations of the heart, both in model development and in clinical applications. Interactions between these groups and with clinical and experimental investigators are intensive, guaranteeing a high practical impact of their work. The group of S. Nattel (RQCHP) works on atrial fibrillation, a condition that affects about 10% of people over 70 years of age and can have life-threatening complications, such as stroke and heart failure. To learn more about the cellular and molecular mechanisms behind atrial fibrillation, this group has used computer models of the electrical activity in the heart's upper chambers and conducts simulations to see if new or existing drugs might be beneficial. J. Leon (ACEnet) and E. Vigmond (WestGrid) use simulations to explore ways of improving implantable defibrillators, small devices implanted under the skin that contain a sophisticated microprocessor and a battery. The device constantly monitors the heart's rhythm, so that when a cardiac patient suffers sudden cardiac arrest, the defibrillator delivers a strong electrical shock to start the heart pumping again. Implantable defibrillators have become indispensable in patient care and monitoring. M. Potse and A. Vinet (RQCHP) use one of the world's largest and most realistic heart models to make better sense of surface electrocardiograms, i.e., to better understand the relation between the clinical data and heart disease at the cellular level. The problem requires solving systems of tens of millions of coupled differential equations, which is best accomplished on large-memory SMP computers. R. Ethier (SciNet) and R. Mongrain (CLUMEQ) study blood flow through the heart and arteries using hydrodynamics.
Current work focuses on blood flow through heart valves, where transitional and turbulent flow is present. Modelling of such flows to improve heart valve and stent design is [ 4b ] clinically important. Understanding blood flow is also a crucial component in our interpretation of magnetic resonance scans. Imaging. Magnetic resonance (MR), CAT and even PET scans are now commonplace in the lives of Canadians. These weakly intrusive techniques have dramatically improved diagnosis quality over the last two decades and correspondingly increased the quality of life. All of these techniques have a strong computational component, and research towards improving these techniques (better resolution, better models, etc.) relies on HPC. In neuroscience, as an example, every step of an [ 1g,4b ] MR brain scan relies on fast and effective computing, which produces three-dimensional images that may then be stored. Just as demanding are the data mining aspects of imaging. Thousands of

brain scans may be needed to determine the appropriate treatment. Archived brain scans contain a wealth of information that doctors and researchers may mine to analyse and understand brain function and disease. Brain research in Canada (e.g. A. Evans (CLUMEQ), M. Henkelman (SciNet), T. Peters (Robarts, SHARCNET)) is increasingly making use of large brain image databases, often constructed from research sites around the world, to understand brain disease and treatment. HPC needs are centered on storage and data transfer (tens of terabytes are needed), as well as capacity computing. See Case Study 7.10 (p. 50) for more detail.

Radiotherapy. Knowing the effect of ionizing radiation on human tissues in general, and genetic material in particular, is highly relevant to cancer treatment by radiotherapy and to radio-protection. J.-P. Jay-Gérin (RQCHP) simulates the penetration of fast electrons in aqueous solutions or cells. One of his objectives is to test a hypothesis according to which biochemical events in the cellular membrane or mitochondria trigger the cell's response to low doses of radiation. These simulations, conducted on a capacity cluster, are being moved towards a capability cluster where larger problems can be addressed. On the diagnosis side, S. Pistorius (WestGrid) uses multidimensional image processing to register, segment, and quantify the types and locations of tumours for patient treatment at CancerCare Manitoba.

Pharmacology. The design of new molecules from ab initio calculations (the expression in silico is used) is now a standard procedure in the pharmaceutical industry. The pharmaceutical sector is an important component of the economies of developed countries, and of Canada in particular. Quantum chemistry and molecular dynamics simulations using HPC allow testing of the functionalities of new molecules before engaging in the long process of organic synthesis. Computer-assisted molecular design (CAMD) is used not only within drug companies, but also at many research universities, where specific drug research is conducted and future drug designers learn their trade. For instance, D. Weaver's group (ACEnet) is focusing on the design and synthesis of novel drugs for the treatment of chronic neurological disorders, such as epilepsy and Alzheimer's dementia. Attempts are made to correlate basic science with clinical science, thereby enabling a "bench-top to bedside" philosophy in drug design. An example of a Canadian success story is Neurochem, a pharmaceutical company located in Québec that has long been heavily involved in HPC developments. Originally spun out of Queen's University, it has now grown into a mid-sized drug company with more than 200 employees. With promising drug candidates currently in clinical trials and expanding drug development programs for other research projects, Neurochem is a leader in the development of innovative therapeutics for neurological diseases. In another project, J. Wright (HPCVL) develops and tests synthetic antioxidants that may someday help slow down the aging process and promote greater health in our senior years.
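The radiotherapy simulations described above are Monte Carlo calculations of particle transport. The sketch below is a toy one-dimensional illustration of that pattern, with invented parameters and no relation to the researchers' actual physics: many independent random trajectories whose statistics only become reliable when huge numbers of them are generated, which is what drives the need for capacity and capability clusters.

```python
import random

def penetration_depth(mean_free_path=1.0, survival=0.7):
    """Depth reached by one particle taking exponentially distributed
    steps until it is absorbed (toy 1-D model, invented parameters)."""
    depth = 0.0
    while random.random() < survival:          # particle survives this step
        depth += random.expovariate(1.0 / mean_free_path)
    return depth

random.seed(1)
n = 100_000                                     # independent histories
depths = [penetration_depth() for _ in range(n)]
print(f"mean penetration depth over {n} histories: {sum(depths) / n:.3f}")
```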

3.11 Social Sciences and Humanities Canadian HPC is also reaching out to non-traditional areas. For instance, in psychology, D. Mewhort (HPCVL) works on the concept of human performance and how to model it. Current work centres on computational models for perception and memory with an emphasis on decision and choice in speeded-performance tasks. Projects include studies of semantic acquisition and representation, recognition memory, reading and lexical access, pattern recognition, attention, etc. The computational aspects of this research are carried out on capability clusters. HPC research in [ 2b ] the humanities is by and large based on text gathering and analysis. The TaPoR group (WestGrid, SHARCNET, SciNet, RQCHP, ACEnet) develops digital collections and innovative techniques for analyzing literary and linguistic texts (see Case study 7.11, p. 52). HPC plays an important role in econometric model estimation in areas such as labour eco- nomics, transportation, telecommunication and finance. In most cases, model estimation involves dealing with huge micro-level databases that capture information at the level of individuals, house- holds or firms (this is known as micro-econometrics). Although a wide range of topics is covered,

– 19 – the micro-econometric models that are formulated all share commonalities. They are usually for- mulated at a level of sophistication that requires one to depict or to mimic decisions at the micro level. Such a high level of refinement translates, most of the time, into a prohibitive level of computational complexity, which is attributable to the computation of multidimensional integrals. Theoretical work on simulation has led to accurate simulators that can reliably replace those integrals with easy to compute functions. Such simulations usually run on capability clusters. B. Fortin (CLUMEQ) leads a group that uses huge micro databases and sophisticated econometric models to analyse the impact of social policies. D. Bolduc and L. Khalaf (CLUMEQ) work on a new generation of discrete choice models that involves the presence of latent psychometric variables in the econometric models. 3.12 Information Technology HPC by itself can be the object of research with important HPC needs. Within this area, the field with arguably the most impact on the way HPC resources are used is grid computing. Today, grids enable a number of large scale collaborative research projects that would not have been possible a decade ago. The major issues addressed by grid research include security mechanisms to enable controlled co-operation across institutional boundaries, creation of open standards for interoperability and scalability issues faced by automation tools. Grid researchers do not have large computational needs over those of the application scientists they work with, but do need priority, sometimes exclusive, access to resources to test new solutions and assistance from site administrators when deploying new software systems. It is also important to have access to a range of systems of different architectures so that interoperability issues can be fully explored. Past and present successes of Grid research in Canada include: (1) The Trellis Project (P. Lu, WestGrid) has [ 1a–f ] created a software system that allows a set of jobs to be load balanced across multiple HPC systems, while also allowing the jobs to access their data via a distributed file system. Along with many partners, it performed a series of on-line, Canada-wide, and production-oriented experiments in 2002 to 2004: The Canadian Internetworked Scientific Supercomputer (CISS). For instance, from September 15 to 17, 2004, in a 48-hour production run, CISS-3 completed over 15 CPU-years of computation, using over 4,100 processors, from 19 universities and institutions, representing 22 administrative domains. At its peak, over 4,000 jobs were running concurrently. (2) High-energy [ 1a–f ] physicists at WestGrid and computer scientists from NRC have established a computational grid (GridX1) using clusters across Canada, as a testbed for ATLAS. (3) One part of the Canadian [ 1g,4b ] DataGrid project involves managing sensor data collected on the Confederation Bridge that links Prince Edward Island to New Brunswick. Data collected at the bridge are registered and replicated at WestGrid and HPCVL and are used to predict critical structural problems that could be caused by ice flows. This predictive information is used to trigger preventative ice engineering measures and to suggest times that the bridge should be closed. (4) The Grid Research Centre (GRC) based at the University of Calgary collaborates on projects with researchers in academia and industry in Canada, the US and Europe. 
In a joint project with HP Labs, automatically deployable model based interoperable monitoring tools are being created. These tools are already deployed and used on WestGrid systems. Data mining is about extracting the full value from massive datasets that could not, in many cases, be attempted without HPC and sophisticated tools. The best-known examples are the Inter- net search engines, notably Google, which use data-mining techniques (and huge HPC resources) to rank web pages, images, etc. in order of relevance. Research on data mining algorithms also requires large HPC resources (large capacity clusters, and now capability clusters as well). For instance, J. Bengio (RQCHP) works on Statistical Learning Algorithms (SLA). These allow a com- puter to learn from examples, to extract relevant information from a large pool of data and, for instance, to estimate the probability of a future event from a new context. Recent progress in this [ 1a, 4a, 4b ] field has led to many scientific and commercial applications. Bengio’s laboratory receives contri- butions and contracts from business partners that benefit from his research: insurance companies, pharmaceuticals, telecommunication services, Internet radio, financial risk managers, and so on.
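The sketch below gives the flavour of the statistical learning algorithms described above: a minimal logistic-regression model trained by gradient steps on invented data. The actual algorithms developed in these laboratories are far more sophisticated; the point is simply that the program estimates the probability of an event from examples rather than from hand-written rules, and training on realistic data volumes is what drives the demand for large clusters.

```python
import math
import random

def train_logistic(data, epochs=200, lr=0.1):
    """Fit w, b so that sigmoid(w*x + b) approximates P(label = 1 | x)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, label in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))   # predicted probability
            w += lr * (label - p) * x                  # gradient step
            b += lr * (label - p)
    return w, b

random.seed(0)
# Invented one-feature data set: larger x makes the event more likely.
data = [(x, 1 if x + random.gauss(0, 0.5) > 1.0 else 0)
        for x in [random.uniform(0, 2) for _ in range(500)]]
w, b = train_logistic(data)
p = 1.0 / (1.0 + math.exp(-(w * 1.5 + b)))
print(f"estimated probability of the event at x = 1.5: {p:.2f}")
```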

Data mining is one aspect of the more general field of Artificial Intelligence (AI). One of the early objectives of AI research was to demonstrate the technology by building programs capable of defeating the best human players at games of skill. The first such demonstration was J. Schaeffer's (WestGrid) work that resulted in a computer winning the World Man-Machine Checkers Championship in 1994 (the first program in any game to win a human title). This result depended heavily on extensive use of HPC resources (hundreds of processors, primarily located in the United States due to a lack of available Canadian resources at that time). More recently, Schaeffer's group has used HPC resources (large shared-memory systems) to build a strong poker-playing program, developing new techniques for automated reasoning in domains with incomplete and imperfect information. Navigating the vicissitudes of the real world requires the ability to reason when only partial information is available, and a healthy skepticism about the veracity of the information that is available; poker is thus a much better test-bed for exploring these ideas than games like chess. The research has implications in many areas, especially negotiations and auctions. The poker research has been commercialized by a Canadian company (www.poker-academy.com).

There are a number of leading research groups in visualization across Canada, at WestGrid (the IMAGER, SCIRF, IRMACS, AMMI, Graphics Jungle and IMG groups), SciNet (the DGP group) and RQCHP (the LIGUM group). These groups play a significant role in the international graphics and visualization community. This leadership is exemplified by Tamara Munzner (WestGrid) being one of the co-authors of the US NIH/NSF Visualization Research Challenges report (2005). Case Study 7.12 (p. 54) provides more detailed information about specific project needs.

Numerical methods using HPC also arise in the fields of transportation and network planning. Using operations research algorithms, T.G. Crainic, M. Gendreau and B. Gendron (RQCHP), for instance, are solving, often in real time, large-scale problems related to traffic control, be it traffic of people, vehicles or data. B. Jaumard (RQCHP) studies the problem of real-time design and maintenance of a communication path going through several satellites in motion. These are examples of large-scale combinatorial optimization problems that must often be solved in real time, usually with the help of a capability cluster.

3.13 Mathematics

While advanced mathematics is involved in almost all the research described above, mathematicians are more and more using computation as a central tool of pure mathematical discovery. J. Borwein (ACEnet) uses advanced combinatorial optimization to hunt for patterns in very high-precision floating-point data. This floating-point computation is very intensive, and a single run typically involves more than 5,000 CPU-hours. His two 2004 books on Experimental Mathematics (with D. Bailey, Lawrence Berkeley Labs) have already been cited in the National Academy of Science report Getting Up to Speed: The Future of Supercomputing (June 2004). P. Borwein and M. Monagan (WestGrid) have worked with MapleSoft and MITACS to successfully integrate such tools into Maple, the premier (Canadian) computer algebra package, along with F. Bergeron (CLUMEQ) and R.
Corless (SHARCNET), who are also deeply involved in such co-development, with more emphasis on dynamical systems and special function evaluation. N. Bergeron (SHARCNET) applies parallel computer algebra methods in combinatorics, while Kotsireas (SHARCNET) has had great success implementing Groebner basis techniques to solve non-linear equations.

Financial mathematics has assumed critical importance in the Canadian financial industry. At the heart of modern-day financial mathematics is the modelling of the pricing and hedging of option contracts. The central importance of option and derivatives pricing stems from the pivotal role of options in the daily risk management practices and trading strategies employed by leading financial institutions. P. Boyle (SHARCNET) is internationally known for his broad contributions to computational finance. At CLUMEQ, K. Jacobs and P. Christoffersen are well known for their work on option valuation and on fixed-income products such as bonds. It is worth mentioning that

financial mathematics has attracted many young theoretical physicists and mathematicians who have been trained to think quantitatively and to critically use or design approximate models. This is one example of the impact of highly qualified personnel (HQP) trained in a fundamental science and contributing to the Canadian economy in a priori unsuspected ways. T. Hurd (SHARCNET) is an example of the above in the research/academic sector.
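As a concrete, textbook-level illustration of the option-pricing computations mentioned above, the sketch below estimates the value of a European call option by Monte Carlo under the standard Black-Scholes model, with invented parameters; it is not the pricing model of any researcher named here. Realistic portfolios repeat this kind of calculation across thousands of instruments and risk scenarios, which is where HPC enters.

```python
import math
import random

def mc_european_call(s0, strike, rate, vol, maturity, n_paths=200_000):
    """Monte Carlo price of a European call under Black-Scholes dynamics."""
    payoff_sum = 0.0
    for _ in range(n_paths):
        z = random.gauss(0.0, 1.0)
        st = s0 * math.exp((rate - 0.5 * vol ** 2) * maturity
                           + vol * math.sqrt(maturity) * z)
        payoff_sum += max(st - strike, 0.0)
    return math.exp(-rate * maturity) * payoff_sum / n_paths

random.seed(42)
# Invented parameters: spot 100, strike 105, 5% rate, 20% volatility, 1 year.
print(f"estimated call price: {mc_european_call(100, 105, 0.05, 0.2, 1.0):.2f}")
```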

4 A National Strategy for HPC

4.1 The National Vision

The current Canadian HPC model, described in Sect. 2, consists of seven regional consortia providing shared resources within their borders, with limited sharing with the rest of the country. This proposal is for a new model in which the regional consortia first select together, and then manage locally, a distributed infrastructure that is shared globally across Canada, within a collaborative national entity called Compute Canada. Compute Canada, in which researchers, consortia and university administrators will have representation, will ensure that the proposed platform effectively functions as a national infrastructure, and will establish procedures such that the current NPF round is only the first of a series that will provide Canada with a sustained world-class HPC infrastructure, together with the personnel needed to operate it efficiently.

Thus, Canadian researchers needing HPC will have access to a variety of architectures and will be able to select, for a particular scientific task, the one most appropriate to their needs. The proposed model for national sharing will serve the needs of researchers more effectively, and in a more comprehensive way, than the regional model brought about in the past through the individual consortia. The installed infrastructure will be used even more efficiently than before because of the larger pool of potential users and load balancing between different facilities at the national level.

4.2 National Sharing of Resources: Principles and implementation

The heart of the Compute Canada proposal must therefore be the principles intended to govern national resource sharing and the mechanisms proposed to manage the collectively acquired HPC infrastructure. In designing a strategy there are clearly a series of balances that need to be optimized: between the mix of architectures, between the degree of local as opposed to national resource allocation, between provincial and individual institutional jurisdictions, etc.

In previous CFI competitions, each consortium was requested to commit a minimum of 20% of its resources for use by other Canadian scholars not affiliated with that consortium. The current vision goes well beyond this in making all Compute Canada resources, past and future (i.e. not only those acquired with support from the current NPF but also those previously funded by CFI at the consortia), accessible to all Canadian researchers at publicly funded academic and research institutions. National sharing of Compute Canada resources across consortia is the sine qua non of efficient operation. Research needs that cannot be fulfilled by local facilities, in terms of machine type or cycle availability, will be addressed by application to the facilities at other locations. In order to maximize the use of the resources, it is proposed to operate under the following set of guidelines.

1. Each consortium will be responsible for providing access to, and user support for, the local computational resources.

2. All users will obtain a local default allocation of resources, irrespective of their affiliation within Canada. Each consortium shall provide a single point of access for requests for its resources from any researcher in Canada. Resource allocations beyond this default level, i.e., special or extraordinary requests, will be set by local and national resource allocation committees (see below).

3. A Local Resource Allocation Committee (LRAC), attached to each consortium, will treat special requests from researchers belonging to that consortium.

4. Special requests that cannot be accommodated within the consortium, either because of over-subscription of resources or for lack of the appropriate architecture, shall be promptly channelled to a National Resource Allocation Committee (NRAC).

5. Non-local users (from other consortia) who have been granted access through the procedures outlined above will have the same privileges and responsibilities as local users.

6. The priority attached to the use of resources by individual researchers will be set by fair-share principles. The idea behind fair-share is to increase the scheduling priority of user groups who are below their fair-share target, and to decrease the priority of user groups who are above it. In general, this means that when a group has recently used a large number of processors, its waiting jobs will have a lower priority. All users get a default fair-share allocation. Projects requiring more resources can apply to the local or national resource allocation committee for an increased fair-share allocation. (A small sketch of this priority calculation follows this list.)

7. Software acquired with Compute Canada funding and installed on specific systems shall also be made available to non-local users, subject to licensing constraints.

8. Information concerning Compute Canada resources and application procedures will be made available on a national web site, in both official languages, and will be linked to consortia web sites.

9. The current intent is that there will be no differential fees to limit access to or sharing of these resources. This will be possible only if sufficient funding is provided to cover operating costs.
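The sketch below illustrates one simple way a scheduler can turn fair-share targets and recent usage into a priority factor. The exponential formulation and the group data are only illustrative assumptions; the actual policies will be set by the consortia and the allocation committees.

```python
def fairshare_priority(recent_usage, target_share):
    """Priority factor in (0, 1]: equals 0.5 when a group has used exactly
    its target share recently, rises toward 1 when it is under-served, and
    decays toward 0 as its usage exceeds the target."""
    return 2.0 ** (-recent_usage / target_share)

# Hypothetical groups: (fraction of cycles used recently, fair-share target).
groups = {"group_a": (0.05, 0.10),   # under its target: boosted
          "group_b": (0.10, 0.10),   # exactly at target
          "group_c": (0.30, 0.10)}   # well over target: penalized
for name, (used, target) in groups.items():
    print(f"{name}: priority factor {fairshare_priority(used, target):.2f}")
```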

[Organizational chart: a Board of Directors of up to 26 members, comprising an Executive Committee (8 regional representatives*), VP(R) representatives (8 members*) and up to 10 other representatives (users, industry, CANARIE, etc.); reporting or advisory bodies include the Scientific Advisory Committee (SAC, 4 members), the National Resource Allocation Committee (NRAC) and staff. *Regional representation: 1 from Atlantic Canada, 2 from Québec, 3 from Ontario, 2 from Western Canada.]

Figure 4.1: The Compute Canada Council

4.3 Governance Model

The challenge of devising a structure capable of effectively managing the shared National Platform resource is to balance the operational effectiveness of local control with the need to ensure and promote national accessibility. The national structure proposed here will be a construct of the Canadian university community, governed by a national Compute Canada Council that is composed of a number of distinct elements, as follows.

1. A Compute Canada Council (CCC) will be formed to oversee the access and sharing of the NPF-funded resources in order to ensure fairness and efficiency. This Council will be comprised of up to 26 members, as follows: an Executive Committee (EC) comprised of 8 regional representatives (1 appointed by the consortium in Atlantic Canada (ACEnet), 2 appointed from

the consortia from Quebec (RQCHP and CLUMEQ), 3 appointed by the consortia from Ontario (SHARCNET, HPCVL and SciNet), and 2 appointed by the consortium from Western Canada (WestGrid)); this group would play the role of the present National Initiatives Committee in future rounds of the NPF process. A National Steering Committee (NSC) comprised of representatives of the Vice-Presidents Research of the lead institutions of each of the existing HPC consortia, with the regional distribution described above; it is envisioned that this group would play the same overseeing and dispute-resolution role in future rounds of the NPF process as it has played so well in the current round. A users' group comprised of as many as 10 others representing academic, industrial, government and other users of HPC resources; a Nominating Committee within Compute Canada will select these representatives, who will likely be invited to serve on a rotating basis. An organizational chart for the proposed Compute Canada Council is provided in Fig. 4.1, which displays not only the three primary elements described above that will make up the Board of Directors (BoD), but also the additional elements of the proposed structure that will report to, or be advisory to, the BoD. These additional elements are:

2. The Executive Committee, whose role will be to oversee and co-ordinate, between Board meetings, the national aspects of Compute Canada's operations. This includes: (1) making sure that the above sharing principles are applied effectively, which can be done by periodically reviewing statistics concerning access requests for, and usage of, the installed facilities; (2) co-ordinating actions by consortia and local staff to implement a national vision: construction and co-ordination of web sites, a knowledge base for user support across Canada, national workshops and conferences, etc.; and (3) reporting to the Canadian HPC community on the activities of Compute Canada.

3. A National Resource Allocation Committee (NRAC), whose role is to decide on large resource allocations referred to it by the LRACs, in order to rapidly and efficiently enable research projects requiring large HPC resources not available locally. The NRAC will consist of at least one representative from each consortium and will strive for balance across disciplines.

4. A Scientific Advisory Committee, comprised of distinguished advisors on HPC from outside the Canadian HPC community, will annually review the performance of Compute Canada and serve as a standing sounding board for the BoD.

4.4 Application Context

In assessing this proposal, it is important to recognize several constraints that strongly influence the implementation of a national vision.

1. The CFI NPF was formally discussed in October 2005, with a call for proposals only being finalized in February 2006.

2. CFI provides at most 40% of the funds. An equivalent share comes from the provinces of Canada and at least 20% from industry (mostly in the form of vendor discounts, beyond the academic discount). Most Canadian provinces insist that their investments in CFI-funded infrastructure must be spent in that province. Consequently, the HPC infrastructure requested in this proposal is distributed across the country.
3. Within a consortium, major HPC infrastructure may have to be distributed across several institutions, to strengthen the integration of those institutions into the global collaboration and to benefit from the human and physical infrastructure already in place. Ideally, given the national (CANARIE) and regional high-speed networks, the location of the resources should not matter; in practice it does. We view this as a strength of the proposal, since it enforces a distributed solution, giving it robustness, limiting single points of failure, maximizing provincial and institutional leverage and bringing support staff closer to more users.

4. We have conducted an extensive consultation with the Canadian academic research community. The four consortia expecting major funding have carried out extensive surveys of their HPC communities. For example, more than 200 PIs responded to WestGrid's internal web poll; RQCHP

has established a list of about 150 PIs following its survey; CLUMEQ has produced a 108-page assessment of its user base and needs; and SciNet has produced a detailed document outlining the research excellence that the HPC infrastructure is essential in supporting. We have attempted to be as thorough as possible in reaching out to the entire Canadian research community. We strongly feel that the infrastructure proposed in this document has sufficient variety and flexibility to be beneficial to virtually all Canadian researchers.

5. This proposal should be viewed as a major evolutionary step in creating a truly national shared HPC infrastructure. The research community will continue to work to refine and improve the vision.

4.5 The proposed infrastructure Use of funds in the first NPF round. The LRP called for public investments of 76 M$/year in HPC infrastructure and personnel, including a high-end facility, near the top of the top500 list. The current level of NPF funding, and the timing, do not allow for such a high-end facility, nor for funding a national organization as proposed in the LRP. The NPF funding is more commensurate with the consortia-level funding proposed in the LRP. Moreover, the passage from consortium- based funding to the NPF model cannot be abrupt: the structures in place (the consortia) already operate with a high degree of efficiency and they must be maintained at least for a few years. In addition, Canadian provinces provide matching funds for CFI’s contribution and therefore a regional, if not provincial, distribution of the infrastructure is still a necessity. Thus, the NPF will fund infrastructure still geographically located and managed by the cur- rent consortia, but based on agreement on architectures and distributions that facilitate full and effective sharing of resources between consortia, as described above. Three of the seven consortia – SHARCNET, HPCVL and ACEnet – have been awarded major funding in 2004 and are currently in a deployment phase (years 2006 to 2007). They agreed that the bulk of the funding of this first NPF round will be channelled through the other four (WestGrid, SciNet, RQCHP and CLUMEQ), at levels that roughly reflect both the size of the institutions involved and the depreciation of their current infrastructure. In particular, RQCHP agreed to a smaller allocation in this application than its historical funding would dictate, given that it is in the process of acquiring the balance of its infrastructure. An analysis of the depreciated installed equipment at all consortia (not shown here) has indicated that the distribution of funds the consortia agreed upon (Table 4.1) is equitable in terms of consortium size, historic funding and province population. Future NPF rounds will strive to maintain balance across consortia or regions.

WestGrid    20.0 M$        SHARCNET    1.2 M$
SciNet      15.0 M$        HPCVL       0.8 M$
CLUMEQ      15.0 M$        ACEnet      0.5 M$
RQCHP        7.5 M$        Total      60.0 M$

Table 4.1. Proposed CFI funding of the regional consortia in the current NPF round.

This being said, the current proposal is not about dividing spoils between regional consortia. While the choice of architectures located at each consortium reflects, to an extent, its specific needs, the ensemble has been carefully debated and selected to satisfy the needs of all. This, of course, includes large projects like ATLAS Tier-2 computing (see Case Study 7.1, p. 37), whose national and local needs have also been covered.

– 25 – Categories of Architecture. In recent months, the needs of researchers located at the four major consortia being funded by the NPF have been carefully surveyed. Extensive discussions took place on how to satisfy the generic and specific needs expressed in the surveys. It was realized that these needs could be accommodated by roughly four types of architectures: 1. Clusters with standard interconnects, also called capacity clusters in this proposal. These are suitable for codes that can fit entirely within the physical memory of a single node, and are tailored for treating a large number of otherwise identical problems, as in parametric studies. They are, in fact, the most cost-effective way of solving such problems. 2. Clusters with low-latency/high-bandwidth interconnects, suitable for distributed memory ap- plications written, for instance, with the Message Passing Interface (MPI) Library. Problems requiring very large memory can only be tackled with such machines, hence the name capability clusters used in this proposal. 3. Shared memory machines, also called SMP (symmetric multi-processors) computers. Their interconnect between processors outperforms that of capability clusters in a way that allows for a shared memory model, e.g., using the OpenMP library, or with explicit threads. Such machines are suited for strongly-coupled problems that scale poorly on capability clusters. 4. Vector computers. They are also frequently capability clusters, but whose processors are capa- ble of performing a large number of pipelined floating-point operations, typically operating on an array of numbers with the same speed as a single number on a standard computer. They are traditionally used in climate and weather modelling. These categories are not exclusive. For instance, capability clusters can be composed of so- called thin nodes (e.g. 4 cores or less per node) or of so-called fat nodes (e.g. 16 cores, with up to 64 GB of RAM). Fat node clusters are thus also considered clusters of mid-sized SMP machines. In addition, the apparent slowing down in the growth of CPU speed favours the introduction of coprocessor technologies, permitting the introduction of a vector component into each node of a capability cluster. Finally, the decreasing cost of high-performance interconnects and the advent of multi-core processors tend to attenuate the difference in cost between capacity and capability solutions, at fixed memory per core. The proposed platform will be formulated in terms of this generic infrastructure without ref- erence to specific vendors (no quotes are enclosed with this proposal). Each of the major HPC infrastructures proposed here can be satisfied by several vendors. If this grant application is funded, the vendor decisions will be determined by an RFP process. In the same spirit, storage needs are formulated in terms of a generic SATA-disk based solu- tion, with optional tape backups. A unified and generic cost template has been used for the global budget, irrespective of the choice of final vendor. The storage capacity necessitates an initial in- [ 2d ] vestment in hardware, following which increasing the capacity can be done periodically, depending on the actual demand. HPC equipment is very much coupled to the room in which it is installed. These rooms contain [ 2f,3h ] components specific to HPC: raised floors, large cooling and electrical capacities, uninterrupted power sources (UPS), security features, etc. 
These form an integral and necessary part of the HPC infrastructure and tend to increase the part of the budget devoted to construction/renovation, beyond the cost of a general-purpose space. The lifetime of these components is also longer than the computing equipment, and they therefore constitute a long-term investment in HPC. Distribution of the proposed platform. Table 4.2 summarizes the architecture and geographic distribution of the platforms chosen by the four consortia that would receive major funding. It shows a good balance between capacity and capability clusters, and a moderate request for vec- tor and SMP machines. Table 4.3 summarizes the characteristics and relevance of the different architectures.

Architecture                     WestGrid   SciNet   RQCHP   CLUMEQ    Total
Capacity clusters                   8,400    6,200     256        0   14,856
Thin-node capability clusters       2,800    3,200   3,700    3,900   13,600
Coprocessor clusters                    0        0   1,000        0    1,000
Fat-node clusters                       0        0       0    2,800    2,800
Large SMP                             700        0     128        0      828
Vector cores                            0      160       0        0      160
Disk storage (TB)                   2,750    1,200     216    1,200    5,366

Table 4.2. Platform infrastructure (approximate number of cores) vs. architecture proposed at the four consortia receiving major funding.

Remarks:

1. A more detailed description of each facility is provided in the budget justification section. In particular, memory configurations are not indicated here and some capacity clusters will be heterogeneous.

2. The Montreal site of CLUMEQ is planning a large fat-node capability cluster, with about 175 nodes, each with 16 cores and 64 GB of RAM. Even though this machine is in the capability category, it can also be considered an SMP cluster and can satisfy moderate SMP needs (memory < 64 GB).

3. The WestGrid capability cores will be distributed across three sites, with 800, 900 and 1,100 cores respectively. The other capability machines (fat nodes excluded) will have between 3,000 and 4,000 cores and will be able to host very large-scale calculations if needed. Thus, fractionating the capability clusters across six sites will not be an impediment to large-scale computing.

4. Fractionating the capacity clusters is not a problem either, since they will be grid-enabled.

5. SciNet will install a vector computer optimized for, but not dedicated to, climate and weather studies.

6. The Montreal site of RQCHP is planning a capability cluster with special coprocessors, or equivalent technology, to be installed in year 3. This will be optimized for codes based on standard linear algebra libraries, but could also add a vector component to an otherwise common capability cluster.

7. Users in need of capacity computing coupled with large data storage, in particular the ATLAS collaboration, will find such facilities at WestGrid, SciNet and CLUMEQ.

The proposed configurations are based on today's technologies and prices, and are obviously subject to change, especially for those configurations planned for years 2 and 3. The precise configuration for each architecture can only be determined near the time of purchase, through a formal Request for Proposal (RFP) process designed to provide the users with the most advantageous solution, in particular through the usual mechanisms of competition between vendors. The committee has incorporated today's educational prices, to which a reasonable additional discount has been applied (based on past experience with consortia RFPs). The same prices have been applied across Canada. This led to an estimation of the scope of the infrastructure, as given in Table 4.2.

Capacity cluster
  Characteristics: Standard interconnect. Calculation must fit in each node's memory (between 2 and 16 GB). Used typically for naturally parallel problems. Most economic solution per peak Gflop.
  Fields of application: Data analysis and simulation in Astrophysics and Particle Physics. Genomics and Proteomics. Quantum chemistry, quantum materials. Nanotechnology. Data mining, artificial intelligence. Grid research.

Thin-node capability cluster
  Characteristics: High-performance interconnect (low latency, high bandwidth). Problems must be coded within a distributed memory model (MPI); they are parallelized because of very large memory requirements and/or to gain speed of execution.
  Fields of application: Large-scale simulations of particles in astrophysics, molecular dynamics. Coherent control (photonics) and other large-memory quantum problems. Quantum materials. Computational Fluid Dynamics. Other finite-element problems in Engineering. Economics.

Coprocessor cluster
  Characteristics: Nodes with coprocessor technology (e.g. Clearspeed technology, Cell, etc.). Advantageous for codes that use standard libraries (e.g. linear algebra routines) that are already optimized for that architecture. Low power consumption per peak Gflop.
  Fields of application: Quantum chemistry and ab initio calculations. Any code relying heavily on standard linear algebra routines.

Fat-node capability cluster
  Characteristics: E.g., nodes with 16 cores and 64 GB each, with high-performance interconnect between nodes. For problems that require large memory (< 64 GB) but are not parallelized in a distributed memory model: can be used as an "SMP farm". Also for distributed-memory problems that tolerate a lower ratio of inter-node bandwidth to node speed.
  Fields of application: Computational fluid dynamics. Climate modelling. Meteorology. Imagery, e.g. brain research. Quantum chemistry.

Large SMP
  Characteristics: Shared memory model. For applications that cannot be parallelized within a distributed memory model or require very large memory.
  Fields of application: Artificial intelligence. Quantum chemistry. Medical (e.g. cardiac) research. Network science, Operational research.

Parallel vector computer
  Characteristics: For problems that scale poorly in a distributed memory model and for which speed of execution is the most important factor. Least economic solution per peak Gflop.
  Fields of application: Climate modelling. Meteorology. Oceanography. Hydrology. Computational fluid dynamics.

Table 4.3. Proposed architectures, their characteristics and relevant applications (partial list).
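To make concrete the programming-model distinction that runs through Table 4.3, the sketch below shows the distributed-memory pattern assumed by the capability-cluster rows: each process owns part of the data and explicit messages combine the results. It is a minimal illustration using the mpi4py bindings, one of several possible ways to write MPI code, and is not tied to any specific system in this proposal; shared-memory (SMP) and vector systems instead let a single program address all of the data directly.

```python
# Run with, e.g.: mpirun -n 4 python partial_sum.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()          # this process's id
size = comm.Get_size()          # total number of processes

# Each rank sums its own slice of the work; no memory is shared.
local = sum(i * i for i in range(rank, 10_000_000, size))

# Explicit communication combines the partial results on rank 0.
total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print("global sum:", total)
```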

4.6 Relation between infrastructure and needs

As mentioned previously, there has been an extensive survey within each consortium of the needs of the current major users, as well as of emerging users of HPC. There is no doubt that the task of selecting appropriate platforms for solving problems as varied as engineering, chemistry, physics, medicine and finance is facilitated by the growing flexibility of cluster platforms. One has only to observe the trend at supercomputing conferences to notice that all these disciplines, so exclusive in their requirements only a few years ago, today have a common platform through clusters.

In Section 3, a wide sampling of HPC-based research programs was described, and the associated HPC needs were briefly mentioned, referring to the generic architectures above. This is complemented, later in this proposal, by a number of case studies that describe in more detail, albeit not comprehensively, some areas of research that have achieved international prominence and where the lead scientists are heavy users of HPC. We have attempted to illustrate their current use of HPC and how they would benefit from enhanced infrastructure. Table 4.3 summarizes the characteristics of the different architectures and their relevance to the research described in Section 3 and the case studies.

4.7 Networking

The proposed infrastructure described above (and in more detail in the budget section) consists of a distributed resource of individual HPC platforms linked through a high-bandwidth national backbone developed by CANARIE, a network referred to as CA*net4. This network interconnects the individual regional networks (ORANs), and through them the universities, research centres and government laboratories, both with each other and with international peer networks. Through a series of five optical wavelength connections, provisioned at OC-192 (10 Gbps), CA*net4 yields a total network capacity of 50 Gbps. The most recently commissioned version of this network was made possible by an investment of 110 M$ from the federal government.

Going forward, Compute Canada will work with CANARIE and the ORANs to plan and implement a truly national HPC backbone network, building on the work already done with selected consortia. It is anticipated that this network will include multiple dedicated HPC light-paths with a combined bandwidth of at least 10 Gigabits per second.
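A simple back-of-the-envelope calculation shows why dedicated light-paths matter for the data-heavy projects described earlier: moving tens of terabytes between consortia is measured in hours or days, not minutes. The dataset size and sustained efficiency below are hypothetical and purely illustrative.

```python
def transfer_hours(terabytes, link_gbps, efficiency=0.7):
    """Hours to move `terabytes` of data over a link of `link_gbps`
    gigabits per second, assuming a sustained efficiency factor."""
    bits = terabytes * 8e12                 # 1 TB = 8e12 bits (decimal TB)
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600

# Example: a hypothetical 50 TB dataset over a dedicated 10 Gbps light-path.
print(f"{transfer_hours(50, 10):.1f} hours")
```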

4.8 Compute Canada mechanisms to enhance collaboration [ 3f ]
Because the National HPC Platform is to be a purposefully shared resource, it will prove to be an extremely effective structure through which to enhance research collaborations nation-wide. Examples of such existing or envisioned collaborations are described below. These collaborations are expected to form as a consequence of shared research goals focused upon the use of specific HPC platforms.
Examples of HPC-based Research Collaborations. An extremely instructive example of the way [ 2a,b,3a–c,f ] in which HPC drives research collaborations is that provided by the ATLAS project (see Case Study 7.1, p. 37). The project will have a lifetime of more than a decade, over which ATLAS-Canada [ 3d ] will comprise about 50% of the Canadian experimental particle physics community, i.e. roughly 40 PIs. The HPC needs of the ATLAS community are well matched to the proposed shared resource environment of Compute Canada. The ATLAS application is best served by capacity clusters with a standard interconnect, as the event analyses that must be performed constitute a highly parallel application, but one that also requires a significant amount of storage (several petabytes in total).
A second example of fostering research collaborations is the national effort in the development [ 2a,b,3a,c,f ] and application of global climate models to predict the changes that are occurring as a consequence of increasing greenhouse gas concentrations in the atmosphere (see Case Study 7.6, p. 44).

Research in the area of climate predictability involves the solution of a complex initial value problem for a model of the entire “Earth System” of interacting ocean, atmosphere, sea ice and land surface processes. Each of these components of the system is described by a coupled set of non-linear partial differential equations. Their solution constitutes a tightly coupled problem which requires a high-bandwidth, low-latency interconnect between a relatively modest number of extremely powerful processors, or processing nodes, each of which has significant shared memory. The Canadian community has expressed a strong interest in having access to a significant system in this class, and our proposed platform will meet this need by installing a single system of this kind in the SciNet consortium.
Access Grid: Infrastructure to support collaborative interaction. This proposal includes the outfitting [ 3e,3f ] of approximately 50 Access Grid rooms across Canada. These facilities, which have become increasingly common in recent years, are intended to facilitate group-to-group interactions. An Access Grid node typically involves 3-20 people per site; nodes are “designed spaces” that support the high-end audio/video technology needed to provide a compelling and productive user experience. An Access Grid node consists of large-format multimedia display, presentation, and interaction software environments; interfaces to grid middleware; and interfaces to remote visualization environments. With these resources, Access Grid supports large-scale distributed meetings, collaborative teamwork sessions, seminars, lectures, tutorials and training. It is perhaps especially in the training area that our proposed network of Access Grid facilities will prove to be most effective. These facilities will be employed extensively by the technical support groups at each of the major installations as an efficient means of delivering the training tutorials required for the efficient use of the national HPC infrastructure.

5 Operations and the Human Infrastructure In this section we describe the operational requirements and the investments into highly- qualified personnel (HQP) that will enable Compute Canada to effectively implement the proposed strategy. The CFI Infrastructure Operating Funds (IOF) are automatic with an award and are specifically geared to cover operational costs. This proposal to CFI is also intended to fully justify the need for additional funding from the Canadian research agencies, specifically NSERC, SSHRC and CIHR.

5.1 The Need for Highly Qualified Personnel Although it is the researchers that assume the high-profile, visible role in the research process, in reality HPC-related activities require a large support team working behind the scenes to make this possible.5 An effective HPC facility is much more than hardware. Highly trained individuals are needed to manage, operate and maintain the facility to ensure it runs smoothly and efficiently. They also play a key role in the selection and installation of equipment. For instance, the RQCHP clusters (installed in 2004 and 2005) provide a cost-effective solution to HPC, in large part because of the experience and diligence of the local team of analysts. Further, to maximize the use of expensive HPC facilities, it is equally important to have highly trained technical support staff who train and assist researchers in making the best use of HPC resources. The investment in people is a critical component of this proposal. In many respects, computing infrastructure can be more readily acquired than human infrastructure. Given adequate funding, upgrading of the capital equipment is straightforward; one can simply buy whatever is needed. However, obtaining human infrastructure is much more challenging. It can take years to train people with the necessary skill sets. Many such highly skilled people are currently working at

5 Note that there is no reference to funding graduate students and PDFs. They are expected to be supported through individual researchers’ grants.

various consortia across the country. These people constitute a precious human resource that must be retained, but that can be easily lost, e.g., by being enticed away from Canada by the lure of better opportunities coupled with higher salaries. In addition, over the years, many talented analysts working at consortia have left for high-paying jobs in Canadian industry, sometimes founding their own companies. While this is beneficial to the Canadian economy, it adds to the challenges faced by the consortia. If Canada is to invest in people, then it must also invest in creating the environment to attract and retain them.
5.2 Expense Categories
Effectively operating and using an HPC facility is an expensive proposition. The costs fall into roughly four categories:
1. Operating infrastructure costs. This includes space, power, and day-to-day expenses (e.g. backup tapes).
2. Systems personnel. They are primarily concerned with the day-to-day care of the hardware and software infrastructure, i.e., the proper functioning of the HPC facilities, system management, operator support, etc. This involves a wide range of disparate activities, all of which are crucial to ensuring that the system is fully functional and available to the community. These include installing and maintaining the associated operating system(s) and updates/patches, the management of file systems and backups, minor equipment repairs, and ensuring the integrity and security of the user data.
3. Application analysts. Their role is to provide specialized technical assistance to researchers, conduct workshops and training, and to both evaluate and implement software tools to make effective use of the available resources. This work can have a major impact on the scientific productivity of the community. Typical HPC applications operate at a sustained rate well below the theoretical peak performance; the applications analyst's role is to improve this. An application analyst might, for example, double the performance through code optimization, algorithm re-design, enhanced cache utilization and/or improved data locality. This would correspond to twice the delivered science for the same cost. The added value from such activities can be huge.
4. Management personnel. A management structure needs to be put into place, both at the consortium and national levels, to take care of, for example, human resources, financial issues, public relations, secretarial tasks, and coordination of activities. As well, we include in this category the costs associated with hosting an annual international scientific advisory board meeting.
Systems personnel are usually located at each major installation site. Each major installation needs a core set of people to address the day-to-day operations of the facilities. Application analysts do not need close proximity to the hardware. Ideally, they should be close to the users, since their work involves collaboration with them. As this is not always possible, it is expected that much of their interaction with users will proceed through email and/or tele/video-conferencing (as currently happens). Management personnel can be in support of a site, a consortium, or the national initiative.
5.3 Funding Sources
The Canadian system makes it awkward to fund the above; no single funding source covers all the needs. The consequence is that all consortia have to apply to multiple agencies to assist in providing operating funds. These sources include:
• CFI Infrastructure Operating Fund (IOF).
CFI provides funds for the operation of the proposed facilities (30% of the CFI dollars). These funds can be spent on operating infrastructure costs and system personnel.

• NSERC MFA grant. The NSERC Major Facilities Access (MFA) fund has been used to fund application analysts to support research application development (the so-called TASP program). This reflects the science- and engineering-dominated HPC usage to date. However, medical/biotech-related research is emerging as an important user of HPC resources in Canada (as well as the arts and humanities, but currently to a smaller extent), reflecting the need for coordinated granting council funding.
• Provincial funding agencies. In the past, several provinces offered programs that could be used for HPC operations support. Some of these programs no longer exist.
• Institutions. All institutions hosting HPC facilities make a contribution to the cost of the facilities. The most common form of support is one or more of space, power, technical support personnel, management personnel, supplies, and cash.
The CFI IOF initiative has helped address the operational side and the NSERC MFA has helped on the research applications side, but to date neither has reflected the true cost of operating these facilities. As well, there are no funding sources for management expenses, other than the institutions.

5.4 People Resources in Perspective In comparison to the major HPC centers in Europe or the United States (or even Environment Canada, the national weather service), the current Canadian level of HPC support is low. Table 5.1 shows the user-support commitment at several major international and Canadian HPC sites (taken from the LRP using June 2004 data). There are significant differences in the total number of support personnel. The ratio of resources (# of CPUs) to the number of support people ranges from 38 for Environment Canada (a production facility) to 80-160 for international research facilities, to 334 for a typical Canadian HPC site. The table does not show the number of management personnel (it was difficult to identify all of them). At least for the three international sites, these numbers are significant, further widening the disparity in support personnel abroad compared with CFI-funded infrastructure.

Category                                     NERSC (USA)  PSC (USA)  HPCX (UK)  Env. Can.  WestGrid
Rank in Top 500                                       14         25         18         54        38
# of CPUs                                          6,656      3,016      1,600        960     1,008
Support personnel availability (hours/days)         24/7       24/7       24/7       24/7      10/5
Operations personnel                                   9          6          4         12         2
Systems support personnel                             11         10          2          7         1
Networking & security personnel                        6          3          1          5         0
User support personnel                                15         11         13          1         0
Total                                                 41         30         20         25         3

Table 5.1. Operations and user support personnel at selected HPC facilities (Source: LRP)
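The support ratios quoted in Section 5.4 follow directly from Table 5.1. The short calculation below is only an illustrative check of those figures, not part of the proposal's budget; the CPU and staff counts are taken verbatim from the table.

```python
# Illustrative check of the CPUs-per-support-person ratios quoted in Section 5.4,
# using the counts from Table 5.1 (June 2004 data from the LRP).
sites = {
    "NERSC (USA)": (6656, 41),
    "PSC (USA)":   (3016, 30),
    "HPCX (UK)":   (1600, 20),
    "Env. Canada": (960, 25),
    "WestGrid":    (1008, 3),
}

for name, (cpus, staff) in sites.items():
    print(f"{name:12s}: {cpus / staff:6.0f} CPUs per support person")

# Expected output: roughly 162, 101, 80, 38 and 336 respectively, i.e. the
# "38 for Environment Canada", "80-160 for international research facilities"
# and "~334 for a typical Canadian HPC site" cited in the text (small
# differences reflect rounding).
```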

5.5 Operating Expense Plan
We now look in detail at each of the expense categories, projecting what their needs will be for the next five years.
Operating Infrastructure Costs. The major item here is power. At this point, it is difficult to evaluate the cost of power over the next five years. However, a detailed analysis based on the cost of power in each province, current power consumption per CPU, and the projected number of CPUs concluded that a conservative estimate of this cost is 2.1 M$ per year for the infrastructure requested in this proposal. For the five years of operations that the IOF provides, this amounts to 10.5 M$ of the 18 M$ available (30% of the 60 M$ requested). This also assumes that the cost of power does not increase. In the past, most institutions hosting CFI-funded HPC installations paid the power bill. With the rising price of power and the demands of increasingly large HPC installations, many became aware that these costs were substantial, and some now require the consortium to pay its own power bill. This single expense can dramatically affect the ability of a site to fund sufficient people to professionally operate the installed systems.
Systems personnel. These people are essential for the smooth day-to-day operations of these expensive systems. Until the 2001 competition and the introduction of the IOF fund, CFI-funded facilities were run using a shoestring staff. IOF funds (net of the power costs) are used to hire the support staff needed to ensure the professional operation of the infrastructure. This proposal has the majority of the requested infrastructure going to 12 sites. If one assumes that a systems person earns 70 K$ plus 22% benefits per year, then putting just one person per site costs roughly 1 M$ per year, or 5 M$ over the 5-year CFI time-frame. Given the size and scope of the infrastructure at the sites, multiple systems personnel are needed (as many as four at some sites). In addition, each consortium will need personnel to provide technical leadership and coordination across all member sites. This includes a chief technology officer, director of operations, Access Grid coordinator, etc. Coupled with the power costs above, there are few funds available for operating the facilities. This situation can only be rectified by having the institutions make contributions to the operating costs.
Applications analysts. User support by application analysts makes the difference between a merely operational HPC facility and an effective one. Some consortia are well supported (from their institutions or from provincial funding opportunities) and employ a few applications analysts. The majority of these people across the country are, however, funded through NSERC MFA grants. The roughly 1 M$ annually supports a distributed team of applications analysts, but is quite inadequate in view of the current and projected needs, in particular given the new challenges arising with the national character of the facilities. A substantial increase in funding is required in this category; it is justified in more detail in Section 5.6 below. This reflects the need for greater attention to application development and performance enhancements, the larger user community, and the national nature of the proposal. A detailed operating budget (not included in this proposal for lack of space) has been prepared that allocates the proposed funds to best meet the needs of our large and distributed user community.
Management personnel.
Unfortunately, there are currently no opportunities to apply for funding to cover the costs of managing the multi-million dollar HPC facilities installed in Canada. CFI IOF cannot be used for any management expenses (including administrative, secretarial, financial, and public relations). NSERC MFA funds must be spent in support of research programs only. All participating institutions must contribute cash to ensure proper local and national management. All the partners have committed the resources necessary to ensure this project is properly managed. The contribution of granting councils. This proposal is requesting 32.4 M$ over five years from the granting councils. The majority of the funds will be used to support applications analysts, distributed across the country at all major sites. Unlike the capital and IOF funding requests, which are targeted towards the needs of four consortia, the TASP proposal seeks funding for all

seven consortia representing the entire HPC community (as past NSERC MFA grants have done). The funds would be used as follows:
• ∼ 65 applications analysts, with an approximate distribution (reflective of geographical distribution) of 12.5% located in Atlantic Canada, 25% in Quebec, 37.5% in Ontario, and 25% in Western Canada. The average salary would initially be 70 K$/year with 22% benefits (both numbers represent current averages across the country). A 4.5% per year increase in salaries is part of the planning.
• An additional cost of 5 K$/year per applications analyst, to cover a personal computer, travel to HPCS (the annual Canadian HPC conference), and one other conference (such as SuperComputing or the Gaussian users group).
• One communications officer/webmaster per consortium. This person's responsibility would be to document user support, to facilitate comprehensive access to the resources, to document the successes of the HPC research, and to ensure these successes are properly publicized to our communities.
• Provision for an annual meeting of the international scientific advisory board (estimated at 30 K$: four international members, one representative from each of the consortia, local arrangements, etc.).
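As a rough back-of-the-envelope check only (not a budget line), the figures above are broadly consistent with the 32.4 M$ five-year request quoted earlier. The sketch below makes the simplifying assumptions, ours rather than the proposal's, that the ∼65 positions include the per-consortium communications staff, that the 4.5% escalation applies to salaries only, and that the 5 K$ per-analyst allowance and the 30 K$ advisory-board meeting are held flat.

```python
# Rough consistency check of the five-year granting-council request (32.4 M$).
# Assumptions are illustrative only: the ~65 funded positions are taken to include
# the per-consortium communications/webmaster roles, the 4.5%/year escalation is
# applied to salaries only, and the per-analyst allowance and advisory-board
# meeting costs are held constant.
positions = 65            # analysts plus communications staff (assumption)
salary = 70_000           # 70 K$/year starting average salary
benefits = 0.22           # 22% benefits
escalation = 0.045        # 4.5% annual salary increase
allowance = 5_000         # per-analyst computer/travel allowance, per year
board_meeting = 30_000    # annual scientific advisory board meeting

total = 0.0
for year in range(5):
    salaries = positions * salary * (1 + benefits) * (1 + escalation) ** year
    total += salaries + positions * allowance + board_meeting

print(f"five-year total: {total / 1e6:.1f} M$")   # ~32.1 M$, close to the 32.4 M$ request
```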

5.6 The TASP Program
High performance computing cannot flourish without the appropriately trained support personnel. This is currently provided by the Technical Analyst Support Program (TASP) model, a unique but unfortunately underfunded initiative. Many international installations have budgets to hire sufficient people to serve their strategic user community. The Canadian solution is to build a national support network, TASP, that serves a large user community with diverse needs. The TASP program is currently funded by an NSERC MFA grant with a value of $1,038,000 per year.6 The program is being used to hire 22 programmer/analysts (many of these analysts have half their salary paid for by their host institution). Current TASP analysts form a highly trained, highly experienced team of computer scientists, mathematicians, computational scientists and system analysts who execute a wide range of activities:
• Assistance in porting code from smaller systems to HPC resources;
• Assistance in parallelization and optimization of codes;
• Assistance with storage and data retrieval issues;
• Consultation on numerical/computational methods;
• Provision of widely disseminated training courses on parallelization and optimization;
• Technology watch on HPC methods and libraries;
• Assistance in advanced scientific visualization;
• Distributed system administration; and,
• Provision of training courses on the use of the AccessGrid.
TASP analysts work as a cohesive national team. They meet regularly via teleconferences, as well as face-to-face annually at the Canadian HPCS conference. Although a particular provider site may not have a consultant in a particular area, there are both formal and informal mechanisms within TASP to access the needed support. This has proven to be an extremely efficient way to deliver these support services, to cross-fertilize and to avoid duplication. This led to a biweekly Trans Canada Computational Science Seminar being held over the AccessGrid between WestGrid and ACEnet universities, with expansion to other sites soon. In the past two years, the TASP has partially or fully funded 66 workshops across the country, with a total of over 2,000 person days of attendance.

6 Note that the NSERC Major Facilities Access (MFA) program is being replaced by the new Major Re- sources Support (MRS) program. At the time of this writing, we do not know much about the MRS, and are assuming it will be similar in scope and funding model to the MFA.

Topics regularly covered in workshops and online presentations include introductory, intermediate and advanced MPI programming; programming for shared memory architectures; performance-tuning tools; AccessGrid use; advanced scientific visualization; and grid middleware. As a result, the TASP team and the consortia have built a large resource base of presentation materials for both introductory and advanced tutorials on HPC, parallel programming, and various computational methods.
In summary, the overall objective of the TASP analysts is to ensure that Canadian scientists and their graduate students receive the computational support that they need to carry out their research. However, a large number of research groups need extensive help to get their computational research up to speed. For instance, the parallelization of a code cannot always be accomplished by a member of the group; that, and/or the optimization of the algorithm, or the introduction of high-performance libraries within a code, may require the investment of several weeks of an analyst's time. Such investments sometimes lead to radical changes, not only in the performance of the code, but also in the scope of problems that can be solved, thus opening new research opportunities. Most researchers do not have the background, or the funds to hire someone with the expertise, to accomplish these tasks. A single TASP analyst attached to a consortium can therefore enhance several research applications to effectively use HPC resources, but the needs are enormous. The current level of funding for this program is insufficient, leaving the analyst needs of many research groups unaddressed. The consequence is that many groups still run inefficient codes, or codes that lead them to lower expectations and that are not competitive in the international context. Worse, many groups steer clear of HPC and its benefits for lack of adequate support. Expansion of the TASP program is required to meet the needs of the current and expanding community.
The current funding level for applications analysts is small given the size of our research community. For example, the facilities mentioned in Table 5.1 (NERSC, PSC, HPCX, Environment Canada) have an application analyst for every 10 to 42 users. In contrast, the two largest consortia, WestGrid and SHARCNET, have roughly 125 to 225 users per analyst. A consortium like RQCHP employs a more adequate number of analysts, but just maintaining that number (despite the recent expansion of its membership) precisely requires the level of funding asked for in this proposal, given the increase of other expenses (e.g. power), the disappearance of the provincial program contributing to operations and the need for a fair distribution of analysts across the country. Increasing the number of analysts to 65 will bring the ratio of analysts to users to roughly 1:50 (averaged across Canada), more in line with the ratios at the international sites. The additional TASP analysts (at most consortia) would be proactive in a number of initiatives, including:
(1) Giving greater attention to the application needs of more users;
(2) Supporting and encouraging use of the grid infrastructure;
(3) Developing portals to simplify job submission;
(4) Adopting/developing meta-scheduling tools;
(5) Adopting/developing tools for monitoring, accounting, reporting and analyzing HPC resource usage.

6 Conclusions
This proposal represents a groundbreaking approach to HPC infrastructure in Canada. Getting seven consortia, over 50 institutions, and the research community to agree on a united HPC vision is a major accomplishment and a reflection of the essential role that HPC plays in research today. The Compute Canada initiative is a comprehensive proposal to build a shared distributed HPC infrastructure across Canada. The national consensus on governance, resource planning, and resource sharing models is a major outcome of developing this proposal. C3.ca has begun the process of becoming Compute Canada. With CFI's initiative and the committed support of the provinces, industry, and the partner institutions, Compute Canada will meet the needs of the research community and enable leading-edge world-competitive research.

– 35 – 7 Case Studies

7.1 Particle Physics: R.S. Orr (U. of Toronto, SciNet); M. Vetterli (SFU/TRIUMF, WestGrid); Brigitte Vachon (McGill U., CLUMEQ)
7.2 Astrophysics: R. Bond (U. of Toronto, SciNet); H. Couchman (McMaster U., SHARCNET)
7.3 Quantum Chemistry: M. Côté (U. de Montréal, RQCHP); M. Ernzerhof (U. de Montréal, RQCHP); R. Boyd (Dalhousie, ACEnet)
7.4 Nanoscience and Nanotechnology: A. Kovalenko (NINT, U. of Alberta, WestGrid); G. DiLabio (NINT, WestGrid)
7.5 Quantum Materials: D. Sénéchal (U. de Sherbrooke, RQCHP); E. Sorensen (McMaster U., SHARCNET)
7.6 Global and regional climate change: W.R. Peltier (U. of Toronto, SciNet); A.B.G. Bush (U. of Alberta, WestGrid); J.P.R. Laprise (UQAM, CLUMEQ)
7.7 Hydrology: E.A. Sudicky (U. of Waterloo, SHARCNET); L. Smith (UBC, WestGrid); René Therrien (U. Laval, CLUMEQ)
7.8 Aerospace: W.G. Habashi (McGill U., CLUMEQ); D.W. Zingg (U. of Toronto, SciNet)
7.9 Computational Biology: C.M. Yip (U. of Toronto, SciNet); P. Tieleman (U. of Calgary, WestGrid); Hue Sun Chan (U. of Toronto, SciNet); Th.J. Hudson (McGill U., CLUMEQ); J. Corbeil (U. Laval, CLUMEQ)
7.10 Brain Imaging: A. Evans (McGill U., CLUMEQ); M. Henkelman (U. of Toronto, SciNet)
7.11 Large-Scale Text Analysis: G. Rockwell (McMaster U., SHARCNET); I. Lancashire (U. of Toronto, SciNet); R. Siemens (U. Victoria, WestGrid)
7.12 Collaborative Visualization: Brian Corrie (Simon Fraser U., WestGrid); Pierre Boulanger (U. of Alberta, WestGrid); Denis Laurendeau (U. Laval, CLUMEQ)

7.1 Particle Physics
The Challenge. Experimental high-energy particle physics is about to enter an extremely exciting period in the study of the fundamental constituents of matter. Since the discovery of the carriers of the weak force (W and Z bosons) at the CERN laboratory in Geneva, and the completion of the precision study of parameters, many physicists are now convinced of the validity of the “Standard Model”. The Standard Model provides an accurate picture of the lowest-level constituents of matter and their interactions in the accessible energy regime. For many reasons, the Standard Model is believed to be a low-energy approximation to a unified, and possibly supersymmetric, theory that spans all energy domains. If experimentally confirmed, such a supersymmetric theory could simultaneously solve the problems of the basic structure of matter and the riddle of the composition of dark matter in the universe. The Standard Model requires the existence of at least one new particle, the Higgs boson. Its confrontation with experimental data, combined with its known theoretical shortcomings, is fueling expectations of new and exciting physics close to the TeV scale.7 This was the central motivation for the construction of the Large Hadron Collider (LHC), which will produce the highest energy proton-proton collisions ever achieved under laboratory conditions. How to extend the Standard Model to overcome its known deficiencies is one of the most exciting experimental challenges in modern science.
The ATLAS collaboration has almost completed the construction of a general-purpose detector designed to record the results of high energy proton-proton collisions and to fully exploit the discovery potential of the LHC. This detector is designed to meet the diverse and exacting requirements of the LHC physics programme while operating in a very high-luminosity environment (luminosity is a measure of the interaction rate). This high-performance system must be capable of reconstructing the properties of electrons, muons, photons and jets of particles emerging from the collisions, as well as determining the missing energy in the event. Its radiation resistance must allow operation for more than ten years of data taking at high luminosity.
ATLAS-Canada. The ATLAS-Canada collaboration comprises 33 grant-eligible scientists who all [ 2a,3f ] take an active part in the ongoing projects. Including engineers, research associates, technicians and students, ATLAS-Canada is a group of 88 highly trained people. The ATLAS detector project has an NSERC capital expenditure of 15.5 M$ for the completed detector construction, and an integrated operating expenditure of 15.9 M$ to date. In addition, Canada has contributed 37.3 M$ to the construction of the LHC. ATLAS was identified as the highest priority project by the Canadian particle physics community in the last two long-range plans. The ATLAS-Canada collaboration includes an excellent mix of leaders in Canadian high-energy physics, with a proven track record, and some of the best young scientists in the country, including several Canada Research Chair recipients.
HPC Requirements. The ATLAS experiment will produce 2-3 petabytes of data per year, with significant additional storage required for secondary data sets. Analysis of particle physics data is carried out in stages, starting with calibration and alignment, then event reconstruction and event filtering, and finally the extraction of physics results.
Secondary data sets are produced at each stage of the analysis chain. Each successive data set is smaller than the previous one due to event selection and synthesis of the information from each collision. Eventually, the data sets get to a size where it becomes practical for smaller research groups to access the data easily. The staged nature of the analysis lends itself well to a tiered system of analysis centres. CERN is coordinating an international network of high-performance computing centres to provide the resources necessary for the analysis of data from the four LHC experiments. This network will use grid computing tools to manage the data and to make efficient use of the computing resources. Over 100 sites of varying sizes in 31 countries currently participate in the LHC Computing Grid (LCG). ATLAS

7 One electron-volt is the energy gained by a unit charge that is accelerated through a voltage difference of 1 Volt. 1 TeV is 1012 eV; the mass of the proton is a little less than 1 GeV (109 eV).

will have 10 Tier-1 centres and 40-50 Tier-2 centres around the world. Canada will provide one Tier-1 and the equivalent of two Tier-2 centres. Technical details on the LCG can be found on the web at lcg.web.cern.ch/LCG/.
The ATLAS-Canada computing and data analysis model is illustrated here through a set of [ 2b,3c ] typical use-cases:
1. Raw data handling: The raw data from the experiment will be processed, when possible, on the Tier-0 system at CERN. Raw data, and the results of preliminary first- and second-pass analysis, will be sent to the Tier-1 centres. As better detector calibrations and reconstruction code become available, the Tier-1 centres will reprocess the raw data to produce higher quality secondary data sets, which will be distributed to all ATLAS Tier-1 centres, and also to the Tier-2 centres.
2. Simulation data handling: One of the primary uses of the Tier-2 systems will be to produce the large amounts of simulated data needed for analyses. The simulation is strongly limited by processing capacity, and the Tier-2 centres will have large CPU requirements for this purpose. As the simulated data at the Canadian Tier-2 centres are produced, they will be copied to the Canadian Tier-1 centre.
3. Physics analysis: Individual Canadian physicists will typically use desktop systems for preparing and finalizing data analysis, but every analysis will require significant access to Tier-2, and in some cases Tier-1, capabilities. Most analyses will be based on the secondary data sets stored at the Tier-2 centres, once the analysis chain is stable.
The demands on the Canadian Tier-1 centre are exceptional. In addition to the processing and storage capabilities, the Tier-1 centre must continuously receive raw data from CERN, as well as reprocessed data from other Tier-1 centres. It must also distribute the results of its own data reprocessing to other Tier-1 centres and to the Canadian Tier-2 sites. The Tier-1 centre must therefore be dedicated to ATLAS. Funds for the Tier-1 are being provided through the CFI Exceptional Opportunities Fund. On the other hand, access to the Tier-2 facilities would be more erratic, with periods of low usage as physicists consider the results of their analyses. These [ 2c ] facilities can then be used by researchers in other fields, and it is therefore appropriate that the Tier-2 centres be part of shared facilities in the HPC consortia. This model makes efficient overall use of the computing and storage resources in Canada.
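The event analyses described above are naturally parallel: each collision event can be reconstructed and filtered independently, which is why capacity clusters with a standard interconnect suit this workload (Table 4.3). The toy sketch below is purely illustrative; the event format, selection cut and chunking scheme are invented for the example and are not part of the ATLAS software. It shows only the map-style pattern of filtering events independently across worker processes.

```python
# Toy illustration of why event filtering is "naturally parallel": every event is
# processed independently, so work can be spread over many loosely coupled nodes.
# The event structure and selection cut below are invented for this sketch and do
# not represent the actual ATLAS data model or analysis software.
import random
from concurrent.futures import ProcessPoolExecutor

def make_fake_events(n, seed):
    """Stand-in for reading a file of collision events."""
    rng = random.Random(seed)
    return [{"missing_energy": rng.expovariate(1 / 20.0)} for _ in range(n)]

def select(events, threshold=100.0):
    """Keep only events passing a (hypothetical) missing-energy cut."""
    return [e for e in events if e["missing_energy"] > threshold]

def process_chunk(seed):
    return len(select(make_fake_events(100_000, seed)))

if __name__ == "__main__":
    # Each "chunk" could live on a different cluster node; no communication is
    # needed between chunks until the final tally.
    with ProcessPoolExecutor() as pool:
        kept = sum(pool.map(process_chunk, range(16)))
    print(f"events passing the cut: {kept}")
```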

7.2 Astrophysics The Challenge. Astrophysicists seek to answer fundamental questions about the universe that span the vast extremes of time, space, energy and density from the Big Bang 14 billion years ago to the present and into the future. They are working on fundamental questions such as: how do complex structures form, ranging from the nuclear and chemical elements to the vast interconnected cosmic web of galaxies? How do stars, solar systems, planets, and indeed life, develop? Canadian research institutions, in collaboration with other renowned international centres, are making massive investments in observational hardware to solve these cosmic mysteries of origin and evolution. To extract the appropriate implications from this massive data set, which is growing exponentially in both volume and quality, requires significant increases in computational statistical data-mining power. It will also require the development of large-scale numerical simulations of non-linear astrophysical processes. A sense of the wide-range of Canadian astrophysical HPC needs can be gained by considering the following research topics: 1. Gasdynamical and gravitational simulations of nonlinear phenomena are critical in all areas of astrophysics. For example, 3D numerical plasma simulations are an essential tool for under- standing the turbulent processes involved in the formation, growth and behavior of black holes,

– 38 – the explosion of supernovae, and the origin and dynamics of magnetic fields in stars, the galaxy and the universe. Simulations that track both gas and dark matter are required to investigate the formation of large-scale structure in the Universe. 2. Cosmology has progressed from an order-of-magnitude science to a precision discipline. Increas- ingly ambitious cosmic microwave background (CMB) experiments, large-scale galaxy surveys, and intergalactic and interstellar medium observations, allow the determination of basic cos- mic parameters to better than 1% accuracy. They also address issues as far-reaching as the nature of the dark energy and dark matter that determine the dynamics and ultimate fate of the Universe. Analysis and interpretation of the new generation of space, ground-based and balloon-borne experiments, fueled by major technological advances, will require a hundred-fold increase in processing power. Canadian Successes. The previous generation of Canadian HPC resources played a vital role in na- [ 1a,1b ] tional and international collaborations and enabled key advances on some of the most fundamental challenges in astrophysics. SHARCNET researchers developed one of the world’s premier numerical codes (Hydra) for simulating cosmological structure and collaborated on the “Millennium Simula- tion” which is the largest cosmological simulation ever performed (more than 14 billion particles). SciNet researchers and infrastructure have played a major role in analyzing recent CMB data; successes include the best measurements of the high-l polarization spectrum to date (2004 Science cover story), the first detection of peaks in the EE spectrum and the best intensity measurement of the third total power peak. WestGrid researchers are leaders in the numerical study of galaxies and galaxy clusters, as well as the “Grand Challenge Problem” of computing involving black-hole collisions. Members of these consortia are at the leading edge in all aspects of simulations of the “cosmic web” that include gas, dark matter and dark energy. ACEnet astrophysicists are leaders in developing new algorithms and models to probe the interiors of stars, required to match the increasingly more detailed observational data. These efforts will allow ACEnet to develop a com- mon simulation code for magneto-hydrodynamic astrophysical applications. RQCHP researchers are making fundamental discoveries concerning the physical processes occurring within stars (such as diffusion), and are using large-scale simulations to develop a unified picture of the formation of compact objects (such as black holes) in close binaries. CLUMEQ researchers are using high- performance hydrodynamical simulations to study the formation and evolution of galaxies, the evolution and metal-enrichment of the intergalactic medium, the origin of Pop III stars, and the formation of star clusters and its feedback effect on the interstellar medium. Community Strength. Canadian scientists are internationally recognized for excellence in HPC- related research in astrophysics, and are involved in many observational projects. In the projects listed below, Canadian researchers require HPC resources to carry out large-scale data-analysis and numerical simulations. Canada has made significant investments in the: [ 2a ] 1. Canada-France-Hawaii Telescope Legacy survey (CFHTLS) that is mapping the dark matter in the universe to unprecedented accuracy 2. 
International Galactic Plane Survey (IGPS), the successor to the highly successful Canadian Galactic Plane Survey that is providing an unprecedented view of our Galaxy and the complex interactions mediated by star formation and star death 3. Sudbury Neutrino Observatory (SNO) which was instrumental in solving the Solar Neutrino problem and continues to probe the inner workings of the Sun, the nature of dark matter and basic neutrino properties 4. Upcoming ground-based ACT, balloon-borne SPIDER and space-borne PLANCK missions, that will determine cosmic parameters to unprecedented accuracy and possibly detect the unique signature of primordial gravity waves 5. Thirty-Meter Telescope (TMT), which will be the largest optical (and infrared) telescope ever built.

HPC Requirements. The projects described in this case study require Monte Carlo simulations [ 2b ] on thousands of CPUs. For example, the analysis of the Boomerang CMB data used 25% of all available cycles on a 528-CPU SciNet cluster for three years. Since computing needs increase dramatically with the number of detectors, ACT and SPIDER analysis will require roughly 300 times as much computing power.
Highly parallelized cosmological simulations that follow both gas and dark matter require a [ 2a ] capability cluster with a low-latency interconnect, thousands of CPUs and many TB of RAM. For example, a simulation with four billion particles and 40,000 time-steps would require 5 TB of RAM and 60 wall-clock days on a 5,000-CPU cluster. With such a system it will also be possible, for the first time, to perform dynamical simulations of galaxies that have the same resolution as in nature (with at least 10 billion particles) and to make the most realistic 3D turbulence and convective simulations ever attempted (with at least 5,000³ cells).
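The figures quoted above imply concrete machine requirements. The short sketch below is only a back-of-the-envelope restatement of those numbers; the particle count, RAM, run time and CPU count are taken from the text, while the derived per-particle memory and CPU-hour totals are our own arithmetic, not additional requirements from the case study.

```python
# Back-of-the-envelope restatement of the cosmological-simulation figures quoted
# in this case study. Inputs come from the text; derived quantities are simple
# arithmetic for illustration only.
particles = 4e9        # particles in the example simulation
ram_tb = 5             # total RAM quoted
days = 60              # wall-clock days quoted
cpus = 5000            # cluster size quoted

bytes_per_particle = ram_tb * 1e12 / particles
cpu_hours = days * 24 * cpus

print(f"implied memory per particle: ~{bytes_per_particle:.0f} bytes")          # ~1250 bytes
print(f"implied cost of one run:     ~{cpu_hours / 1e6:.1f} million CPU-hours")  # ~7.2 M CPU-hours
```

The roughly seven million CPU-hours for a single run is the scale that motivates the request for a thin-node capability cluster with thousands of CPUs, rather than opportunistic time on a general-purpose capacity system.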

7.3 Quantum chemistry
The Challenge. The properties of atoms, molecules and even solids can in principle be calculated exactly from the laws of quantum mechanics. But a direct and complete quantum mechanical solution is only possible for systems involving a very small number of particles, even on the best HPC facilities. Fortunately, starting in the 1960s, approximate methods have been derived from Density Functional Theory (DFT), which revolutionized the field and allowed the treatment of systems containing thousands of atoms. These methods are known as ab initio, since they start from the microscopic components of the system under study: nuclei and electrons. The importance of these methods has been recognized by the 1998 Nobel Prize in Chemistry given to Walter Kohn (B.Sc. and M.A. from the U. of Toronto), for initiating DFT, and John Pople (formerly at the National Research Council of Canada), who designed the GAUSSIAN code. Ab initio methods are reliable enough that molecules and materials can be studied in virtual experiments in an HPC environment, whenever needed in solid state physics, chemistry, and increasingly in biology and pharmacology.
Community strength and needs. The research groups named on p. 11 in connection with this field are only a few among the many active in Canada. Many more groups using these methods have a need for large computation facilities. The methods are now implemented in well-written codes (Abinit, deMon, Gamess, Gaussian, Pwscf, Siesta, Vasp, Wien2k), each of these having a large community of users in Canada. Some of these codes have modest user fees for academics, while others are open source, making them accessible to the whole community. Because these codes provide easy access to this technology, many groups in different fields are now using them, and their usage will continue to grow in the coming years. The impact of this technology is felt [ 1g,4b ] in pharmacology (drug design), in biology (processes of the living cell), in the study of catalysis and in nanotechnology. Many Canadian researchers are involved in international collaborations for the development of DFT-based codes with a world-wide distribution: the ADF code is developed in part at WestGrid, the deMon group was initiated at RQCHP/WestGrid and the Abinit code has contributors from RQCHP. Because of the wide range of applications possible with these methods, implementation will depend on the problem that is addressed. Hence, different [ 2b ] strategies have been employed to efficiently use the computational resources available. Some codes (e.g. Gaussian, Gamess, Siesta) use a localized basis set to represent the electrons and do not scale well beyond a few nodes; they are more efficiently used with very fast processors with large memory. Other codes, based on a plane wave expansion (e.g. Abinit, Pwscf, Vasp), are more suitable for periodic systems and scale well over hundreds of nodes on a capability cluster with a fast interconnect. For these reasons, both shared memory and distributed memory architectures are needed. However, all these codes have in common an extensive use of standard linear algebra routines that benefit from new coprocessor technologies (ClearSpeed or Cell) for which these routines could be optimized. More specific research successes are given in the examples below.

Example A. The development of new functionals in DFT often requires testing on a large set [ 2b ] of predefined systems. The code then needs to be run on many independent instances, a task for which a capability cluster (or “capability farm”) is best suited. Typically, each test lasts a few hours, requiring a few processors, and hundreds of such tests need to be executed.
Example B. Ab initio methods are perfectly suited for the design of novel materials without actual synthesis, optimizing the composition and structure and saving considerable amounts of experimental effort. Studying a complex solid with several hundred atoms per unit cell requires a vast amount of computational resources. With a plane wave basis, this problem is equivalent, in some cases, to solving an eigenvalue problem with up to hundreds of thousands of unknowns. Fortunately, only a few hundred eigenvectors are needed, which makes iterative methods ideal for this task. Moreover, these problems involve Fast Fourier Transforms (FFT), of which highly optimized implementations are available. Typically, such a study would require several hundred CPUs and run for as long as several weeks. Using this approach, RQCHP researchers were able to design a new material that [ 1a ] combined C60 fullerenes and a metal-organic framework with enhanced electronic properties, aimed at improving the superconducting transition temperature. HPCVL chemists have demonstrated, [ 1a,1g ] using theory and molecular dynamics, that a film of oil containing additives is compressed at the molecular level between two hot, hard surfaces. This study explains why certain additives fail to protect internal combustion engines. This finding is extremely important to the auto and lubrication industrial sectors. This type of study strains the capacity of existing resources. Expansion is required to provide large capability clusters or, when the technology is mature, special [ 2b ] accelerator technology adapted to linear algebra packages.
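Example B's observation, that only a few hundred of the lowest eigenvectors of a very large matrix are needed, is exactly the situation iterative eigensolvers are designed for. The sketch below is a generic illustration with SciPy, not an excerpt from any of the codes named above; the tridiagonal test matrix merely stands in for a large sparse (or matrix-free) plane-wave Hamiltonian.

```python
# Generic illustration of iterative eigensolvers: extract a handful of the lowest
# eigenpairs of a large sparse symmetric matrix without ever forming or fully
# diagonalizing a dense matrix. The matrix here is a simple stand-in, not an
# actual plane-wave Hamiltonian from Abinit, VASP, etc.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 100_000                       # dimension of the (sparse) problem
diag = np.linspace(0.0, 10.0, n)  # placeholder on-site terms
off = -0.5 * np.ones(n - 1)       # placeholder couplings
H = sp.diags([off, diag, off], offsets=[-1, 0, 1], format="csr")

# Lanczos-type iteration: ask only for the 8 smallest (algebraic) eigenvalues.
vals, vecs = eigsh(H, k=8, which="SA")
print("lowest eigenvalues:", np.round(vals, 4))
```

The same pattern explains the interest in the coprocessor clusters of Table 4.3: the dominant cost sits in standard linear algebra and FFT kernels, which is precisely what vendor-optimized libraries accelerate.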

7.4 Nanoscience and Nanotechnology The Challenge. Research in nanoscience and nanotechnology is inherently multidisciplinary, requir- [ 3f ] ing expertise in physics, chemistry, biology, engineering and material science in order to elucidate the principles underlying the behaviour of nano-objects, and exploiting these principles in order to develop nanodevices and integrate them into ‘real world’ applications. One of the most promising targets in nanoscience and nanotechnology involves integrated ‘soft’ biologically-inspired or synthetic organic nanostructures with inorganic ‘hard’ nanomaterials. It is anticipated that this focus will lead to new and extremely powerful tools and technology platforms with broad application in the life sciences, medicine, materials science, electronics and computa- tion. High performance computing (HPC) is an important component of nanoscience research, enabling the application of theory and modeling to such nanotechnology areas as molecular elec- tronic devices; nanoengineered thin films and photonic devices; bioinformatics, microfluidic devices, and nanodevices for health applications; environmentally friendly chemical processes and energy production. The timely development of advanced HPC infrastructure is thus a crucial factor in the successful growth of these nanotechnology sectors in Canada. Community Strength. The Canadian research community encompasses a wide range of disci- plines, with theoretical modeling and simulation employed to aid in solving crucial problems of nanoscience and nanotechnology. For example, the Nanoelectronics program of the Canadian In- stitute for Advanced Research (CIAR) involves a number of highly accomplished researchers from across the country, including researchers from the National Institute for Nanotechnology (NINT), other National Research Council (NRC) Institutes and universities. A larger portion of the re- search conducted by these researchers involves simulation and modeling. CIAR members have developed novel techniques for simulating and studying current/spin in nanoscale systems, are developing novel platforms for nanoscale devices and are studying systems displaying high tem- perature superconductivity. Development and application of computational methods to research areas that impact the Canadian economy include: nanocatalysis in chemistry and petrochemistry, biomembranes and protein simulations to understand the processes in biological nanostructures,

– 41 – fundamentals of solution chemistry and physico-chemical processes in soft matter nanosystems, molecular theory of solvation and a platform of multiscale modeling of nanosystems, integrated modeling, simulation and visualization of biological systems (project CyberCell), nanoengineered devices for biomedical diagnostics and fuel cell development. The following are two examples of leading-edge developments in the theoretical modeling and simulation of nanosystems. The work described relies critically on the availability of HPC facilities. Example 1 : Simulations of nanoscale devices. To progress beyond traditional silicon-based tech- nologies for computing and sensing, there is a need to develop devices that operate at the nanoscale. New molecular scale devices are envisioned, hybrid organic-silicon devices that have as their cen- tral functional elements a small number of molecules (less than ten) with surface features at which a charge may be localized. The control over these molecules and charges will open the door to new devices that require little power to operate, dissipate little heat and are capable of extremely fast operation. A working model that demonstrates the principles of a single molecule transistor has been built. The present WestGrid facilities, augmented by a 20-node PC cluster, were used to [ 1a,1b ] perform the quantum mechanical simulations that helped to elucidate the operational details of the device. Further developments towards an operational computing element will require computing facilities well beyond that which is presently available to Canadian researchers. The difficulties lie in the fact that the systems under study are only slightly heterogeneous. This means that traditional techniques for studying large systems (e.g. periodic methods) cannot be applied, be- cause a large amount of silicon bulk (10000 atoms) might contain, for example, only one dopant atom, which is responsible for the conductive ability of the bulk. Therefore the ability to perform [ 2a ] quantum mechanical modeling (ab initio calculations) on systems containing tens of thousands of atoms is required. Having access to HPC computational facilities is required to advance the development of molecular scale devices. Example 2 : Theory and Modeling on multiple scales. The platform of theory, modeling, and simulation on multiple length and time scales integrates a set of theoretical areas that treat phe- nomena and properties on a hierarchy of length and time scales: from ab initio methods for the electronic structure at atomic scale to statistical-physical integral-equation theory of solvation and disordered solids, chemical reactions in solution and at interfaces, self-assembly of supramolecu- lar nanostructures, and heuristic models of functioning of biological and engineered nanosystems. This state-of-art theoretical methodology constitutes a modeling platform for such major applica- tions as nanomaterials for catalysis, energy production and conversion; nanomembranes for water purification and treatment; ultrafast gas sensors for industrial safety and control systems; photonic nanodevices for integrated optics; supramolecular nanoarchitectures for chemistry and medicine. We have demonstrated the capabilities of our multiscale modeling methodology by predicting the [ 1a,1b ] formation and tunable stability of synthetic organic rosette nanotubes which constitute a new class of synthetic organic architectures and a novel platform of organic synthesis. 
The calculations were done using the WestGrid facilities and local HPC resources at U. Alberta. With appropriate HPC resources, this theoretical methodology makes feasible the predictive modeling of very large systems and slow processes of high practical interest, such as nanomaterials for energy and ICT, and protein-protein interactions for enzymatic catalysis and biomedical applications.
The HPC requirements adequate to the above research goals can be summarized as follows: [ 2b ]
• At least 50 TFlops (10,000 processors equivalent);
• 20 TB of distributed RAM;
• 2 TB of shared RAM.
It is highly desirable to have at one's disposal different HPC systems that span the set of requirements for particular applications, e.g.: a distributed memory cluster; a shared memory system; floating point coprocessors for massively parallel computations; FPGAs/co-processors for compute-intensive key algorithms.

– 42 – 7.5 Quantum materials The Challenge. The behavior of electrons in solids is most often described by band theory, which treats electrons as if they moved independently of each other. The parameters of band theory (the electronic structure) are obtained from ab initio calculations (see case study 7.3). The indepen- dent electron approximation, central to band theory, fails to apply in a large class of “strongly correlated” materials, such as high-temperature superconductors. Problems involving magnetic impurities within metals, relevant to nanotechnology, are another important example. In general, many advanced “quantum materials” are in some exotic phase of matter that cannot be under- stood within the independent electron approximation and require new numerical methods. The simulation of these problems at very low temperatures typically requires finding the quantum me- chanical ground state of a system made of a small number of electrons (N). This is an eigenvalue problem whose computational size increases exponentially with N, like the dimension of the corre- sponding quantum mechanical Hilbert space. In HPC, exponential problems are often considered impractical; however, in practice, we can respond to this challenge by developing schemes that approximate the solution of a large system by embedding the characteristics of a much smaller system. Community Strength. Canada has a very strong international reputation in the field of strongly correlated electrons. This is in part due to the action of the Canadian Institute for Advanced Research (CIAR) through its program on quantum materials (and formerly through its super- conductivity program). There are strong groups at UBC, Waterloo, Toronto, McMaster and Sherbrooke. Two notable examples of scientific achievements are cited below. HPC Requirements. This field, with its many different methodologies, has a variety of require- [ 2b ] ments. Pushing exponential problems to the limit of feasibility, however, requires large memory access, which is only possible with distributed memory architectures (capability clusters). Quan- tum cluster methods, or methods based on the iterative aggregation of sites within an effective Hilbert space (e.g. the Density-Matrix Renormalization Group, or DMRG), require the solution of a mid-sized eigenvalue problem. In this way, memory requirements are kept at a reasonable level in order to gain speed of execution. For this application, capability clusters are the instruments of choice, although capacity clusters are an entry-level solution for problems requiring parametric studies. Monte Carlo methods (based on a stochastic evaluation of quantum-mechanical averages) are also widely used, and are most economically implemented using capacity clusters. Example A: The screening of a magnetic impurity by conduction electrons. When a magnetic impurity is introduced into a metal, the resulting characteristics of the material, such as the resis- tivity, cannot be described using the independent electron picture. This problem is of paramount importance for the understanding of nanoscale electronic circuits as well as quantum computing and quantum information theory. From the work of Kondo, it is also known that perturbative methods fail and a full description, incorporating all correlation effects between the electrons, is necessary. Numerical modeling has therefore been extremely important for the advancement of this active research field. Initial work by K. 
Wilson, for which he was in part awarded the 1982 Nobel Prize, has developed into the Numerical Renormalization Group (NRG). The DMRG method has been extensively used at SHARCNET as an alternative to the NRG method, since it allows for a convenient way of calculating real-space correlations around the impurity. Both the NRG and the DMRG methods are iterative methods that, at each step, require finding the lowest eigenstate of a sparse matrix whose dimension ranges from thousands to millions. Sparse eigenvalue problems are very well suited to a distributed parallel architecture but require a very high-performance interconnect. On current SHARCNET computers, researchers have reached sizes of 5.7 billion for exact diagonalization (ED) studies of the Kondo problem, using 72 hours on 64 CPUs. These calculations provided the first direct observation of the electrical transport in [ 1a,b ] nanoscale circuits with Kondo impurities and would have been impossible to perform prior to the arrival of the SHARCNET computational facilities. For a single parameter, a typical DMRG calculation

will take 100 hours on 16 CPUs, the only limitation being the per-node memory capacity. This work required roughly 50 000 CPU-hours. However, an even more important factor for DMRG and ED is the per-node memory bandwidth, much more than the CPU clock frequency.
Example B: High-temperature superconductivity with quantum cluster methods. High-temperature superconductors were discovered in the late 1980s. A simple physical model for these materials was proposed (the Hubbard Model), but it proved exceedingly difficult to study and it could not be shown how it explained high-temperature superconductivity. Of interest is the phase diagram of the model, i.e., its properties as a function of parameters such as the density of electrons or the electron-electron interaction strength. Progress has been accomplished in recent years thanks to a class of numerical methods based on the exact quantum-mechanical solution of the model on a small cluster of atoms, together with a smart embedding of the cluster within an infinite crystal lattice. One of these methods, called Variational Cluster Perturbation Theory (or V-CPT), has been used recently by physicists at RQCHP to show that the simple Hubbard Model contains the right elements to explain the basic properties of high-temperature superconductors. The calculation was in fact the first to run on the RQCHP capacity cluster [ 1a,b ] (Summer 2004), requiring roughly 100 000 CPU-hours, and would not have been feasible on a cluster of a few tens of nodes only, let alone on a single computer. The method involves an optimization problem: finding the stationary points of a function Ω(h), where h denotes one or more fictitious fields favoring the establishment of broken symmetry states like antiferromagnetism or superconductivity. Evaluating this function Ω(h) requires solving a mid-size eigenvalue problem, as discussed above, and performing a three-dimensional integral. The limiting factor is speed; the memory requirements are still relatively modest. Performing more realistic calculations will be [ 2a ] best done by working sequentially in parameter space, running each case in MPI. This will require a “capability farm”: a large capability cluster with many MPI calculations running concurrently.
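The memory argument made for ED and DMRG above can be made concrete with a little arithmetic. The sketch below is our own illustration, not taken from the case study: it assumes the quoted size of 5.7 billion refers to the dimension of the sparse matrix being diagonalized, and that a Lanczos-style solver keeps a handful of full-length double-precision vectors in memory at once.

```python
# Illustrative arithmetic for the memory footprint of large exact-diagonalization
# (ED) runs. Assumptions (ours): the quoted size 5.7e9 is the matrix dimension,
# vectors are stored as 8-byte doubles, and a Lanczos-type solver keeps a few
# full-length vectors in memory at once.
dimension = 5.7e9          # size quoted for the SHARCNET Kondo study
bytes_per_entry = 8        # double precision
vectors_kept = 3           # typical minimum for a Lanczos-style iteration (assumption)
cpus = 64                  # CPUs used in the quoted run

total_gb = dimension * bytes_per_entry * vectors_kept / 1e9
print(f"memory for {vectors_kept} vectors: ~{total_gb:.0f} GB "
      f"(~{total_gb / cpus:.1f} GB per CPU over {cpus} CPUs)")

# Adding sites grows the Hilbert-space dimension exponentially (doubling the
# number of spins roughly squares it), which is why per-node memory capacity
# and bandwidth, rather than clock speed, limit how far these studies can go.
```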

7.6 Global and Regional Climate Change The Challenge. The ongoing global climate change caused by the inexorable rise in atmospheric greenhouse gas concentrations constitutes a daunting challenge to the interconnected but hetero- geneous global community. Its impact will be most severely felt on high latitude regions of the northern hemisphere, in particular Canada. A 2005 report to the Prime Minister by the National Round Table on the Environment and the Economy refers to this danger as “perhaps unmatched in times of peace” and suggests that “all Canadians will be touched by climate change impacts” that “pose new risks to human health, critical infrastructure, social stability and security”. The attri- bution of the ongoing changes to increasing concentrations of CO2, CH4,N2O, and halocarbons has been clearly proven by new multidisciplinary numerical approaches that model the coupled evolution of the atmosphere-ocean-cryosphere-land surface processes as an Earth System. The con- tinuing development of such models constitutes one of the Grand Challenge Problems of modern computational science. Climate change computational models, using HPC, will have extremely [ 4b ] profound implications for national energy and environmental policy. Canadian Successes. The Canadian Scientific community continues to play a leading international [ 1a,b ] role in the development and application of large-scale HPC-based numerical Earth System models. For example, significant early stage development of the semi-spectral methodology that serves as basis for the dynamical cores of the atmospheric component of global coupled models took place at the RPN Laboratory of Environment Canada at Dorval (Qu´ebec). The semi-Lagrangian, semi- implicit methodology that is employed to efficiently time-step such models was also developed there. The Canadian Earth System Model at the Downsview, Toronto Laboratory of the Meteorological Service of Canada is based upon this RPN sub-structure, and is now being further developed in the Canadian Climate Centre at the University of Victoria and widely used in Canadian universities. A further extension of the RPN effort to include fully elastic field equations, led by a UQAM scientist,

– 44 – has led to the development of the Canadian Community Mesoscale MC2 model. The Canadian Middle Atmospheric Model (CMAM) that is playing a significant role in the international effort to better understand the stability of the stratospheric ozone layer that shields the surface biosphere from the harmful effects of solar UV-B radiation has been developed by a team led by a SciNet scientist. All of these Canadian built models are playing an important role in the ongoing work of the Intergovernmental Panel on Climate Change (IPCC), which is co-sponsored by the UN and the World Meteorological Organization, and is responsible for assessing the evolving science. Community Strength. A measure of the Canadian research community strength and influence is that the two highest ranked journals in the field, the Journal of the Atmospheric Sciences and the Journal of Climate, both published by the American Meteorological Society, have selected Canadians as their chief editors. A second measure is provided by the number of Canadians currently serving as Lead Authors in writing the 4th Scientific Assessment Report of the IPCC that will appear in 2007. There are ten Canadians involved, with two as Coordinating Lead Authors on two of the eleven chapters. Our national effort is spread over a large number of universities that host significant groups involved in global climate change modeling, including Alberta, Dalhousie, McGill, Toronto, York, UQAM, Victoria and Waterloo. Major national research networks funded by the Canadian Foundation for Climate and Atmospheric Science are currently functioning. These include the Polar Climate Stability Network and the Modeling of Global Chemistry for Climate Network (both led by SciNet PI’s), the Climate Variability Network led by a McGill-based PI, and the Regional Climate Modeling Network led by a UQAM-based PI. HPC Requirements. The requirements of this field are currently being met by parallel vector [ 2b ] systems, based on their dominance in the leading European laboratories (UK, France, and Ger- many). Canadian groups have also opted for smaller systems of this type (U Toronto, UQAM, Victoria). We have determined that alternative architectures do not adequately serve our needs in this area. This was indicated in a letter from seven University of Alberta scientists, which stated: “Although the XXX shared memory architecture is, for climate simulation, distinctly superior to PC clusters, it does not allow particularly good scalability”. To appreciate the magnitude of the computational problems that must be addressed in climate change research it is useful to consider a few specific examples. In describing these examples the NEC parallel vector systems will be employed in order to provide a single basis for comparison. Please note that use of the vendor name does not imply a predetermined purchasing commitment. Project A : Statistical equilibrium state computation under modified climate forcing. [ 2a ] • Model employed: NCAR CCSM 3.0, very low resolution (T31 atmosphere and 3x3 deg. ocean) • Calendar years of integration required: approximately 2000 years. • Machine employed: single node SX-6 with all 8 CPUs dedicated • Wall clock time required: 7 months • A single node SX-8 system operating at an aggregate peak speed of 128 Gflops will complete this job in ∼ 3.5 months, a 20 node system in approximately 5 days. Such work may be done competitively on this system. Project B : Ensemble of 10 transient simulations under changing climate. 
[ 2a ] • Model employed: Extended version of CMAMv8 with chemistry, T47, L95. • Calendar years of integration required: 100 years • Machine employed: single node SX-6 using all 8 CPUs but not in dedicated mode • Wall clock time required: 1 year • A single node SX-8 system operating in dedicated mode will complete this job in ∼ 5 months. A 20-node system will complete this job allocation in about 2 weeks. Such work can be done competitively on such a system.

Project C : Regional scale climate change projections. [ 2a ] • Model employed: GEM-based CRCM version 5, continental domain, 15-km resolution • Mesh comprising a 500 × 500 horizontal grid with 60 levels in the vertical. • Simulation: paired 30-year runs required (one for the reference climate, one for the projected climate). Tens of runs for each to create the ensemble needed to determine significance. • Machine: SX-6 or SX-8. • Wall-clock time for 1 run using 8 nodes of an SX-6: ∼ 2 months. • A 20-node SX-8 system will complete this job in approximately 2 weeks.
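As a rough illustration of how such estimates scale, the sketch below (Python) reproduces the arithmetic implied by the Project A figures. The factor-of-two speedup of an SX-8 node over an SX-6 node and the assumption of near-ideal parallel efficiency are inferred from the numbers quoted above for illustration only; they are not vendor benchmarks.

# Back-of-the-envelope scaling behind the project estimates above:
# wall-clock time on a target system is scaled from a measured baseline
# by the node speedup and the node count, assuming near-ideal parallel
# efficiency (optimistic for real climate codes, where I/O and
# communication reduce scalability).
def scaled_walltime(baseline_months, node_speedup, n_nodes, efficiency=1.0):
    """Estimated wall-clock time (months) on n_nodes of a faster system."""
    return baseline_months / (node_speedup * n_nodes * efficiency)

# Project A: 7 months on one dedicated SX-6 node; an SX-8 node is taken
# to be roughly twice as fast (assumed factor, for illustration only).
one_node = scaled_walltime(7.0, node_speedup=2.0, n_nodes=1)    # ~3.5 months
twenty_nodes = scaled_walltime(7.0, node_speedup=2.0, n_nodes=20)
print(f"1 SX-8 node: {one_node:.1f} months; "
      f"20 nodes: {twenty_nodes * 30:.0f} days")   # ~30 days per month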

7.7 Hydrology The Challenge. Issues such as surface water and groundwater quality deterioration, waste disposal, [ 2a, 4b ] the presence of infectious organisms in drinking water supplies, maintenance of aquatic habitat, and the impact of climate change on water supply require a fully-integrated approach. An ad hoc approach to water resources planning can lead to unpredictable and undesirable environmental consequences that require costly remediation or lead to irreparable damage to the resource. This challenge is internationally recognized, as are the limitations of our current computing capacity to address the pertinent issues. Numerical models that consider both groundwater and surface water quantity and quality in a fully-coupled, holistic fashion are conceptually and numerically challenging, especially where complex reactions occur. Areal watershed models generally represent the surface water components adequately but overly simplify or entirely ignore the dynamics of groundwater. Standard groundwater models, on the other hand, ignore the dynamics of surface water. An integrated 3D surface/subsurface-modelling framework will require the bridging of these scale differences. The development of such models is recognized by the US NSF as one of the “Grand Challenge Problems” of modern computational hydrology and the predictions made with them have extremely important implications with respect to water resources management and environmental policy, especially in the context of sustainable growth and the impact of ongoing climate change. Canadian HPC Successes. The members of the Canadian research community play a leading [ 1a,b ] role in the development and application of integrated numerical surface and subsurface hydrology models. The recently developed 3D control-volume finite element watershed model (HydroGeo- Sphere) is an advanced fully-coupled surface/subsurface flow and contaminant transport model. It incorporates the physical processes of 3D variably-saturated subsurface flow, 2D overland/surface water flow, and multi-component, reactive, advective-dispersive contaminant transport. Its devel- opment is being led by research groups at Waterloo and Laval, and researchers at a number of universities throughout Canada and the World are using it and its precursor, FRAC3DVS. The further enhancement of the model has recently been accelerated through an evolving partnership with the US Bureau of Reclamation (USBOR) and the California Water Department to study a variety of water-related issues within the Central Valley of California. The HydroGeoSphere model has recently led to significant networking opportunities with European researchers. The model is at the core of a large EU Framework VI project which links numerous researchers from 18 EU universities and research institutions to study the impact of non-point-source contaminant inputs, including climate and land-use change, on surface water and groundwater quality and on soil functioning within several key watersheds within Europe, such as the Rhine and the Danube. Community Strength. One measure of this quality is the fact that Canadian hydrogeologists have in the recent past served as editors of the two highest ranked journals in the water resources field: Water Resources Research published by the American Geophysical Union and the Journal of Contaminant Hydrology published by Elsevier. 
Canadian hydrogeologists have also served as leaders of major international scientific societies, including past Presidents of the Hydrology Section of the American Geophysical Union, past Presidents of the Hydrogeology Division of the Geological

Society of America, President of the International Commission on Groundwater of the International Association of Hydrological Sciences, among other high-ranking posts. The Institute for Scientific Information (ISI) recently assembled a list of the 250 most highly cited researchers in the entire field of engineering worldwide: of the 11 Canadians listed, 4 are hydrogeologists. HPC Requirements. The computational requirements of integrated surface/subsurface hydrological modeling are best met by tightly coupled parallel computer systems, as is the norm in the leading European laboratories (UK, France, Germany). Alternative architectures, such as loosely coupled clusters, do not serve our current needs, let alone the demands of 3D simulations at the basin or continental scale over time frames of hundreds to thousands of years. To appreciate the magnitude of the computational problems that must be addressed in coupled surface/subsurface hydrological research, it is useful to consider a few specific examples. Leading Projects with HPC Requirements. Project A - Canadian landmass-scale computation of groundwater flow system evolution under [ 2a ] glacial cycling. Over the last two years, Canadian scientists have been conducting an application of HydroGeoSphere that entails the 3D simulation of the coupled surface and subsurface flow regimes for all of Canada in a fully integrated manner, driven by paleoclimate, in particular climate-induced glaciation and deglaciation of the North American continent. This work entails asynchronous linkage of HydroGeoSphere with a comprehensive, climate-driven dynamical model of the advance and retreat of the Laurentide ice sheet. The wall-clock time for each 120 K-year simulation on a 16-processor IBM RISC-based, shared-memory machine is on the order of two weeks, even after linearization of the governing flow equations and a de-coupling of the surface water flow regime. Through this effort, we are now poised to address the impact of future climate change on Canada's water resources at a national scale, but require substantially expanded computing resources. We estimate that a single one-hundred-year simulation driven by future climate-change scenarios, and with full coupling of the surface and subsurface flow systems, would require several months of CPU time on existing SHARCNET facilities. Project B - Ensemble of 100 Monte Carlo transient simulations of high-level radioactive waste [ 2a ] repository performance under changing climate. The FRAC3DVS code has been adopted by the Canadian spent nuclear-fuel geoscience program to investigate the safety case for the disposal of these wastes in a Canadian Shield setting. Inclusion of complex 3D fracture-zone networks embedded in the host rock, along with the influence of dense groundwater brines at depth and an accounting of the effects of glacial loading/unloading over 100 K-year time frames, requires on the order of 3 days of CPU time per Monte Carlo realization on a 3.5 GHz PC. About 100 realizations are needed to capture the prediction uncertainty with any realism. Project C - Groundwater discharge to coastal environments. Performed in 3D, a single simulation [ 2a ] of density-dependent flow in basins on the scale of 100 km² can have wall-clock times of 3 to 4 months (WestGrid statistics).
This forces us to simplify the problem and consider only 2D representations of the flow, even though significant elements of the fluid motion are inherently three-dimensional.
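The Monte Carlo workload in Project B is naturally suited to a capacity cluster: each realization is independent and long-running. The sketch below (Python) shows the farming pattern and the associated wall-clock arithmetic; run_realization is a hypothetical stand-in for launching a FRAC3DVS run, and the per-run cost, ensemble size and worker count are the figures quoted above, used for illustration only.

# Sketch of the capacity-style workload in Project B: many independent
# Monte Carlo realizations, each a long serial simulation, farmed out
# across workers.  run_realization only reports the nominal cost here;
# a real run would launch the solver with perturbed inputs.
from concurrent.futures import ProcessPoolExecutor

CPU_DAYS_PER_RUN = 3      # ~3 days per realization (figure from the text)
N_REALIZATIONS = 100      # ensemble size quoted above
N_WORKERS = 20            # concurrent capacity-cluster workers (assumed)

def run_realization(seed: int) -> float:
    """Placeholder for one transient simulation with perturbed inputs."""
    return CPU_DAYS_PER_RUN

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=N_WORKERS) as pool:
        costs = list(pool.map(run_realization, range(N_REALIZATIONS)))
    total = sum(costs)
    print(f"total ~{total:.0f} CPU-days; "
          f"~{total / N_WORKERS:.0f} days wall-clock on {N_WORKERS} workers")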

7.8 Aerospace The Challenge. Over the past thirty years, the aerospace design process has gradually been revolutionized. It has transformed from a primary reliance on testing to become heavily supported by computational analysis. This change has been most notable in the field of computational fluid dynamics to predict complex turbulent flows, which has greatly reduced the dependence on expensive wind tunnel and flight tests. The development of reliable, efficient and accurate computational tools have greatly reduced the design cycle time and led to more efficient and safer aircraft and engines. Nevertheless, there remain numerous challenges, many of which will be progressively addressed through the use of future capabilities in HPC. The design of low-emission

engines depends on a better understanding of the interaction of chemistry and fluid dynamics in turbulent combusting flows. Similarly, further reductions in aerodynamic drag are essential in order to reduce fuel consumption and emissions. The development of advanced concepts to reduce drag, such as flying-wing configurations, adaptive wings, and active flow control devices, is highly dependent on high-performance computing. Recently, high-fidelity predictive capabilities have been used together with optimization algorithms to produce optimal designs. While much current [ 3f ] research in high-fidelity optimization is concentrated on a single discipline, such as aerodynamics, in the not-too-distant future one can envisage multi-disciplinary optimization of complete aircraft and rotorcraft. This would be an enormous computational challenge, highly dependent on HPC. Canadian Successes. Canada's aerospace industry, ranking fourth worldwide, is a key contributor to the national economy. It is Canada's leading advanced-technology exporter and employs 80,000 workers. Canadian companies are world market leaders in regional aircraft, business jets, commercial helicopters, small gas turbine engines, and flight simulators. The larger companies, such as Bombardier Aerospace, Pratt & Whitney Canada, Bell Helicopter and CAE, invest heavily in R&D partnerships with HPC-based Canadian universities, and lately in unison through consortia such as CRIAQ or through multi-sponsored NSERC Industrial Research Chairs. A growing number of [ 1g ] multidisciplinary applications of computational fluid dynamics have been developed, and industry has made extensive use of such tools in the design process. For the Canadian aerospace sector [ 4a ] to remain competitive, it will have to continually shorten design time, which will be facilitated greatly by advances in computational techniques and HPC. The Community Strength. Several Canadian researchers are world leaders in the development and application of computational algorithms for aerospace applications. Members of these research groups are international journal editors, have edited and authored books in the field, and are often invited as keynote conference speakers. These researchers have a long history of developing [ 1a,b ] computational fluid dynamics (CFD) software and, notably, four major codes, StarCD, CFX, VSAERO and FENSAP-ICE, are spin-offs from Canadian researchers who developed them at UBC, Waterloo, RMC and McGill. As a measure of this success, it would be difficult to find a major aerospace corporation anywhere in the world today that is not using at least one of these four software packages. Canadian academic-industry projects have led to hundreds of joint scientific publications and are a testimony to the originality and applicability of the research. Overall, the Canadian research community has an outstanding track record in computational methods for aerospace applications and will continue contributing significantly to this sector. HPC Requirements. Computational analysis to optimize the design of aerospace vehicles and [ 2a ] engines requires very large computing resources. Escalating international scientific challenges will continue to strain and exceed our HPC capabilities and can only be met with continuous upgrades to our facilities. • For example, a solution of the steady flow field about an aircraft requires well over ten million mesh nodes for sufficient resolution.
There are at least six degrees of freedom per node, leading to systems of linear equations with over sixty million degrees of freedom and solution times ranging from ten to several hundred hours on a single processor. • Optimization of an aircraft based on such analyses will require several hundred flow solutions at ten to twenty different operating points. Such an optimization will require 10⁴ to 10⁵ processor-hours (see the worked estimate at the end of this section). • If one now adds the complexity of the fourth dimension, time, in order to realistically account for unsteadiness, be it from rotor-stator interaction in a multi-stage jet engine or from the effect of propellers and rotors on aircraft and rotorcraft, solution times can easily be 5 to 10 times higher than for the steady state. • The next step in complexity would be enriching the physical models by migrating to more complete turbulence models, such as large-eddy simulation and, certainly not in the near

future, direct numerical simulation. That step is sure to add another factor of 5 to 10 to the calculations. • Following this, multidisciplinarity must be addressed in order to streamline the currently lengthy [ 3f ] and inefficient sequential nature of aerospace design, such as aerodynamics – structural analysis – stability and control – acoustics – icing. Putting two disciplines together is not additive, as most interactions are highly nonlinear and can easily quintuple the overall solution time, but it yields significant savings in terms of design time and manpower, added performance, and enhanced safety. One example is fluid-structure interaction, which is becoming the norm rather than the exception in industry. The simulation of in-flight ice accretion on an aircraft requires four successive complete solutions, for impingement, accretion, anti-icing and performance degradation, thereby quintupling the resources needed for a flow analysis. Of similar complexity is predicting the propagation of noise generated by an aircraft, rotorcraft or engine. This requires spectral resolutions that capture pressure signals 6 to 7 orders of magnitude smaller than the flow itself, giving solution times on the order of days if not weeks. With a high-performance, high-memory parallel capability computer consisting of one to [ 2b ] a few thousand processors, the above tasks translate to wall-clock times of tens to hundreds of hours, assuming good parallel scalability. It is thus critical that the Canadian aerospace research community have the computing resources needed to develop the algorithms required to tackle current problems.
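The worked estimate referred to in the list above is sketched below (Python). All constants are taken from the ranges quoted in the bullets, using the low end of each range; the result is an order-of-magnitude illustration, not a benchmark.

# Arithmetic behind the optimization bullet above (illustration only).
mesh_nodes = 10_000_000                # "well over ten million mesh nodes"
dofs = 6 * mesh_nodes                  # six unknowns per node -> ~6e7 DOF
flow_solutions = 300                   # "several hundred" per optimization
operating_points = 10                  # ten to twenty in practice
hours_per_solution = 10                # ten to several hundred, per CPU

print(f"{dofs:,} coupled unknowns per steady flow solution")
cost = flow_solutions * operating_points * hours_per_solution
print(f"~{cost:,} processor-hours at the low end of the quoted ranges")
# Taking the upper ends of the ranges pushes this toward 1e5
# processor-hours and beyond, hence the 10^4 to 10^5 figure quoted above.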

7.9 Computational Biology The role of HPC in Computational Biology. HPC has become an essential tool for biological and biomedical research. We briefly describe some examples: [ 2a,2b ] 1. Genome sequencing projects are revealing the massive catalogue of genes in organisms ranging from yeast to man. We face major challenges in assigning functions to the many thousands of uncharacterized genes and then using this information to identify the proteins that work together to control cellular processes. 2. Genetic studies of complex human diseases now test over 500,000 genetic markers per subject, such that a dataset for a current colon cancer study in Ontario/Qu´ebec/Newfoundland involves over 1.5 billion genotypes in 5,000 subjects. The search for gene-gene and gene-environment studies require huge computational power, in order to develop a set of genetic and environmental risk factors that can be used in cancer screening programs. Genetic studies will embrace new sequencing technologies that promise full genome sequencing of thousands of individuals in disease studies, including complete studies of tumors, and evaluation of epigenetic changes across tissues. These projects will dwarf previous HPC needs of the Human Genome Sequencing Project. 3. High-throughput microarray technologies are enabling studies of all genes of over 200 species. Only with HPC can we look for optimal structure and parameter values for networks of 25,000 genes, and rapid identification of specific functional interactions between genes. 4. New advances in high-resolution, high-throughput tools such as mass spectrometry are making it possible to rapidly identify and characterize proteins. Spatial and temporal mapping of protein-protein interactions in live cells has emerged as a powerful platform technology requiring the acquisition, storage, and analysis of large (100’s of TB) image datasets. 5. Proteins are dynamic, undergoing rapid changes between various conformational states. Un- derstanding a protein’s biological function requires a first-principles understanding of how these interconversions occur since the biochemistry of diseases cannot be deduced from static folded structures alone. New efficient parallelization algorithms, innovative software, and rapid ad- vances in high-performance computing hardware are enabling technologies for atomistic- and coarse-grained molecular dynamics (MD) simulations of large, complex systems and processes, including protein folding and membrane protein dynamics.

Canadian HPC Successes. Canadian HPC successes in computational biology are many, including [ 1a,1e,1f ] new efforts in distributed computing such as the Trellis Project (see p. 20), which enables the creation of metacomputers from geographically distant HPC infrastructure. In 2004, the Trellis project engaged over 4000 computers nationwide to study two problems of significant biological importance: (1) protein folding (Chan: Toronto; Tieleman: Calgary); (2) proton transport in biological membranes (Pomes: Toronto). Community Strength. Canadian researchers are world-class in the design and implementation of high-performance computing in computational biology. These include membrane biophysics simulations and methods development for membrane simulations (Tieleman: Calgary), theoretical and computational approaches to protein folding and conformational properties of biomolecules (Chan: Toronto), theoretical methods applied to the structure, function and dynamics of biological macromolecules (Pomes: Toronto), complex trait genetics (Hudson: McGill) and functional genomics (Hughes, Frey: Toronto; Rigault, Corbeil: Laval). Indeed, the application of high-performance computing in biology has spawned numerous initiatives, including the Canadian Bioinformatics Workshops. HPC Requirements. The HPC needs of the computational biology community, and indeed the biological [ 2b ] sciences community in general, are diverse and demanding. These range from immense storage (100's of TB) required to archive information derived from massively parallel high-throughput microarray experiments and all-atom molecular dynamics simulations of protein folding and membrane dynamics, to large (1000's of nodes) cluster computing systems for proteomic and genomic data analysis, complex molecular dynamics simulations of biomolecular systems comprising tens of thousands of atoms, and machine-learning algorithm development. Example. In view of the long-standing paradigm that structure begets function, it is critically important that the physico-chemical forces that underlie the three-dimensional structures of folded proteins be resolved. Recent experimental advances point to an even greater challenge: the biological functions of proteins often rely on large-scale conformational fluctuations. Hence, knowledge of the folded structure of a protein is sometimes insufficient for deciphering, let alone understanding, its function. Moreover, many proteins are found to function in intrinsically unfolded forms. Thus, gaining insight into the dynamic and ensemble properties of proteins is crucial. Computational simulations are a powerful means of addressing such issues and their implications for maladies such as Alzheimer's and prion diseases, which are among an increasing number of diseases found to arise from protein misfolding and aggregation. Simulations of protein folding require extensive investigation of the available conformations using appropriate inter-atomic and inter-molecular interaction potentials. Herein lie the key challenges: • How should the effectiveness of these empirical potentials be improved? • How should they be systematically derived and evaluated? • How should these potentials be applied, in a computationally tractable manner, to provide insights about the relative importance of a protein's different conformations? Answering these questions at either the atomistic or even the coarse-grained level requires significant computational resources.
One needs to consider not only interactions within the protein but also those with its surroundings (i.e., solvents such as water), while simultaneously formulating detailed analytical approaches to modelling the interaction potentials between specific functional groups. The potential payoff is tremendous: efficient modelling software coupled with realistic interaction potentials will allow computational biologists to rapidly predict interactions between proteins, analyse mass spectrometry data, and elucidate the structure of gene products simply from sequence.
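As a concrete, if highly simplified, illustration of the empirical potentials referred to above, the sketch below (Python/NumPy) evaluates a Lennard-Jones pair potential over all atom pairs of a single random configuration. The parameters, system size and coordinates are invented for illustration; real biomolecular force fields add bonded terms, electrostatics and solvent, and must evaluate forces as well as energies for millions to billions of configurations.

# Minimal sketch of an empirical pair potential: Lennard-Jones energy
# over all atom pairs of one conformation.  Parameters are made up.
import numpy as np

def lennard_jones_energy(coords, epsilon=0.2, sigma=3.4):
    """Total LJ energy (arbitrary units) for an (N, 3) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]     # pairwise vectors
    r = np.sqrt((diff ** 2).sum(-1))                   # pairwise distances
    iu = np.triu_indices(len(coords), k=1)             # count each pair once
    sr6 = (sigma / r[iu]) ** 6
    return float(np.sum(4.0 * epsilon * (sr6 ** 2 - sr6)))

rng = np.random.default_rng(0)
atoms = 2_000                                  # a small protein-sized system
coords = rng.uniform(0.0, 60.0, size=(atoms, 3))
print(f"{atoms * (atoms - 1) // 2:,} pair interactions per energy call")
print(f"E = {lennard_jones_energy(coords):.2f} (toy configuration)")
# A folding simulation repeats this evaluation (plus forces) for very
# many configurations, which is what drives the HPC demand.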

7.10 Brain Imaging

– 50 – The Challenge. Brain imaging brings together physical scientists (physicists, chemists, computer [ 3f ] scientists, engineers) and neuroscientists (psychologists, molecular biologists, clinical researchers) in the study of the living brain, both in normal brain and disease. A range of brain scanner techniques collect 3D and 4D data about the brain’s structure and function, investigating the mechanisms which underlie: 1. Dynamic changes in the brain from birth through to old age; 2. Normal mechanisms of perception, language, memory, motor skills, cognitive skills; 3. Brain disorders: Alzheimer’s disease, Stroke, Multiple Sclerosis, Schizophrenia; 4. Societal ills: drug addiction, stress, alcoholism, depression. In the last 20 years, the field has exploded as computational advances allow complex analysis of raw imaging data in human brain research. Recently, there have been rapid increases in animal brain imaging where, for example, we can measure the impact of genotype manipulations (gene knock-outs, transgenics) on brain development (the phenotype) in rodent models. This “genotype- phenotype” research is the next significant challenge beyond the Human Genome Project and will require massive computational resources. Magnetic resonance imaging (MRI) assesses brain anatomy in exquisite detail, allowing us to measure changes in normal tissue or disease over time. Positron emission tomography (PET) measures brain chemistry such as the activities of dopamine, serotonin and other neurotransmit- ters. Functional MRI (fMRI) detects physiological changes like blood flow in different parts of the brain and allows researchers to identify areas engaged in specific processing like pain perception or memory. Magneto-encephalography (MEG) detects electrical activation in a brain over time, giv- ing information on the temporal ordering of events, how different brain regions communicate with each other. MEG tells us “when” an event occurs but not “where” PET and MRI tell us “where” and “how much” but not “when”. The great benefit of advanced brain computational imaging, and indeed the next challenge, is to localize focal physiological changes and their interactions. These multi-modal tools will be vital for the assessment and the development of therapies relat- ing to neurodegenerative diseases, and in particular diseases associated with the aging Canadian population. History. Since the beginning of brain imaging in the early 1980’s, Canadian researchers have been major leaders. There are world-renowned centres in Montr´eal, Toronto, Vancouver and London. At the McConnell Brain Imaging Centre (BIC) in Montr´eal, a community of 150 physical, computational, neurobiological scientists and trainees investigate the normal brain and a variety of disorders in humans and in animal models. The BIC hosted the leading international meeting in the field in 1998. (2000 attendees). The BIC’s innovative use of computational analysis was recognized by a Computerworld Smithsonian Award and a permanent archive in the Smithsonian Institute (www.cwheroes.org, 1999 Science category). The BIC is a hub for a large-scale data analysis for many international projects (e.g. a 30 M$ NIH multi-centre project to study pediatric brain development). The BIC has received numerous CIHR Group Grant awards, major NIH consortia awards and a 35 M$ CFI award in 2000 to re-equip the brain scanner infrastructure. HPC Requirements. 
Functional images (fMRI, PET) measure brain physiology, including blood [ 2a ] flow, neurotransmitter/receptor density and brain networks that are "activated" during task performance (vision, language, memory, pain, etc.). Serial 3D imaging creates 4D datasets as the brain responds to pharmacological/cognitive stimuli. Small changes in physiology identify the regions involved, and the interaction between "activated" regions changes over time through learning, development and disease. Characterizing the correlated networks underlying a particular behavior is a massive computational undertaking, requiring N² correlation analyses across the N voxels⁸, where N ∼ 10⁶.

8 A voxel is to a 3D data set what a pixel is to a 2D image.

Similar strategies are employed for structural imaging. Much of the analysis involves iterative deformation of 3D volumes (∼ 10⁶ voxels) or folded 2D cortical manifolds (∼ 300K vertices). Tensor maps of fiber tracts in the brain are extracted using 3D path analysis. Subtle anatomical signals are detected by applying these analyses to large numbers (1000's) of brain datasets. Data analysis requires compute-intensive image processing techniques, all in 3 or 4D: 1. Tensor analysis of fiber pathways throughout the brain; 2. Non-linear 3D image deformation to map each brain to a common coordinate space; 3. Segmentation of the 3D images to assign an anatomical 'label' to each voxel; 4. Finite-element modeling of the cortical surface's folding patterns; 5. Statistical detection of voxels with significant changes in structural or functional signal; 6. MRI/PET analysis using a Bayesian formalism and differential equations at each voxel; 7. Simulation experiments that model data acquisition using Monte Carlo techniques. These steps are fully automated and are performed within a "pipeline" environment that processes large image databases for local researchers and academic collaborative networks. These pipelines also apply in the commercial arena. Clinical trials of new pharmaceuticals traditionally employ subjective clinical assessment of patient status; imaging allows objective, in-vivo, quantitative assessment of disease progression and treatment impact. In 1997, the BIC completed the first fully-automated analysis of a phase III clinical trial image database (6000+ 3D datasets from 14 international centres). The major limiting factor in brain imaging research is computational capacity. The scanners [ 2b ] generate very large amounts of data (present BIC capacity is 25 TB). We cannot keep up with the computational demands of complex and iterative 3D voxelwise analysis on each of perhaps thousands of brains. Many important questions have to be addressed at lower spatial resolution or not explored at all. Canadian brain imaging researchers need continuous access to high-end HPC to maintain their position at the forefront of the international brain imaging research field.
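A minimal sketch of the voxelwise correlation kernel underlying the network analyses described above follows (Python/NumPy). The dimensions are toy values chosen for illustration; at whole-brain resolution (N ∼ 10⁶ voxels) the full N × N correlation matrix cannot even be stored on a single node, which is why such analyses are tiled and distributed on HPC systems.

# Sketch of voxel-wise correlation analysis: the time course of every
# voxel is correlated against one seed voxel (a single row of the full
# N x N problem).  Toy dimensions only.
import numpy as np

n_voxels, n_timepoints = 100_000, 200          # toy 4D fMRI run
rng = np.random.default_rng(1)
data = rng.standard_normal((n_voxels, n_timepoints)).astype(np.float32)

# z-score each voxel's time course, then one matrix-vector product gives
# the correlation of every voxel with the chosen seed.
z = (data - data.mean(1, keepdims=True)) / data.std(1, keepdims=True)
seed = z[42]                                   # arbitrary seed voxel
corr_with_seed = z @ seed / n_timepoints       # shape (n_voxels,)

full_matrix_gb = n_voxels ** 2 * 4 / 1e9
print(f"one seed map: {corr_with_seed.shape[0]:,} correlations")
print(f"full {n_voxels:,} x {n_voxels:,} matrix would need "
      f"~{full_matrix_gb:,.0f} GB")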

7.11 Large text studies The role of HPC in Textual Studies. We live in an age of excess information. In a study titled, “How Much Information?” researchers estimated that 5 exabytes of new information are being generated each year.9 Google, before they stopped advertising how many pages they had indexed, could search 8,168,684,336 Web pages. The Wayback Machine (Internet Archive) claims to have 55 billion web pages archived for browsing. In short, researchers face an excess of information much of which is in textual form and we need to develop new tools for analyzing large-scale collections that are too large to read in traditional ways. Researchers that work with textual evidence from literary studies to law need HPC to help create custom aggregations for focused research that can be combined with the analytical tools that can handle large-scale text datamining. One way to think of the problem is to look at the scale of text collections that research tools were designed to handle. Early concording tools developed in the 1960s and 1970s like OCP (Oxford Concordance Program) and Micro-OCP were designed to create a concordance for a single coherent text. These programs were batch programs that produced a concordance that could be formatted for printing. Tools like TACT and TACTweb, both developed in Canada, were designed to handle single texts or collections of works like a corpus of work by the same author. Today, thanks to the breadth and energy of digitization projects, there is on the Web an exploding amount of information of interest to humanities scholars. Whether you are interested in 18th century literature or an aspect of popular culture and discourse like discussion of the “cool” among teens there is an excess of information now in digital form that can be gathered and studied. Some

9 See www.sims.berkeley.edu:8000/research/projects/how-much-info-2003/

– 52 – large-scale text databases have emerged like CanLII, ARTFL and the Women Writers Project. These have been built with custom retrieval tools optimized for their size. What we don’t have is generalized tools for gathering and studying large-scale collections or ways of using those enormous collections like Google that have emerged. Canadian HPC Successes. Humanities researchers and others concerned with textual evidence [ 1a,1b,1e ] in law, library science and informatics are at the forefront of the field of computer assisted text analysis. The TAPoR (Text Analysis Portal for Research) project, which involves researchers at 6 Canadian universities, is leading the development of digital collections and innovative techniques for analyzing literary and linguistic texts and a community portal that provides open access to web service text tools (see www.tapor.ca). The Dictionary of Old English project and the LEME (Lex- icons of Early Modern English) project are unique historical dictionary projects providing founda- tional tools for the study of historical texts (see www.doe.utoronto.ca/ and leme.library.utoronto.ca). LeXUM, the Universit´ede Montr´eal’sjustice system technologies laboratory, is a leader in the field of legal data processing. LexUM is a partner in the development of CanLII, a large-scale database of legislative and judicial texts, as well as legal commentaries, from federal, provincial and terri- torial jurisdictions on a single Web site (see www.lexum.umontreal.ca and www.canlii.org). Community Strength. The Social Sciences and Humanities currently includes over 18,000 full-time faculty who represent 54% of the full-time professors in Canadian universities. This community depend for their research on scholarly texts from primary sources to journals. While most use electronic resources in some form few use computer assisted text analysis and even fewer use HPC. Focused research groups like TAPoR are the link with HPC consortia like SHARCNET. TAPoR is a national project involving 6 universities that is proposing to expand to 9. It involves the major players in textual computing and has the expertise to bridge text research and HPC. HPC Requirements. Data empires or large-scale collections, from the perspective of computing humanists, have the following features: • They are too large to handle with traditional research practices. • They are heterogeneous in format. • They are often multimedia rich in that they are not just texts in the sense of alphabetic data meant to be read but include page images, digital audio, digital video and other media objects. • They are not coherent in the sense that they cannot be treated as corpora with an internal logic the way one can treat all the writings of a particular author. The challenge is how to think about such large collections and use them in disciplined research. The HPC challenges include developing: • Ways of gathering and aggregating large-scale collections for the purpose of study especially when distributed. We have to develop virtual aggregation models. • Ways of understanding the scope of these custom aggregations so that one can orient research questions to them. We need to be able run mining tools on distributed aggregations. • Ways of asking deep questions of these aggregations. We need to be able to run analytical tools on subsets of these aggregations. These types of problems are not new to HPC. 
Indexing, retrieval and datamining on large collections of data have been the subject of research by groups such as the Automated Learning Group at the NCSA. What is new is adapting these techniques, often developed for business or biomedical data, to textual research in history and human culture. It is not the case that there are "grand problems" in the arts and humanities simply waiting for text mining techniques; the very questions change when you ask about heterogeneous and incoherent collections. The most dramatic challenge these text empires pose is to our understanding of what research in the humanities is. We are at a threshold where we can remain the disciplines of the book (and archive) or

become the disciplines of human art and expression. If we are about expression, we have to adapt our research practices and questions to a different scale of information. We will have to learn to [ 2b ] use mining, visualization, and statistical techniques suited to the scale of information in the digital age. The textual disciplines may end up being the major user of HPC.
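To ground the discussion in a concrete operation, the sketch below (Python) implements a minimal keyword-in-context (KWIC) concordancer, the basic building block of concording tools such as OCP and TACT mentioned above, written as a function that could be mapped over the many documents of a distributed aggregation. The sample text and keyword are placeholders.

# Minimal keyword-in-context (KWIC) concordancer.
import re

def kwic(text, keyword, width=30):
    """Yield (left context, keyword, right context) for each match."""
    for m in re.finditer(rf"\b{re.escape(keyword)}\b", text, re.IGNORECASE):
        left = text[max(0, m.start() - width): m.start()]
        right = text[m.end(): m.end() + width]
        yield left, m.group(0), right

sample = ("Researchers face an excess of information, much of it textual; "
          "new tools must handle information at a very different scale.")
for left, hit, right in kwic(sample, "information"):
    print(f"{left:>30} [{hit}] {right}")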

7.12 Collaborative Visualization The Problem and its Importance. Visualization has been widely recognized as a critical component of modern computational science. The US NIH/NSF report on Visualization Research Challenges (2005) states “Visualization is poised to break through from an important niche area to a pervasive commodity... ” This break through will be driven by the need for computational scientists to understand increasingly complex data. During the past 20 years the world has experienced an ?information big bang?. New information produced in the years since 2003 exceeds the information contained in all previously created documents. Of the information produced since 2003, more than 90% takes digital form, vastly exceeding information produced in paper and film forms. Among the greatest scientific challenges of the 21st century is to effectively understand and make use of this vast amount of information. If we are to use information to make discoveries in science, engineering, medicine, art, and the humanities, we must create new theories, techniques, and methods for its management and analysis. The goal of visualization is to bridge the gap between the data, the computation that produces the data, and the human consumers of the data, providing the ability for researchers to share complex visualizations, control remote computations, and interact with both their visualization and computational environments in an intuitive and natural manner. HPC Requirements. Visualizing large, complex data sets requires the tight coupling of data [ 2b ] storage, computation, and graphics hardware. Computational simulations that produce millions of data point points per time step, running over many time steps, produce vast amounts of data. Visualizing such time varying data can require memory in the 10s of gigabytes (GB). In addition, the extraction of graphics primitives from raw data and the processing of those graphics primitives into an effective visualization require the use of both parallel computation and parallel graphics in the form of CPU and GPU (graphics processing units) clusters. As data sets scale, so must the computation and graphics. Canadian Successes. In addition to pure and applied visualization research, there is a wide range of research projects in Canada that make extensive use of visualization as a critical tool in support of their research. Example projects are described below: Collaborative Computational Steering. The AMMI group at U. of Alberta, in collaboration with [ 1a,b ] ACE group at SFU, has developed an advanced computational steering, visualization, and collab- oration environment for computational science. Driven by the visualization needs of research on the hydrodynamics of the earth’s core by Dr. Moritz Heimpel at U. of Alberta, this project has expanded to include a wide range of scientific domains including the simulation of CFD and of biological cellular processes (Project CyberCell). These projects, with increasing data set sizes, require both increasingly capable visualization algorithms as well as visualization hardware to enable researchers to interactively explore these complex simulations. This research has recently expanded to include participation in the Global Lambda Visualization Facility (GLVF) project, an international testbed for advanced visualization, collaboration, and computational steering (www.evl.uic.edu/cavern/glvf/). 
This participation is being facilitated by the excellent collabora- tion/visualization infrastructure provided by WestGrid (www.westgrid.ca/collabvis). Parallel Visualization. Research at the SCIRF lab (scirf.cs.sfu.ca) at SFU targets the development [ 2a,b ] of new interactive visualization algorithms for large data through data compression, hierarchical and adaptive visualization methods, parallelization, and leveraging the power of GPUs. In partic- ular, developing new algorithms for data pre-processing and visualization directly on GPUs can

– 54 – provide a drastic increase in algorithm performance. Such computations are currently limited to lab systems that have relatively small numbers of processors and GPUs and as a result we are restricted to data set sizes of a few GB. To cope with increasing data set sizes, we are faced with the need to scale these computations to large parallel GPU clusters. Such systems do not currently exist, but their availability, through the NPF program, would enable us to implement interactive visualization algorithms for data set sizes that are currently not possible. Canadian Centre for Behavioural Neuroscience (CCBN). The development of new imaging tech- nology applicable to biological systems has spurred a revolution in neuroscience (See Case Study 7.10). The field stands on the brink of being able to see unprecedented anatomical detail, levels of key signaling molecules, and dynamic changes related to information processing, progression of brain pathology, and the evolution of therapeutic effects. The most effective mechanism to deal with this increase in imaging data is through image analysis and visualization. The application of visualization technologies improves the ability to observe molecular signals in spectroscopic imag- ing approaches in optical-UV to infrared wavelengths, improves the time to capture and analyze images, and improves the ability to observe specific cell types. For researchers at CCBN (Uni- versity of Lethbridge), these technologies are critical in the areas of functional brain imaging, cell specific imaging, and the detection/evolution of neuropathology. Biochemistry and Medical Genetics. To understand the three-dimensional (3D) organization of [ 2a,b ] the mammalian nucleus in normal, immortalized and tumor cells, visualization of the nuclei is key. Previous approaches using two-dimensional (2D) imaging have obvious limitations; nuclear structures that are found in various focal planes cannot be properly displayed and therefore not properly visualized. The result is that the interpretation of the data is incomplete. In the worst case scenario, the interpretation is wrong. This is a limitation for basic and clinical research. Three- dimensional imaging, coupled with visualization of objects within the 3D space of the nucleus, has enabled researchers to view the organization of such objects in a qualitative manner. For the first time, objects are viewed in the 3D space of the nucleus with high accuracy and the interpretation of the data is unbiased and objective. Researchers are able to assess parameters, such as spatial relationship of objects to each other, overall distribution of multiple objects, and their relative sizes. None of the above is possible without visualization. Dr. Sabine Mai at the University of Manitoba represents the 3D technology node for Canada in international collaborations with the Netherlands, Italy and the US. X-Ray Crystallography. Researchers at McMaster University’s Analytical X-Ray Diffraction Fa- [ 1a,b,2b ] cility use 2D CCD detectors for collecting diffraction data on single crystals (Chemical Crystallog- raphy) for molecular structure analyses and on polycrystalline solids (alloys, thin films, polymers, etc.) for crystallite orientation distribution (texture) analyses. Until recently, researchers have not been able to view the data set as one complete diffraction volume. 
With emerging visualization software and hardware technologies, researchers can now examine the full 3D diffraction pattern and all pole figures in a single visualization. For example, for single-crystal data, a regular series of Bragg diffraction spots (points) is used to solve the structure. Interesting packing disorders result in diffraction intensity between the spots, and these new visualization techniques give critical insights into such unusual crystal packings. Physics-Based Simulation of complex phenomena. Researchers at the Computer Vision and System [ 2b ] Laboratory (U. Laval) and non-academic partners are developing a platform for physics-based simulation coupled with distributed immersive visualization environments (a 4-wall CAVE and other systems). The research aims to combine complex geometric and photometric models of urban areas with GIS information and real-time physics-based simulation to train search-and-rescue teams.
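As a back-of-the-envelope complement to the HPC Requirements paragraph earlier in this case study, the sketch below (Python) sizes the memory footprint of a time-varying simulation output. All figures are assumed values chosen to match the "millions of data points per time step" scale quoted there.

# Memory-sizing arithmetic for time-varying visualization data
# (assumed, illustrative figures only).
points_per_step = 5_000_000     # grid points per time step (assumed)
values_per_point = 8            # e.g. velocity, pressure, scalars (assumed)
bytes_per_value = 4             # single precision
time_steps = 500

per_step_gb = points_per_step * values_per_point * bytes_per_value / 1e9
total_gb = per_step_gb * time_steps
print(f"{per_step_gb:.2f} GB per time step, "
      f"{total_gb:.0f} GB for a {time_steps}-step run")
# Holding even a few hundred steps for interactive exploration quickly
# reaches the tens-of-gigabytes range noted in the HPC Requirements.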

– 55 – APPENDIX : Glossary

ACEnet  Atlantic Computational Excellence Network
CFI  Canada Foundation for Innovation
CIHR  Canadian Institutes of Health Research
CITA  Canadian Institute for Theoretical Astrophysics
CLUMEQ  Consortium Laval, Université du Québec, McGill and Eastern Québec for HPC
DFO  Department of Fisheries and Oceans
HPCVL  High Performance Computing Virtual Laboratory
HQP  Highly qualified personnel (PDFs, analysts, graduate students, etc.)
IOF  Infrastructure Operating Fund (CFI)
LRP  Long Range Plan for HPC in Canada, published through C3.ca
MFA  Major Facilities Access program (NSERC)
MITACS  Mathematics of Information Technology and Complex Systems
MRP  Major Resource Provider (now synonymous with a consortium)
MSC  Meteorological Service of Canada
NCE  Networks of Centres of Excellence (NSERC)
NIC  National Initiatives Committee, authoring this NPF proposal
NINT  National Institute for Nanotechnology
NPF  National Platforms Fund (CFI)
NSC  National Steering Committee, for the NPF
NSERC  Natural Sciences and Engineering Research Council of Canada
ORAN  Optical Regional Advanced Network
RFP  Request for Proposal
RQCHP  Réseau Québécois de Calcul de Haute Performance
SciNet  Science Network
SHARCNET  Shared Hierarchical Academic Research Computing Network
SSHRC  Social Sciences and Humanities Research Council of Canada
TASP  Technical Analyst Support Program
TRIUMF  Canada's national laboratory for particle physics research
WestGrid  The Western Canada Research Computing Grid

APPENDIX : Evaluation criteria from the CFI guidelines

1. Results and outcomes of past HPC investments. Past investments:
a. enabled leading-edge research on computationally challenging questions that would not have been possible to undertake without the HPC resources;
b. enabled institutions and their researchers to gain a competitive advantage nationally and internationally;
c. attracted and retained excellent researchers;
d. enhanced the training of highly qualified personnel through research;
e. strengthened partnerships among institutions and enhanced the efficiency and effectiveness of HPC resources;
f. provided resources that are used to their full potential;
g. contributed to bringing benefits to the country in terms of improvements to society, the quality of life, health and the environment, or contributed to job creation and economic growth.
2. Quality of proposed research or technology development and appropriateness of HPC resources needed. The investments will:
a. enable computationally challenging research with the potential of being internationally competitive, innovative, and transformative, and that could not be pursued otherwise;
b. meet the needs of institutions and their researchers effectively and efficiently;
c. provide a high degree of suitability and usability;
d. be potentially scalable, extendable or otherwise upgradable in the future;
e. incorporate reliable, robust system software essential to optimal sustained performance;
f. provide a suitable and sustainable physical environment to accommodate the proposed systems, including adequate floor space, power, cooling, etc.
3. Effectiveness of the proposed integrated strategy of investments in HPC in contributing to strengthening the national capacity for innovation. The investments will:
a. build regional, provincial, and national capacity for innovation and for international competitiveness;
b. ensure complementarities and synergies among regional facilities;
c. combine the expertise of regional facilities to ensure researchers have access to unprecedented depth and support in the application of HPC to the most computationally challenging research;
d. attract and retain the best researchers or those with the highest potential;
e. create a stimulating and enriched environment for training highly qualified personnel;
f. strengthen multidisciplinary and interdisciplinary approaches, collaborations among researchers, and partnerships among institutions, sectors, or regions;
g. ensure effective governance, including the management, accessibility, operation and maintenance of HPC resources on an ongoing basis;
h. address all aspects and costs as well as long-term sustainability issues.
4. The potential benefits to Canada of the research or technology development enabled by HPC. The activities enabled by the investments will:
a. contribute to job creation and economic growth in Canada;
b. support improvements to society, quality of life, health, and the environment, including the creation of new policies in these areas.