Research Computing Facility
An Update from Dr. Francesca Dominici
June 20, 2013

Dear all,

We are very excited to provide you with some important updates regarding the research computing facility at the Faculty of Arts and Sciences (FASRC), http://rc.fas.harvard.edu. Please note that we are phasing out the HSPH cluster; if you are currently leasing nodes on the HSPH cluster, we will work with you to migrate to FASRC. We have developed a FAQ document, which is available at https://rc.fas.harvard.edu/hsph-at-fas-rc-frequently-asked-questions/ and is also included in this message.

Updates:

1. 158 HSPH accounts have been opened on FASRC, enabling users to run computing jobs on the FAS High Performance Computing Cluster (HPCC), also known as Odyssey.
2. Several HSPH faculty have worked with the FASRC team to purchase data storage equipment and hardware that have been deployed at FASRC in Cambridge and linked to Odyssey via a secured network.
3. FASRC has developed personalized solutions for our faculty to transfer secure data from HSPH to FAS in accordance with data use agreements.
4. FASRC has been mentioned as a key strength in training and research grant applications from HSPH, and high-impact papers have been published that previously were delayed for lack of computing power.
5. Please bookmark the web site http://rc.fas.harvard.edu/hsph-overview/ for additional and up-to-date information.

To access FASRC you will be charged approximately $3,000 per year per account. Access for PhD and ScD students is free. This fee is used to provide salary support for the FASRC team. By supporting members of FASRC, we gain access to the whole group of system administrators, software engineers, and database managers at FASRC (http://rc.fas.harvard.edu/about-rc/research-computing-staff/). This fee will allow each account’s owner to:

1. Access Harvard’s largest research computing environment, Odyssey (http://rc.fas.harvard.edu/kb/high-performance-computing/architectural-description-of-the-odyssey-cluster/).
2. Obtain a personalized letter of support from the FASRC team for your grant submissions. The letter will describe the FASRC environment and the level of support and expertise that the FASRC team will be able to provide for that particular application.
3. Request individual consultation to plan computational requirements, acquisition of hardware, data storage and dissemination, CPU, etc. Please note that FASRC and its director James Cuff are representing Harvard in the Massachusetts Green High Performance Computing Center (MGHPCC) initiative. This computing center, the result of a $168 million investment, has the immediate capacity for 10,000 high-end computers, with expansion capabilities for an additional 10,000 computers. This means that by opening an account with FASRC, our faculty will have the ability to purchase an unlimited amount of data storage and computing power (http://www.mghpcc.org/).
4. Access local support at HSPH for research computing consulting.
5. Obtain free consultation from the whole FASRC team.

The FASRC access fee is very modest compared to those levied by other research groups and organizations that charge back research computing costs; we are able to benefit from subsidies provided by FAS, Harvard, and HSPH for infrastructure and specific user groups. Your departmental administrators and grants managers will be able to tell you exactly how to budget the FASRC computing fee to your existing grants, and we have developed standardized language regarding FASRC services that you can add to your budget justification.

Please do not hesitate to email me if you have any questions or concerns.

FREQUENTLY ASKED QUESTIONS
Also available at https://rc.fas.harvard.edu/hsph-at-fas-rc-frequently-asked-questions/

• What do I need to do to sign up for a FAS RC account?

Complete the sign-up form at https://account.rc.fas.harvard.edu/request/. Your PI will then receive an e-mail containing a link to approve your account.

• What type of support am I receiving for signing up to the FAS RC cluster?

We have experience in numerous aspects of research computing support, including software installation and troubleshooting, performance optimization, and storing large and/or secure data. With your account, you'll receive access to cluster compute and storage resources as well as support from the entire FAS RC team. The FAS RC team and their expertise are described at https://rc.fas.harvard.edu/about-rc/research-computing-staff/. More specifically, by signing up, each account’s owner has:

1. Access to Harvard’s largest research computing environment, Odyssey (http://rc.fas.harvard.edu/kb/high-performance-computing/architectural-description-of-the-odyssey-cluster/).
2. A personalized letter of support from the FASRC team for grant submissions. The letter will describe the FASRC environment and the level of support and expertise that the FASRC team will be able to provide for that particular application.
3. Access to local support at HSPH for research computing consulting.
4. Access to consultation from the whole FASRC team to plan computational requirements, acquisition of hardware, data storage and dissemination, CPU, etc. Please note that FASRC and its director James Cuff are representing Harvard in the Massachusetts Green High Performance Computing Center (MGHPCC) initiative. This computing center, the result of a $168 million investment, has the immediate capacity for 10,000 high-end computers, with expansion capabilities for an additional 10,000 computers. This means that by opening an account with FASRC, our faculty will have the ability to purchase an unlimited amount of data storage and computing power (http://www.mghpcc.org/).

• Why should I sign up for the FAS RC instead of buying my own research computing cluster and storing the data in my office?

The FAS RC team is one of the best in the world and has access to a scalable amount of computing power and data storage. They will act as your collaborators and assist you in developing personalized solutions. They can make your grant application much more competitive by providing a letter of support and the necessary expertise and infrastructure. They can accommodate all your computing needs.

• If I am a PhD or ScD student at HSPH, may I access the FAS RC for free?

Yes!

• May I request a face-to-face meeting with a research computing specialist to address questions regarding research computing?

Yes. We often use e-mail and other electronic methods to coordinate RC needs, but we would also be happy to meet in person to discuss large projects or transitions.

• Can FAS RC help me with the storage of large and secure data?

Yes. Contact [email protected] and we can discuss your data storage requirements. We strongly encourage you to contact the team as you are submitting a grant application. We can host secure data assigned level 1, 2, or 3 treatment.

• I have secure data stored at HSPH and need to use the HSPH cluster to analyze the data. Can my secure data transition to FASRC?

Yes. First we will need to review your DUA. Most DUAs say that you can store your data within Harvard, so we should be able to move your secure data to FAS and meet the same security requirements without modifying your DUA. If not, we will work with you and your data provider to revise the DUA accordingly.

• Can FAS RC help me transition from the HSPH RC environment to the FAS RC environment?

Yes. We will provide you with instructions that cover the majority of use cases and migration steps. If you own your own IT hardware or have sensitive data, we can help with the transition to FAS RC. Also feel free to contact [email protected] and we can discuss your needs during and after the transition.

• Is there language that I can use in my grant submissions to budget the cost of access to the FAS RC?

Yes. See below for an example. Please get in touch with us as you are developing a grant application. The FAS RC team will also provide a personalized letter of support for your application. The language below can be further tailored to your needs:

Computing Costs are requested to provide researchers with access to the Faculty of Arts & Sciences Research Computing shared use facility (FAS RC). A $3,000 fee per FTE per year is charged for access to the FAS RC. This fee applies to each account and provides users with access to resources hosted in the FAS RC environment, including expert consulting help coupled with extensive resources, such as >10 petabytes of storage, 25,000 processing cores, and numerous software modules and applications. A full description of FAS-RC computing resources available to researchers can be found at http://rc.fas.harvard.edu/

• Why do I need to pay $3K per year per account to access FAS RC?

HSPH has signed an MOU with RC at FAS to access their infrastructure and expertise (see web site for details). FASRC manages the largest high performance computing cluster at Harvard and has recently transitioned to Holyoke, where it has access to an almost unlimited amount of data storage space at very competitive cost (http://www.mghpcc.org/). As part of the MOU, HSPH has hired one additional FTE, who has joined the FASRC team. The $3K fee recovers the cost of this FTE. We believe that this is the most cost-effective way to allow HSPH investigators to access a wealth of RC expertise and to purchase disk space and computing power at the lowest cost to their grants. After you sign up, for a cost that is comparable to a laptop, you will not need to worry about anything else. We also now have concrete evidence that highlighting this collaboration on your grant applications increases the chance that they will be funded.

• Do I need to pay for software licenses at the FAS RC?

Most software (including SAS and Stata) is already licensed at FAS RC and therefore is free. Additional commercial software can be added on a case-by-case basis, depending on demand. Contact [email protected] and we can discuss your software needs.

• May I invest more resources in the FAS RC by purchasing additional nodes to increase computing power and decrease waiting time?

Yes. Contact [email protected] and we can discuss your computing needs, coordinate obtaining a quote, and install the new hardware.

• If I am facing a deadline where I need more computing power for a short period of time, can FAS RC help me?

Yes. Contact [email protected] and we can discuss your computing needs and how we can shift other resources temporarily. We can often temporarily dedicate part of HSPH's computing resources at FAS RC to your project, or obtain permission to borrow computing capacity normally reserved for other schools or researchers.

• If I am teaching a class on computing, can I ask FAS RC to open temporary accounts for my students to run their jobs on the cluster?

Yes. Please send a list containing each student's first name, last name, and e-mail address to [email protected].

• Why is the cost of accessing the FAS RC cluster spread equally across all the grants that support my salary, when I only use high performance research computing for one of these projects? May I charge only a specific grant for access to FAS RC instead?

Yes. The funds, percentages, and amounts are merely guidelines based on salary allocations for the past quarter. It is understood that not all funds require the use of research computing. PIs should work with their grant managers to allocate the cluster fee to the appropriate fund(s).

• Can a collaborator of mine from another institution access the FAS RC? What will be the cost?

Yes. Researchers outside of Harvard may access the FAS RC. The cost of a "guest" account is the internal fee plus HSPH overhead of 61.5%. This is consistent with practices for access to other area research computing facilities, such as the Channing. For FY13, the FAS RC guest account cost is $1,050 per quarter.

• Can you please provide examples of how FASRC has been able to help our PIs with RC and/or data storage issues?

The FAS research computing environment can provide customized support for almost any project. While the storage space, compute resources, and available software are sufficient for the majority of work, many research projects require a more specialized setup. The FASRC team has deployed and configured servers, set up databases for large-scale data sets, automated data transfer between HSPH equipment and FAS storage facilities, and helped optimize novel algorithms.

Comments from PIs at HSPH

Marc Lipsitch, Professor of Epidemiology, HSPH

“Our interaction with the FASRC has gone really well and we are routinely interacting with the team for our genomic analyses in several projects. This collaboration has led to the following two papers so far (see below). Support has been excellent, especially since HSPH got a dedicated person.”

1: Cobey S, Lipsitch M. Pathogen diversity and hidden regimes of apparent competition. Am Nat. 2013 Jan;181(1):12-24. doi: 10.1086/668598. Epub 2012 Nov 27. PubMed PMID: 23234842.

2: Cobey S, Lipsitch M. Niche and neutral effects of acquired immunity permit coexistence of pneumococcal serotypes. Science. 2012 Mar 16;335(6074):1376-80. doi: 10.1126/science.1215947. Epub 2012 Mar 1. PubMed PMID: 22383809; PubMed Central PMCID: PMC3341938.

Ashish Jha, Professor of Health Policy and Management

“The folks at FAS have been terrific and we have not yet really reached out to them to look for solutions. We feel like we are outstripping our capacity to run jobs and analyses in the ways that we need to be able to do. The whole thing is still a substantial upgrade from where we were.”

Curtis Huttenhower, Assistant Professor of and

“Interaction with RCFAS mostly changed my life for the better, the logistics are now quite smooth, and the technical glitches are relatively minor. Staff support has been phenomenal, if anything they need more people to keep doing what they're doing.”

Liming Liang, Assistant Professor of Statistical Genetics

“The FAS cluster certainly makes life better. We have been able to process DNA sequencing data and establish a website that provides dynamic eQTL computing, which cannot be done at HPCC. The temporary student accounts for my EPID293 class last winter were extremely useful. From the students' feedback and comments, they very much like being able to learn all the tools for GWAS in a UNIX environment. The rchelp team is fantastic and always responds quickly to all our requests. They even stood by during class hours.”

Best,

Francesca Dominici, PhD
Associate Dean of Information Technology
Professor of Biostatistics