XSEDE 2.0

XSEDE 2.0: Integrating, Enabling and Enhancing National Cyberinfrastructure with Expanding Community Involvement

Submitted to the National Science Foundation by invitation

Principal Investigator
John Towns
Executive Director
National Center for Supercomputing Applications (NCSA)
University of Illinois
1205 W. Clark Street
Urbana, Illinois 61801
Tel: (217) 244-3228
Fax: (217) 244-2909
Email: [email protected]

Co-PI: Kelly Gaither, Director of Visualization, Texas Advanced Computing Center
Co-PI: Ralph Roskies, Scientific Director, Pittsburgh Supercomputing Center
Co-PI: Nancy Wilkins-Diehr, Associate Director, San Diego Supercomputer Center
Senior Personnel: David Hart, Section Head, User Services Section, Computational and Information Systems Laboratory, National Center for Atmospheric Research
Senior Personnel: David Lifka, Director, Center for Advanced Computing, Cornell University
Senior Personnel: Gregory D. Peterson, Director, National Institute for Computational Science, University of Tennessee


A. Project Summary

Cyberinfrastructure is the ubiquitous enabling technology supporting the mission of the National Science Foundation to promote the progress of science. The eXtreme Science and Engineering Discovery Environment (XSEDE) has served as a unifying framework for the NSF’s investment in cyberinfrastructure (CI) for four years, connecting top researchers to leading resources, facilitating scientific discovery, and enabling transformational science, engineering, scholarship, and innovative educational programs. For the next five years, the XSEDE project will carry out an ambitious plan to accelerate the integration of community-developed tools and services into the national CI ecosystem and expose that CI to a much broader community in support of research and scholarship. XSEDE will enhance the productivity of scholars, researchers, and engineers by giving them both access to and extensive support for world-class advanced CI resources. We propose an adaptive and streamlined framework for XSEDE that is driven by evolving community needs and by XSEDE’s three strategic goals: to deepen and extend use of the ecosystem of national CI by both existing computational researchers and new communities of scientists and students where the use of computation and large scale data is transforming their fields; to advance the ecosystem by creating an open and evolving infrastructure, enhancing the value of the ecosystem; and to sustain the ecosystem by maintaining a secure, reliable and efficient infrastructure. To fulfill its mission to serve the community’s needs, XSEDE will reorganize into goal-driven focus areas that will provide a more agile and responsive program designed to better support the strategic goals. A Resource Allocation Service will fulfill XSEDE’s crucial role of neutral arbiter in allocating resources from the service provider ecosystem to the research community. 
The revised Community Infrastructure service will identify, evaluate, test, and integrate new capabilities into the infrastructure. Community Engagement & Enrichment will build on the XSEDE tradition of outstanding user services, and engage a new generation of diverse computational researchers. The Extended Collaborative Support Service will maximize the effectiveness of CI resources through its critical role of deep and direct engagements with research teams. Finally, XSEDE Operations will maintain and evolve an integrated CI capability of national scale.

Intellectual Merit: XSEDE will integrate resources and services into a CI that is unique in scale and diversity, designed to enhance user productivity, interface with other CI resources, and evolve in capabilities over its lifetime. This project will evaluate and integrate the latest CI technologies, integrate missing functionalities, document and publish experiences and results, and provide requirements that drive future CI research and development. XSEDE will also work directly with leading researchers to apply XSEDE-allocated resources to groundbreaking science. The resulting infrastructure, engineering, operations, and support activities will advance distributed systems, high-end computing, and data-intensive computing, to the direct benefit of advancing science.

Broader Impacts: XSEDE will serve as a focal point for advanced CI in the U.S., with its resources and services available to researchers nationwide. Its significant outreach activities—from conferences to training events to student programs—will pull together offerings from many NSF-funded activities. As CI transforms the conduct and culture of scientific collaboration, XSEDE will be there to lead, advise, and contribute to this transformation. The resulting increase in scientific output will impact the entire nation.
Working with applications teams and tool developers, XSEDE will adapt best practices in digital services into professional and curriculum development programs to prepare a diverse array of current and future science, technology, engineering, mathematics, arts, humanities, and social science researchers, educators, and practitioners. In collaboration with educational institutions and other funded teams across the nation, XSEDE will promote diversity and improved digital science literacy in the U.S. workforce, with special attention to early career researchers and under-represented groups. These achievements will be realized within a framework designed for seamless continuity beyond the project lifespan, thereby preserving these efforts for the future benefit of our society.


B. Table of Contents

A. Project Summary
B. Table of Contents
Foreword
C. Project Description
  C.1 Introduction
    C.1.1 Strategic Plan: Achieving Our Mission and Goals
    C.1.2 Strategic Goals
    C.1.3 Strategic Goals & Key Performance Indicators
    C.1.4 New in XSEDE 2.0
    C.1.5 The Added Value of XSEDE
    C.1.6 Interaction with Other National & International Cyberinfrastructures
  C.2 Science Case for XSEDE
    C.2.1 Foundational Motivation for XSEDE
    C.2.2 Developing the Pipeline—User Support, Training, Education & Outreach
    C.2.3 Highlights of Previous Achievements
    C.2.4 Prospective Future Successes
  C.3 Community Engagement & Enrichment
    C.3.1 User Engagement
    C.3.2 User Interfaces & Online Information
    C.3.3 Campus Engagement
    C.3.4 Workforce Development
    C.3.5 Broadening Participation
    C.3.6 Community Engagement & Enrichment KPIs
  C.4 Extended Collaborative Support Service: Enabling the Research Community
    C.4.1 Extended Support for Research Teams (ESRT)
    C.4.2 Novel & Innovative Projects (NIP)
    C.4.3 Extended Support for Community Codes (ESCC)
    C.4.4 Extended Support for Science Gateways (ESSGW)
    C.4.5 Extended Support for Education, Outreach & Training (ESTEO)
    C.4.6 ECSS KPIs
  C.5 XSEDE Community Infrastructure: Adding value to the CI ecosystem
    C.5.1 Requirements Analysis & Capability Delivery (RACD)
    C.5.2 Capabilities Evaluation & Testing (CET)
    C.5.3 Capability & Resource Integration (CRI)
    C.5.4 XSEDE Community Infrastructure KPIs
  C.6 XSEDE Operations
    C.6.1 Cybersecurity
    C.6.2 Data Transfer Services
    C.6.3 XSEDE Operations Center (XOC)
    C.6.4 Systems Operational Support (SysOps)
    C.6.5 XSEDE Operations KPIs
  C.7 Resource Allocations Service (RAS): Stewarding the National Investments
    C.7.1 XSEDE Allocations Process & Policies
    C.7.2 Allocations CI Enhancement & Maintenance
    C.7.3 Resource Allocations Service KPIs
  C.8 Managing the Program
    C.8.1 Project Governance
    C.8.2 External Relations (ER)
    C.8.3 Project Management, Reporting & Risk Management (PM)
    C.8.4 Business Operations
    C.8.5 Strategy, Planning, Policy, Evaluation & Organizational Improvement
    C.8.6 Sustainability Planning
    C.8.7 XSEDE Program Office KPIs
  C.9 Broader Impacts of the Proposed Work
  C.10 The XSEDE2 Team
    C.10.1 XSEDE2 Leadership
    C.10.2 Prior Support
  C.11 Conclusion
  C.12 References Addressing the “Expectations of a renewal proposal”
D. References Cited
Appendix I XSEDE 2.0 Organization and Budget
  I.1 XSEDE2 Organizational Structure
  I.2 XSEDE2 Work Breakdown Structure
  I.3 XSEDE2 Budget Detail
Appendix II List of Partners and Roles
Appendix III Conflict of Interests List
Appendix IV Letters of Collaboration
Appendix V Data Management Plan
  V.1 Primary Data Products
  V.2 Secondary Data Products and Related Projects


Foreword

It is anticipated that this document will be read in electronic form (PDF) using Adobe Reader®. There is extensive cross-linking to facilitate referencing content across the document. In general, all text that has blue underlining (e.g., §C.1) is clickable. Clicking on the underlined text will take you to the referenced section. These links are set up to facilitate moving back and forth between the high-level discussions in §C.1 and the more detailed discussions of specific project areas in §C.3 through §C.8, and to provide navigation between sections when referenced.


C. Project Description

C.1 Introduction

Computing is increasingly ubiquitous across all fields of scholarly study. Digital technologies accelerate and enable new, even transformational, research in all domains. Researchers rely on an increasingly diverse set of distributed resources in their research and educational pursuits; access to an array of well-supported, advanced digital services and support is critical for the advancement of knowledge. XSEDE (the Extreme Science and Engineering Discovery Environment) is a socio-technical platform empowering contemporary science by integrating and coordinating advanced digital services, and delivering and enhancing advanced distributed digital infrastructure and critical support services within the national cyberinfrastructure (CI) ecosystem. This ecosystem involves a distributed assemblage of software, supercomputers, visualization systems, storage systems, networks, portals and gateways, data collections, instruments, and personnel with specific expertise. Driven by the needs of the open research community, XSEDE enhances the productivity of a growing community of scholars, researchers, and engineers; federates with other high-end facilities as well as campus-based resources; and supports seamless access to data from major instruments. Our vision is a world of digitally enabled scholars, researchers, and engineers participating in multidisciplinary collaborations while seamlessly accessing advanced computing resources and sharing data to tackle society’s grand challenges.

During XSEDE1 (the first five years of the XSEDE program), we established critical infrastructure within the ecosystem targeted at XD Program awardee Service Providers (Kraken, Stampede, Gordon, etc.). This infrastructure has supported cutting-edge research advances, some of which are summarized below (§C.2.3) with many more examples available via our website (www.xsede.org/science-successes).
This infrastructure provides an evolving framework for integrating, federating, and sharing CI services and resources driven by the community. During XSEDE2 (the second five years of the XSEDE program), we will sustain, improve, and expose that infrastructure to the broader community, encompassing a far broader range of resources and services in support of science. This infrastructure will empower the community to discover and access an expanding portfolio of services with XSEDE playing the role of “connector of services.”

C.1.1 Strategic Plan: Achieving Our Mission and Goals

XSEDE’s mission is to enhance the productivity of a growing community of scholars, researchers, and engineers through access to advanced digital services that support open research by coordinating and adding value to the leading cyberinfrastructure resources funded by the NSF and other agencies. Our strategic goals fully support NSF’s vision as stated in Investing in Science, Engineering and Education for the Nation’s Future [1] and strategies stated broadly in the Cyberinfrastructure Framework for 21st Century Science and Engineering [2] and the more specifically relevant Advanced Computing Infrastructure: Vision and Strategic Plan [3].

C.1.2 Strategic Goals

Our strategic goals support our mission and guide the project’s activities toward the realization of our vision of an advanced digital services ecosystem. Three strategic goals are defined:

Deepen and Extend Use: XSEDE will deepen the use of the ecosystem by existing scholars, researchers, and engineers, and extend the use to new communities. We will contribute to the preparation—workforce development—of the current and next generation of scholars, researchers, and engineers in the use of this ecosystem; and raise the general awareness of the value of advanced digital services.

Advance the Ecosystem: XSEDE will advance the ecosystem by creating an open and evolving e-infrastructure and enhance the technical expertise and support services offered.

Sustain the Ecosystem: XSEDE will sustain the ecosystem by ensuring and maintaining a reliable, efficient, and secure infrastructure, and providing excellent user support services. XSEDE will further operate an effective, productive, and innovative virtual organization.


C.1.3 Strategic Goals & Key Performance Indicators

The strategic goals of XSEDE2 cover considerable scope. Specific activities within that scope are often very detailed; therefore, to ensure that this significant and detailed scope will ultimately deliver our mission and realize our vision, we decompose the three strategic goals into components or sub-goals to be considered individually. To ensure that we are fulfilling our mission, to assess progress toward our vision, and to guide our planning, we have identified key performance indicators (KPIs) to encapsulate and measure our progress toward meeting each sub-goal. These KPIs, shown in Table C-1, are rooted in the needs and requirements of the communities we serve. XSEDE’s KPIs are measurable values that demonstrate how effectively XSEDE is achieving key objectives and attaining strategic goals. They have been chosen so that actions and decisions that have a desired positive impact on the metrics also drive the organization toward attaining its strategic goals. Each of these KPIs is owned by a major organizational unit of XSEDE2 and is discussed in the appropriate sections.

C.1.4 New in XSEDE 2.0

XSEDE2 has adapted its organizational structure, addressing input from NSF and feedback from reviewers and our advisory boards, and in consultation with new strategic advisors, to be more agile, create more immediate impact, and be responsive to its stakeholders. XSEDE2 has also improved the organizational structure and focus for XSEDE1’s TEOS efforts, now the Community Engagement & Enrichment Program (CEE, §C.3). We will implement an improved user engagement effort that collects user requirements, tracks XSEDE responses, and reports actions to the community. The organizational structure facilitates integrated workforce development efforts with deeper ties to campus personnel and deeper engagement with campuses by working with CIOs and VPRs to ensure that XSEDE complements campus activities.
Table C-1: Summary of Key Performance Indicators (KPIs) for XSEDE. Each sub-goal is followed by its KPIs, with the owner in parentheses.

Strategic Goal – Deepen and Extend Use
  Sub-goal: Deepen use for (existing communities)
    - Number of completed ECSS projects (ECSS, §C.4)
    - Average ECSS impact score (ECSS, §C.4)
    - Average satisfaction with ECSS support (ECSS, §C.4)
  Sub-goal: Extend use to (new communities)
    - Number of sustained under-represented users of XSEDE resources and services (ECSS, §C.4)
    - Number of new under-represented users of XSEDE resources and services (ECSS, §C.4)
  Sub-goal: Prepare the current and next generations
    - Registrants for Training (CEE, §C.3)
    - Average impact assessment of the Training (CEE, §C.3)
  Sub-goal: Raise awareness of the value of advanced digital services
    - Website (Number of unique visitors) (CEE, §C.3)
    - Social Media (Number of impressions) (External Relations, §C.8.2)
    - Public Relations (Number of media hits) (External Relations, §C.8.2)

Strategic Goal – Advance the Ecosystem
  Sub-goal: Create an open and evolving e-infrastructure
    - Number of new capabilities made available for production deployment (XCI, §C.5)
    - Overall satisfaction with XSEDE services (XCI, §C.5)
  Sub-goal: Enhance the array of technical expertise and support services
    - Number of staff certifications (XCI, §C.5)
    - Average rating of staff regarding how well-prepared they feel to perform their jobs (XCI, §C.5)

Strategic Goal – Sustain the Ecosystem
  Sub-goal: Provide reliable, efficient, and secure infrastructure
    - Average composite availability of core services (Operations, §C.6 and RAS, §C.7)
    - Impact of security incident on resource availability (Operations, §C.6 and RAS, §C.7)
  Sub-goal: Provide excellent user support
    - Mean time to ticket resolution (Operations, §C.6 and RAS, §C.7)
    - Average satisfaction ratings for allocations and other support services (Operations, §C.6 and RAS, §C.7)
  Sub-goal: Effective and productive virtual organization
    - Percentage of L2 areas implementing climate study and evaluation recommendations (Program Office, §C.8)
  Sub-goal: Innovative virtual organization
    - Number of strategic or innovative improvements (Program Office, §C.8)
    - Ratio of proactive to reactive improvements (Program Office, §C.8)
    - Number of staff publications (Program Office, §C.8)

User Information and Online interfaces will create a


new XSEDE website and expand capabilities for an improved information dissemination service to the community through a consistent and information-rich web presence on the public website and the XSEDE user portal.

XSEDE2’s Extended Collaborative Support Services (ECSS, §C.4) will adapt its skill sets to address new modes of computing, including the use of virtual machines and containers, data analysis resources, and databases. ECSS will enforce greater discipline in managing projects, for example by exploiting work plans, deciding when to terminate projects, and defining clearer measures of success, to make the team’s operation more efficient. Finally, ECSS will work with CEE to provide in-depth staff training to increase and diversify staff skills.

The XSEDE Community Infrastructure (XCI, §C.5) area will redefine XSEDE’s approach to infrastructure enhancement, software deployment, SP coordination, and campus engagement. XCI will shift its investments from activities that require extensive staff effort to implement software “in house” to integrating tools and services developed by the open-source and NSF-supported software community into an extensible integration framework. This model is perhaps best exemplified by the rapid integration of Globus, the Computation Institute’s changes to the software, and XSEDE’s successful adoption and integration of Globus for data transfer services.

XSEDE2 Operations (§C.6) will extend operational and usage tracking capabilities and work more closely with the XSEDE Metrics Service to analyze and present this information, thus enabling decision makers to understand trends and recognize and exploit opportunities to optimize operations. Cybersecurity will deploy and promote a Collective Intelligence Framework (CIF) for exchanging attack intelligence to reduce threats and maintain our excellent cybersecurity record: no cross-XSEDE compromises were experienced in XSEDE1.
XSEDEnet and Data Services efforts will be merged into a single team, refocused on end-to-end data transfer performance, functionality, and efficiency in user data flows. XSEDE2 will elevate the Resource Allocations Service (RAS, §C.7) to a top-level service. RAS will evolve the XSEDE allocations services to better match researchers and educators with the advanced digital services across the national ecosystem to help them accomplish their objectives.

C.1.5 The Added Value of XSEDE

Analyzing the value of XSEDE is particularly difficult; it may take decades for the value of scientific discoveries to become clear, and XSEDE itself does not provision the most visible resources in the ecosystem. Work by Apon et al. suggests that institutional investment in HPC returns higher federal grant funding and publications [4] [5] but does not determine whether the overall return is greater than the investment cost. Two reports [6] [7] contain dozens of examples of advanced CI adding value to health, science, quality of life, and benefits in industry with essentially no serious cost-benefit analysis. Stewart et al. [8] have analyzed and documented the value of XSEDE in two ways: value added, defined as “an activity that increases the worth of the product or services to the customer,” and return on investment (ROI). For added value, they consider how the national research community benefits from interactions with a single entity serving the community compared to a scenario where multiple SPs offer services independently (e.g., see §C.2.1). For return on investment, Stewart et al. created a “thought model” comparing the cost of one national integrating core function (XSEDE) to hypothetical models of multiple centers, where minimum fixed costs must be replicated. While this study has several limitations, noted by the authors, it is the first openly published study of ROI of an entire cyberinfrastructure support and delivery operation.
In XSEDE2, PI Towns, Stewart, and others will build on existing analyses to estimate more fully the value added and ROI of XSEDE2 and will expand efforts to measure impact and value. The XSEDE Campus Champions are a critical component of XSEDE’s efforts to engage the user community. The program has more than doubled in size since July 2011 to 192 institutions (currently covering 49 states, two U.S. territories and nearly all EPSCoR jurisdictions) and over 250 champions serving as critical points of contact for XSEDE on these campuses and providing deeper engagement with a diverse and influential set of campus personnel. The scale of the XSEDE program makes Campus Champions viable. Replicating the same program at multiple SPs would likely dilute the impact. Of the ten Level 3 members of the Service Provider Forum (§C.8.1.2), five of the PIs are Champions and four others have a champion on campus. Through cooperation and coordination with campuses, the resources and


services being offered on campuses can directly complement those offered by XSEDE, from deploying advanced digital resources to providing support services such as consulting and training.

C.1.6 Interaction with Other National & International Cyberinfrastructures

Although XSEDE is the largest national cyberinfrastructure for open science, it is only one of many used by researchers. For example, the Open Science Grid (OSG) is a CI with a focus on high-throughput computing. XSEDE’s allocation process already provides access to OSG resources, complementing the high-end NSF SP resources. We have renewed our agreement with OSG as a partner (see Appendix IV Letters of Collaboration). Many major data-driven projects have developed their own CI to achieve their scientific aims (DES, LSST, LIGO, NEES, and others). While each project has some, usually significant, local CI resources, often a high-profile subset of their applications can benefit from migration to extremely large-scale execution environments at XSEDE SPs. We have made progress with the LIGO project in working to support their production science needs for Advanced LIGO as it comes online. Incoming SP resources supporting virtualization will make migration of their applications to XSEDE-allocated resources even easier. An important and emerging initiative is the National Data Service (NDS), which is working to establish national data CI and will create an important opportunity to connect data and data services with analysis resources available via XSEDE. We are currently negotiating with the NDS to become a Service Provider and federate with XSEDE, creating new opportunities for enhancing the productivity of those involved in data-intensive research (see Appendix IV Letters of Collaboration).
As an important step toward international interoperability of CIs, XSEDE has been working with the Partnership for Advanced Computing in Europe (PRACE) and the European Grid Initiative—together representing the closest European analog to XSEDE—and Compute Canada. We facilitate interoperability of these CIs, and thus support international research collaborations, by adopting standards-compliant software implementations. In addition, we have developed an international summer school series initiated with PRACE, and now expanded to include Japan’s RIKEN and Compute Canada. We have engaged in joint support efforts of collaborating research teams spanning the U.S. and Europe while making use of XSEDE and PRACE resources. We will continue these very successful efforts in XSEDE2 and retain our aspiration to develop a process for allocating resources across these infrastructures to support international science collaborations. In all of these interactions XSEDE functions as a coordinating point for national and international entities that want to engage and collaborate with NSF advanced cyberinfrastructure programs.

C.2 Science Case for XSEDE

The availability of powerful and affordable computational technologies is transforming almost every field of science and engineering. This transformation is also seen in disciplines new to advanced computing, such as humanities, social science, and even areas of computer science, like machine learning. XSEDE2’s essential, continued role in training the next generation of computational science experts and increasing the productivity of current users of digital resources is illustrated with concrete examples of achievements from XSEDE’s first four years and a look at what can be achieved over the next five. Many XSEDE features are designed to enhance user productivity.
The obvious premise is that increased productivity leads to more effective research: increased productivity can make the difference between a practical project and an impractical one. Investment in science is frequently justified by the number of publications that have emerged, and investments in infrastructure usually measure how many have used the resources. In the past four years, users have reported about 14,000 publications that acknowledge XSEDE or SP support. Studies have also shown that those publications are, on average, cited more frequently than other publications in the same journal [9]. Moreover, in the most recent quarter (Q1CY2015), there were 5,451 active user accounts (used some chargeable resource or a gateway). While these are impressive statistics, we will not explore them further here because we cannot de-convolve the contributions of XSEDE from those of the SPs. This section is concerned with the science case for XSEDE itself.

C.2.1 Foundational Motivation for XSEDE

NSF recognizes that advancing science across multiple disciplines requires a variety of resources. There is thus the need for comprehensive cyberinfrastructure composed of heterogeneous digital resources


leveraging the aggregate expertise of a small number of leading institutions, each with its own unique human talent and approaches to addressing community needs. However, even with heterogeneous resources and the expertise of multiple institutions, it is only through tight yet flexible integration and interoperability of these resources and services that a growing number of scientific research activities can move forward efficiently and effectively. This is the foundational motivation for XSEDE. To improve user productivity and thus generate more science and scholarship, XSEDE2 will present users and developers with a single interface rather than a set of different interfaces with different administrative domains. Examples, detailed throughout this proposal, include a single user portal to coordinate and unify services and information such as training offerings, allocations, documentation, publication management, and the status and performance of resources. Other common services include a single help desk, unified authentication mechanisms, an application software catalog, a repository of services and tools, coordinated security (which has prevented the spread of security breaches from site to site), unified allocation of resources (which helps assure that the most meritorious work nationally is awarded resources), coordinated advanced support and training, common techniques for rapid data transfer (i.e., Globus), support for end-to-end network tuning to optimize data transfer, and a well-defined infrastructure that will allow campuses to design their systems to interoperate with those of XSEDE.

C.2.2 Developing the Pipeline—User Support, Training, Education & Outreach

The progress of science requires a pipeline of people knowledgeable in exploiting the available technologies. Therefore, Extended Collaborative Support Service (ECSS) and Community Engagement & Enrichment (CEE) encompass essential aspects of the science case for XSEDE.
The resources deployed by the SPs change continually, constantly requiring training and education in new techniques and the development of new algorithms. These efforts are outlined in §C.3 (CEE) and in §C.4 (ECSS). The centralized coordination and management of these efforts by XSEDE enables the most appropriate experts to be brought to bear to assist users, no matter where the experts or the resources are located. This also provides important stability to the career paths of the valuable experts. For example, the pooling of expertise has allowed ECSS to support experts in digital humanities and workflows, which an individual center is unlikely to have provided because of insufficient demand at any one site. That centralization also enables cross-pollination of knowledge between disciplines and resources. We have often seen advanced user support professionals transmitting advances and insights at the algorithmic, numerical, coding, and optimization levels between fields of application and between computing systems. In training, XSEDE enlists the most effective presenters regardless of the center at which they are located.

C.2.3 Highlights of Previous Achievements

The following (very few) sample highlights of XSEDE scientific achievements illustrate the value added by XSEDE compared to a model of disconnected centers. Many more examples are available via our website (www.xsede.org/science-successes).

Economics: Fast Construction of Nanosecond Level Snapshots of Financial Markets (Mao Ye, University of Illinois College of Business)

Most economists agree that the millisecond stock trades enabled by automated trading have improved the efficiency and fairness of the markets. Subsequent advances in technology, which reduced the time necessary for a trade to the nanosecond scale, however, have posed potential problems not balanced by further improvements in efficiency.
In particular, the use of massive, fast trades to manipulate the market, as well as the profound effects of inadvertent software bugs, suggest that ever-faster trades may actually be destabilizing the markets, as in the “Flash Crash” of May 6, 2010, when a $4.1 billion trade on the NYSE resulted in a loss to the Dow Jones Industrial Average of over 1,000 points and then a rise to approximately the previous value, all over about 15 minutes. With help from XSEDE’s Novel and Innovative Projects (NIP) program within ECSS, Mao Ye and collaborators used PSC’s Blacklight to show that 20 percent of NASDAQ trades—and more than half of trades for some major stocks such as Google—are done automatically in “odd lots” that have not previously been reportable to the moment-by-moment Trade and Quote (TAQ) “ticker tape.” [10] This work contributed to an October 2013 change in NASDAQ and New York Stock Exchange rules, such that all trades are reportable and visible moment by moment. In a further retrospective study, and with continued ECSS help from Bob Sinkovits (SDSC), Ye used SDSC Gordon, TACC Stampede, and PSC Blacklight to optimize the code for analyzing market data, speeding the analyses by as much as 126-fold [11]. Future work, particularly with SDSC’s Gordon and PSC’s Blacklight, holds the promise of speeding


analyses of multiple stocks such that researchers and regulators can follow and understand automated trades over whole markets in near-real-time. Characterization of Stable and Metastable Alumina Surfaces (Douglas Spearot, University of Arkansas) Structural details at the atomic level have the greatest influence on properties of materials. Experimentalists want a quick way of understanding which details are responsible for the patterns they observe in the lab. Together with PhD student Shawn Coleman, Spearot produced a unique algorithm integrated with the LAMMPS simulation package that uses simulation data to produce a visualization of both X-ray diffraction line profiles and selected-area electron diffraction patterns that is directly comparable to what experimentalists would observe in a lab. Spearot and Coleman stress that this success would not have been possible without ECSS support from NCSA’s Sudhakar Pamidighantam on workflows and visualization expert Mark Vanmoer, who automated simulation and visualization techniques. Simulations are launched from a desktop computer and data and visualizations are received without further interaction. The simulation first runs on Stampede at TACC to do the atomistic simulations; when complete it automatically launches on Gordon at SDSC to do the virtual diffraction calculations as well as the visualizations. Yang Wang at PSC, together with Campus Champion Fellow Luis Cueva-Parra from Auburn University at Montgomery, improved the parallelization and scalability of the algorithm. This work depended not only on ECSS support for the workflows, visualization, and optimization, but also on the XSEDE infrastructure that enables the coupling of the disparate computing resources. 
Biofuels Enzyme Research (Christina Payne, Gregg Beckham, NREL et al)

Cellulase enzymes found in nature—from sources such as wood-degrading fungi, cow rumen, and compost piles—form one of the key catalysts for degrading plant biomass in industrial processes to make biofuels. At the potential scale of biofuels production, even small gains in enzyme performance could have large cost implications and help biofuels be more cost-competitive with transportation fuels derived from fossil fuels. In a September 2013 PNAS publication [12], Beckham and colleagues from the National Renewable Energy Lab, the University of Colorado, the Swedish University of Agricultural Sciences, the University of Kentucky, and University College Ghent reported that a component called a linker—previously thought to only serve a synergistic support function in cellulose binding in biomass degradation—actually binds cellulose itself and thus is a new potential target of enzyme engineering in biofuels studies. This effort depended on the XSEDE infrastructure that facilitates access to the disparate computing resources: supercomputers at PSC and SDSC enabled them to apply quantum chemistry methods, while Athena and Kraken at NICS, along with Ranger and Stampede at TACC, allowed them to perform large-scale molecular dynamics simulations. “We’re now starting to design and test enzymes experimentally with different linker sequences and lengths,” Beckham says. “We’ll be able to produce these enzymes in fungi and then test their activity to start to understand better the effect of linker sequence and length on enzyme function.”

Behavioral Genomics: Using Next-Generation Sequencing to Identify Genetic Differences in the Social Lives of Mammals (Matthew D. MacManes and Eileen A. 
Lacey, University of California, Berkeley) Matthew MacManes and colleagues at UC Berkeley used XSEDE to access multiple resources to investigate the genetic underpinnings of social behavior in mammals. With Ranger at TACC, MacManes looked at the genetics of two related species of mice—one monogamous, the other sexually promiscuous—analyzing the differences in bacterial communities in the female reproductive tract. At PSC, MacManes exploited Blacklight’s large shared memory, looking at differences in gene expression related to the transition from social to solitary living in the colonial tuco-tuco, a species of burrowing rodent. This species is unique for studying social behavior in that some of them live in groups while others leave their burrow system to live in solitary conditions. Beyond the ease with which the researchers used resources at different sites (single allocation process, single authentication, etc.) this project was enabled by ECSS support. ECSS expert Phil Blood installed modules of ~20 programs typically used in genomics assembly and analysis, managed the complex dependencies between these packages, and helped address issues in using them effectively on Blacklight, which, says MacManes, was essential to the success of the project. The assembly required 14 days of computing, with subsequent analysis extending for months. Findings that included identifying a number of genes that are differentially expressed according to tuco social behavior were reported in PLOS ONE in September 2012 [13].


Cosmology: How first galaxies formed, Volker Bromm, University of Texas

Supported by ECSS, Volker Bromm and his team at UT-Austin used Stampede to perform ab initio simulations refining how the first galaxies are formed and how metals in stellar nurseries influenced characteristics of first stars in the galaxy [14]. The ECSS efforts (by Lars Koesterke, TACC) led to an OpenMP implementation of adaptive mesh refinement (AMR) for use on the Phi. These simulations require modern large-scale computing to make predictions, which, for the first time can be validated in 2018 by the James Webb Space Telescope (and in fact determine how JWST is used). Associated visualizations will appear in an upcoming Terrence Malick documentary "Voyage of Time," narrated by Brad Pitt and Cate Blanchett.

“It is a really exciting time for the field of cosmology. We are now ready to collect, simulate and analyze the next level of precision data...there's more to high performance computing science than we have yet accomplished.” - Astronomer and Nobel Laureate Saul Perlmutter, Supercomputing '13 keynote address

C.2.4 Prospective Future Successes

Here we give a few examples of the challenging projects scientists expect to carry out over the next five years. We call out in italics those features of XSEDE that will facilitate this work. In many fields new to HPC, such as digital humanities, machine learning, genomics, and radio astronomy, there is a trend toward assembling numerous, rapidly-evolving software packages into powerful applications and workflows. For example, researchers nationwide are using the NSF-funded single-dish Green Bank Telescope (GBT) at the National Radio Astronomy Observatory to search for millisecond pulsars, to detect and study gravitational waves, and to study astrochemistry. A project for rapidly spinning neutron stars generates ~1 PB of data per year. 
The Python-based GBT Mapping Pipeline, which integrates many tools, is a new software system intended to facilitate the production of sky maps from this massive data stream. Assembling and optimizing this pipeline, which must maintain high reliability and throughput to keep pace with observations, will require ECSS involvement. XSEDE training in OpenACC and Big Data will be invaluable for optimizing certain components of the pipeline. Computations resulting in the inference of phylogenetic trees are launched through CIPRES, by far the most commonly used Science Gateway. Many of the algorithms have only threaded parallel implementations suitable for a single node. SP resources with many cores per node will minimize the time to solution for such problems. XSEDE will help with gateway development and provide venues, such as the XSEDE conferences, for disseminating results. ECSS support will also help reduce performance bottlenecks for particular codes as they use new SP resources, such as Comet and JetStream, that are particularly designed for effective use by science gateways. Many computational chemists use multiple techniques (quantum simulations, molecular dynamics, QM/MM), optimized on different SP resources. ECSS will help users develop solutions for these complex workflows. For example, Kendall Houk’s group at UCLA has been studying ruthenium catalysts for olefin metathesis (controlled formation of carbon-carbon double bonds). Highly accurate quantum simulations required more than 100 GB of scratch space per node and therefore used Gordon’s SSDs with their much lower latency. On newer SP systems, these calculations can be done more efficiently entirely in core, allowing for the design of even more complex catalysts. Two large biology user communities, the iPlant Collaborative and the Galaxy Project, encompass more than 17,000 U.S.-based users and operate sizable software infrastructure projects. Experimental biologists would prefer to use these interactively. 
iPlant and Galaxy Main are currently delivered from an oversubscribed hardware environment not part of the XD program. ECSS will work with the infrastructures on new XD resources to enable this work to be efficiently carried out, tuning applications and implementing science gateways. The National Snow and Ice Data Center (NSIDC) curates and manages widely used data but has no community collection of analysis routines. Thus, a polar researcher might know where to get data but not where to find best-practice analysis routines. Moreover, no common computing infrastructure is available to this community. At least 2,500 researchers regularly use NSIDC-managed data products. SPs and ECSS will assist NSIDC staff to create and publish VMs capable of requesting NSIDC data


and running common earth science/polar science routines to enable more effective research and better analyses of data, including automated recording of provenance and version information. Advanced computing and large-scale data analytics are transforming the way scholars address literature and art. In literature, going beyond the traditional, and arguably limited, approach of close reading, where a researcher carefully analyzes a relatively small body of work, distant reading or macroanalysis [15] applies techniques from natural language processing, statistics, graph analytics, and machine learning to analyze significantly larger corpora, yielding important insights for contextualizing literary movements. The scaling is not merely of the number of works being addressed. Rather than working independently, interdisciplinary and geographically distributed communities of researchers with common interests and complementary expertise are collaborating, building sophisticated infrastructure such as the Collaborative for Historical Information and Analysis [16] and the Digital Mitford Archive (www.digitalmitford.org). XSEDE2 will support access to such data collections and work to identify and install appropriate, scalable applications and frameworks for collaborative, data-intensive research (e.g. collaborative editing, data integration and fusion, GIS and overlay systems, and digitization of scanned works). Equally vital, XSEDE2 will provide training opportunities to help researchers transition to XSEDE2 resources.

C.3 Community Engagement & Enrichment

At the core of Community Engagement & Enrichment (CEE) is the user, broadly defined to include anyone who uses or may potentially use the array of resources and services offered by XSEDE. 
The CEE team, led by co-PI and L2 director Kelly Gaither (TACC), is dedicated to actively engaging a broad and diverse cross-section of the open science community, bringing together those interested in using, integrating with, enabling, and enhancing the national cyberinfrastructure. Vital to the CEE mission is the persistent relationship with existing and future users, including allocated users, training participants, XSEDE Conference attendees, XSEDE collaborators, and campus personnel. CEE will unify public offerings to provide a more consistent, clear, and concise message about XSEDE resources and services, and bring together those aspects of XSEDE that have as their mission teaching, informing, and engaging those interested in advanced cyberinfrastructure. The five components of CEE are User Engagement (§C.3.1), User Interfaces & Online Information (§C.3.2), Campus Engagement (§C.3.3), Workforce Development that includes Training, Education & Student Preparation (§C.3.4), and Broadening Participation (§C.3.5). These five teams will ensure routine collection and reporting of XSEDE’s actions to address user requirements. They will provide a consistent suite of web-based information and documentation and engage with a broad range of campus personnel to ensure that XSEDE’s resources and services complement those offered by campuses. Additionally, CEE teams will expand workforce development efforts to enable many more researchers, faculty, staff, and students to make effective use of local, regional, and national advanced digital resources. CEE will expand efforts to broaden the diversity of the community utilizing advanced digital resources. The CEE team will tightly coordinate with the rest of XSEDE2, particularly Extended Collaborative Support Services (§C.4), Resource Allocation Services (§C.7), Community Infrastructure (§C.5), and External Relations (§C.8.2). 
To address continuity beyond XSEDE, we will have prepared a large population of knowledgeable campus personnel motivated to sustain campus interactions, and well qualified to work with national advanced digital resource providers. We will have created a large repository of quality peer-reviewed documentation, training and education materials that others may continue to use, update, and augment with new content. And we will have prepared a larger and more diverse community of students, researchers, and professionals who will be able to advance discovery for many years to follow. C.3.1 User Engagement The mission of the User Engagement (UE) team is to capture community needs, requirements, and recommendations for improvements to XSEDE’s resources and services, and report to the national community how their feedback is being addressed. XSEDE2 will place greater emphasis on traceability in tracking and closing the feedback loop. Led by L3 manager Chris Hempel (TACC), the UE team will process and track actionable items obtained from user feedback and monitor them throughout the UE loop, from assignment to a responsible XSEDE party through communication of subsequent actions back to the user community. To obtain user feedback, we will engage users of XSEDE’s resources and services to gauge overall satisfaction, pervasive problems, emerging needs, and requirements. Integral to


this process is the derivation of requirements from diverse sources—micro-surveys, user satisfaction surveys, user interviews—and turning them into actionable Use Cases that can be tracked and handled in all areas of the XSEDE2 organization. The UE team will use tools provided by the XSEDE Project team, including JIRA and issue tracking software, to monitor requests and enhancements linked to the stakeholders who originated the requirement. The UE team will use this feedback to create a lightweight Use Case document—an encapsulation of user needs via scenarios—attach it to the JIRA issue, and assign it to the responsible XSEDE2 area. UE personnel will provide issue status on the user portal to keep the stakeholders and the general community apprised of progress on actionable items. This ongoing feedback loop will encourage further community input for improving XSEDE2’s resources and services. In XSEDE1, micro-surveys, short questionnaires posted on the XSEDE user portal, proved to be an effective, low-overhead mechanism to poll users on specific topics of interest. Experience suggests that running single micro-surveys for six weeks results in good response rates while minimizing survey fatigue. For example, the most recent XSEDE micro-survey (6 questions available for 6 weeks) measured user satisfaction with user support and training on Stampede and drew approximately 500 respondents. Survey responses indicating user concerns, such as long queue wait times, were directly addressed by staff. The UE team will conduct approximately eight micro-surveys annually and will contact all users with new and renewed allocations. Focus group surveys will be used when feedback recommends personal contact. Area metrics include percentage of active and new PIs contacted quarterly, number of inputs (e.g. 
problems, recommendations) received on a quarterly basis, number of inputs that are addressed by XSEDE and reported to the community on a quarterly basis, and number of improvements to the XSEDE resources and services that result annually.

C.3.1.1 XSEDE Annual Conference Series

XSEDE2 will host the annual XSEDE conference (www.xsede.org/web/conference), bringing together the community and staff to advance knowledge, innovation, and applications of advanced digital resources. The conference has effectively engaged users through tutorials, science and technology talks, poster sessions, student programs, Campus Champions sessions, BOFs, and discussions with sponsors. XSEDE2 will form an XSEDE Conference Steering Committee to set the mission and goals of the conference, to provide oversight, and to ensure continuity. By-laws modeled after organizations such as the SC steering committee will guide the committee’s activities. The committee will select each year’s Conference Chair, who in turn will recruit committee members from XSEDE staff and the community. Area metrics include number of people participating in the annual conference, number of first time attendees, number of papers published, and paper acceptance rate.

C.3.2 User Interfaces & Online Information

The User Interfaces & Online Information (UII) team is committed to enabling the discovery, understanding, and effective utilization of XSEDE’s powerful capabilities and services through effective user interfaces and documentation services. Led by L3 manager Maytal Dahan (TACC), UII has had an immediate impact on XSEDE users from day one, providing them with an information-rich website, the XSEDE user portal, and a uniform set of user documentation. The website is the first place XSEDE stakeholders come to find information about the project, addressing the needs of internal and external stakeholders. The website and user portal will be improved to create a more consistent and easy-to-navigate look and feel. 
The UII team will develop an information architecture to support a variety of stakeholders. This information-centric approach is rooted in the ability to answer fundamental questions when browsing the website: Am I in the right place? Do they have what I am looking for? What do I do now? The redesign will include a new layout, enabling a single web and mobile site regardless of device type. The UII team will expand mobile capabilities and build upon the new iOS and Android applications. Managing and publishing approved content to the site will be handled via workflows that enable multiple members of XSEDE2 to contribute in an organized and effective manner. Prospective and current users of XSEDE quickly navigate from the website to the XSEDE User Portal for user and project related needs. For example, PIs can apply for and manage allocations, and record their research accomplishments via the publications feature in the Portal. UII will expand the existing XSEDE User Portal to integrate features such as data management, job execution, and task management. The UII team will incorporate the XSEDE software catalog and its administrative interface (work previously funded through XSEDE1’s Technology Investigation Service (TIS) team) and continue to improve capabilities based on stakeholder feedback.


The UII team manages documentation enabling users to easily find resource and service offerings. New to XSEDE2, the UII team will enable users to create a dynamic environment and tailor the user portal experience to their individual needs. For example, users with allocations on multiple XSEDE resources will be presented with content related to those specific resources, e.g. job submission. Area metrics measured on a quarterly basis include number of page views, sessions, average duration per visit to both the XSEDE Website and User Portal, number of user portal accounts created, number of user interactions (tickets, survey responses, feedback) with XSEDE, ratio of User Portal enhancements to features released, average number of actions performed for each portal login, percentage of active users logging in to XUP, and number of documents that are new and updated. On an annual basis the area will measure the user satisfaction with the website, user portal, and user documentation. C.3.3 Campus Engagement Ongoing communication and cooperation with campuses will help to ensure that the resources and services being offered on campuses complement those offered by XSEDE, and vice-versa. This collaboration will enhance the advanced digital resources and services provided to the community. The Campus Champions effort has established Memoranda of Understanding (MOUs) with more than 190 campuses. On these campuses there are more than 250 Campus, Domain and Student Champions focused on assisting local users to make informed choices of resources and services that may best meet their needs. The Campus Engagement effort, led by L3 managers Henry Neeman (OSCER) and Dana Brunson (OSU), will extend XSEDE’s relationship with campus personnel by establishing regular communications with CIOs and VPs for research. CIOs have indicated that they value communicating with each other and with XSEDE staff to plan the development and delivery of resources and services on their campuses. 
There will be monthly conference calls, email lists, and forums for CIOs and VPRs to share challenges, solutions, and information. Other campus individuals who have service roles complementary to XSEDE (e.g. cyberinfrastructure integration and support, training, education, and broadening participation) will be engaged to enhance cooperation among campuses and XSEDE. The Campus Engagement program will collect information from each campus quarterly to assess the level of activity in working with local users. The Campus Engagement team will provide additional training and consulting, and work with campuses to strengthen their Champions’ productivity and engagement. Campus Engagement will enhance the “Welcome Wagon/New Champion Development” efforts to provide individualized attention to new Champions so they can more quickly become actively engaged. The number of campus members has more than doubled in four years (from 93 to 192), and we project this growth rate to continue based on continuing requests from campuses. To address this rapid growth, the XSEDE Regional Champions program is actively developing models for regional support. The lessons learned from the Regional Champions program will guide further improvements for scaling support of the member campuses. XSEDE2 alone will not be able to sustain the support needed for the predicted growth of the program. Through collaborations with ACI-REF, Open Science Grid, and the SP Forum, the Campus Engagement program will develop strategies for long-term sustainability. Area metrics include number of institutions contributing resources and/or services and sharing information with other institutions, number of actively engaged champions, and number of faculty, staff, and students using XSEDE’s resources and services. 
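As a rough illustration of the campus membership growth cited above (93 to 192 campuses in four years), the following worked calculation shows what continued growth would imply. It assumes growth compounds at a constant annual rate, which is an assumption made here for illustration only; the symbols g (annual growth rate) and N_5 (membership after five further years) are introduced for this sketch and are not project targets.

```latex
% Implied annual growth rate, assuming constant compounding over four years
\[
  g = \left(\frac{192}{93}\right)^{1/4} - 1 \approx 0.199
\]
% Membership after five further years at the same (assumed) rate
\[
  N_{5} = 192\,(1+g)^{5} = 192\left(\frac{192}{93}\right)^{5/4} \approx 475
\]
```

Under this assumption, membership would grow by roughly 20% per year and reach about 475 campuses after five more years, which is consistent with the need for regional support models described above.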
C.3.4 Workforce Development Workforce Development will provide an integrated suite of training, education, and student preparation activities to address formal and informal learning of advanced digital resources. Led by L3 manager Scott Lathrop (Shodor), workforce development will address the needs of researchers, developers, integrators, IT staff, XSEDE staff, and undergraduate and graduate faculty and students. CEE will provide business and industry with access to XSEDE2’s workforce development efforts including training services and student internships that have proven beneficial to industry in XSEDE1. C.3.4.1 Training The Training team will develop and deliver training programs to enhance the skills of the national open science community and ensure productive use of XSEDE’s cyberinfrastructure. XSEDE will expand the breadth and depth of XSEDE training content based upon a gap analysis of current programs and needs identified by the UE team. Led by Susan Mehringer (CAC), XSEDE2 will expand on existing training


roadmaps to include information on which training courses have been vetted and provide pointers to materials available from XSEDE as well as external training providers. Survey data will be collected to assess and improve upon respondents’ abilities to easily find the needed material. The training team will fully implement the XSEDE training certification program for users and staff in an effort to recognize learners who demonstrate competencies attained through participation in XSEDE training offerings, enabling them to gain recognition for their accomplishments. The Moodle Learning Management System and Mozilla’s Open Badges Infrastructure (OBI) are the basis for implementation. Badges for an additional three competencies will be offered in PY6 with a goal to issue at least ten badges to XSEDE staff and fifty badges to XSEDE users, with 10% growth planned for successive years. During PY5 of XSEDE1, we will offer short duration Massive Open Online Courses (MOOCs). Based on a user assessment of these MOOCs, XSEDE2 will enrich the interactive and hands-on portions of these training offerings and transform them into smaller, more effective SPOCs (Small Private Online Courses) offered quarterly. Past evaluation data shows SPOC students were much more motivated and had a higher completion rate when provided with mentoring and a badge or university credit. XSEDE2 will coordinate training development and offerings with campus representatives and HPC centers interested in developing, delivering, and/or using training materials. The University of Illinois’ Computational Science and Engineering group (see Appendix IV Letters of Collaboration), the Software Carpentry group (see Appendix IV Letters of Collaboration), and the Data Carpentry groups have committed to collaborate with XSEDE. 
These collaborations will gather user requirements for training, share plans for developing training materials among these groups, and foster sharing of training development and the resulting materials. The objective is to expand the breadth and depth of training so researchers, users, students, and XSEDE staff will have ready access to an ever-expanding portfolio of training opportunities delivered via live, broadcast, and online learning platforms. Area metrics include number of people registered quarterly, number of certifications awarded quarterly, and an annual average impact assessment of the training offerings by the participants. C.3.4.2 Education The education team will work closely with training and student preparation to create a cohesive team supporting faculty in all fields of study about advanced digital technologies, and incorporating those capabilities within the undergraduate and graduate curriculum. Led by Robert Panoff (Shodor), XSEDE2 will expand on XSEDE1 efforts by developing an online community for faculty to share experiences and get advice on curriculum materials and development. XSEDE2 will work with faculty to develop 50 new, re-usable learning modules and materials. This will include modules for introducing computation and data-enabled techniques within STEM classes and student oriented projects. XSEDE will disseminate educational materials to provide public access to a growing base of peer-reviewed materials that will enhance the graduate and undergraduate experience and contribute to preparing future generations. The education team will visit campuses and attend regional workshops for faculty. This outreach has proven to be crucial in engaging faculty with integrating computational and data-enabled tools and methods into the curriculum. The campus visits and faculty support have been instrumental in motivating and assisting departments and colleges with developing certificate and degree programs. 
The outreach also helps raise awareness and usage of the repository of training and education materials available from the XSEDE User Portal for re-use by the community. Area metrics include number of faculty that incorporate advanced digital resources into the curriculum, number of materials contributed to a public repository, and number of materials downloaded from the repository, all on an annual basis. C.3.4.3 Student Preparation The Student Preparation program will actively recruit students to use the aforementioned training and education offerings to enable the use of XSEDE resources by undergraduate and graduate students. Evaluation data show XSEDE’s overwhelmingly positive impact preparing college students to conduct computational science and research. Led by Rosalia Gomez (TACC), XSEDE2 will reach thousands of students annually via the vast array of training offerings. XSEDE2 will provide badging and certification for students on a diverse range of topics including parallel programming, visualization, data analytics, and software engineering practices. XSEDE2 will broaden participation by engaging with students via conference exhibitions, campus visits, regional workshops, and national conferences (§C.3.5).


XSEDE will reach out to externally funded student programs, such as NSF Graduate Research Fellows, the NSF Research Experience for Undergraduates (REU), Integrative Graduate Education and Research Traineeship (IGERT), and Broadening Participation in Computing programs. The student preparation program will also establish partnerships with national student organizations (e.g. SIAM, ACS, ACM). The students in these programs will have the opportunity to access XSEDE’s resources and services, including the workforce offerings. Area metrics include number of students, including under-represented students, using XSEDE resources and services on a quarterly basis, and number of students presenting XSEDE-enabled research at various conferences.

C.3.5 Broadening Participation

Broadening Participation will engage under-represented minority researchers from domains that are not traditional users of HPC, and from Minority Serving Institutions. This target audience ranges from potential users with no computational experience to computationally savvy researchers, educators, Campus Champions, and administrators who will promote change at their institutions for increased use of advanced digital services for research and teaching. Led by Linda Akli (SURA), XSEDE2 will provide awareness activities—conference exhibitions, campus visits, and regional workshops—while increasing national impact through new partnerships such as the Southern Region Education Board Doctoral Scholars Program, the Institute for African-American Mentoring in Computing Sciences, and the Computing Alliance for Hispanic-Serving Institutions. XSEDE2 will aggressively promote the submission of papers to professional societies by under-represented XSEDE users and expand our dissemination partners to include new initiatives such as the IEEE Special Technical Committee on Broadening Participation. 
Persistent participation is enabled by curriculum reform and larger numbers of researchers adopting the use of advanced digital resources as standard methods. Collaboration with Campus Engagement (§C.3.3) and Education (§C.3.4.2) will support institutional change and capacity building. XSEDE will target institutions with funded initiatives to implement curriculum changes and increase research capacity. Using the model of the Service Provider Forum, an XSEDE Diversity Forum will be established with outreach and diversity managers at HPC centers and on campuses. The forum participants will share best practices, identify ways to leverage XSEDE activities, and review XSEDE programs to ensure they are encouraging diversity. The diversity forum will be responsible for engaging new national programs and initiatives, institutions with funding to make curriculum change and research infrastructure investments, and major research grant awards at MSIs or with a focus on broadening participation. Area metrics include number of under-represented individuals using XSEDE resources and services on a quarterly basis, number of sustained under-represented individuals using XSEDE resources and services annually, and satisfaction assessment of the Diversity Forum.

C.3.6 Community Engagement & Enrichment KPIs

KPIs for CEE are derived from our sub-area activities. We have goals to measure the number of sustained participants to begin understanding the persistent impact the project has on current and potential users and under-represented communities.

KPIs | Annual Target | Sub-goal Supported (§C.1.3)
Number of new users of XSEDE resources and services | >1,000 | Extending use to new communities
Number of sustained users of XSEDE resources and services | >5,000 | Extending use to new communities
Number of new under-represented users of XSEDE resources and services | >100 | Extending use to new communities
Number of sustained under-represented users of XSEDE resources and services | >1,000 | Extending use to new communities
Average rating of users and staff regarding how well-prepared they feel to perform their jobs | 4 out of 5 | Prepare the current and next generations
Registrants for Training | 10% increase annually | Prepare the current and next generations
Average impact assessment of the Training | 4 out of 5 | Prepare the current and next generations
Website (# of unique visitors) | 80,000 | Raise awareness

Table C-2: KPIs for Community Engagement & Enrichment.

C.4 Extended Collaborative Support Service: Enabling the Research Community

Domain scientists should not have to be experts in all areas of cyberinfrastructure to achieve their goals. At the leading edge, science requires a team. The ECSS program provides professionals who can be part of such teams: dedicated staff who develop deep, collaborative relationships with XSEDE users, helping them make best use of XSEDE resources to advance their work. These professionals possess combined expertise in many fields of computational science and engineering. They have a deep knowledge of underlying computer systems and of the design and implementation principles for optimally mapping scientific problems, codes, and middleware to these resources. ECSS includes experts not just in the traditional use of petascale computing systems but also in data-intensive work, workflow engineering, and the enhancement of scientific gateways. In each of XSEDE’s annual reviews, ECSS has been singled out as the most valuable aspect of XSEDE. Encouraging feedback we have received from researchers includes:

• “It’s great to have people who understand the discipline and speak the scientists’ language.”
• “This is the best support I’ve ever had for any computing. We pay for support at Amazon and it’s nowhere as good as ECSS support.”
• “ECSS support has been critical. We couldn’t do what we needed to do on our own.”
• “This work allowed us to do things we hadn’t even considered before.”
• “I learned what better programming looks like.”

The extended collaborations available through ECSS complement initial engagements with users through the XSEDE Operations Center helpdesk and Community Engagement & Enrichment (CEE). During XSEDE1, ECSS has undertaken hundreds of collaborative projects. These last for at least one month and are expected to have significant deliverables within a year.
Staff members typically spend 20-25% of their time on a single project, but there is flexibility in how these projects unfold. A staff member may spend a month working full time on a project, but often the collaborations last an entire year at a smaller level of effort to align with the availability of the PI’s team. PIs who want or need a multi-year effort are encouraged to write ECSS staff members into their grants, and some have done so. XSEDE 1 average satisfaction ratings from PI interviews are 4.52 out of 5. Therefore we are not planning major changes to the ECSS program; instead, the changes we propose emerge from lessons learned. We face an ongoing challenge to ensure that the ECSS pool of expertise matches the needs faced by users. To support new computing modes being introduced by SPs (e.g., virtual machines, Hadoop, databases), we will require SPs to retarget their ECSS support to staff with relevant expertise. We have successfully used this strategy in the past to build ECSS expertise in genomics, GPUs, and Xeon Phis. For expertise not already present in ECSS, a major innovation in XSEDE1 was contract hiring within ECSS, which was used to hire experts in digital humanities, workflows, and data analytics. These experts have now been integrated into the ECSS staff. ECSS will compete for portions of the XSEDE Innovation Fund (§C.8) when we identify a need for which contract hiring is the proper mechanism. Finally, we will use continuing staff development to ensure that ECSS staff learn from other projects and build expertise. The ECSS Symposium series occurs monthly, and we will hold our first in-depth, hands-on training sessions for ECSS staff at XSEDE15, an activity that will continue going forward. ECSS projects fall into five categories, which are further described below. Because of the size of this program (29 FTEs, ~70 experts), we will continue to coordinate these as two larger groups. 
ECSS-Projects, led by co-PI and L2 director Ralph Roskies (PSC), comprises Extended Support for Research Teams (ESRT) and Novel and Innovative Projects (NIP). ECSS-Communities, led by co-PI and L2 director Nancy Wilkins-Diehr (SDSC), comprises Extended Support for Community Codes (ESCC), Extended Support for Science Gateways (ESSGW), and Extended Support for Training, Education and Outreach (ESTEO). Roskies and Wilkins-Diehr have developed an effective working relationship, each one helping the other when needed. Moreover, the assignment of experts to these five categories will continue to be very fluid. The same expert may participate in an ESRT project and an ESTEO project at the same time, and sometimes a single project will require expertise from multiple areas of ECSS. Project-based ECSS support is requested by researchers via the XSEDE peer-review allocation process. If reviewers recommend support and if staff resources are available, the ECSS expert and the requesting PI develop a work plan outlining the project tasks. The work plan includes concrete quarterly goals and staffing commitments from both the PI team and ECSS. ECSS managers review work plans and also track progress via quarterly reports.
ECSS staff will be active in numerous other activities in XSEDE. Besides helping scientists with their codes, ECSS will provide the expertise for the CEE training program (§C.3.4.1) and will assist the Resource Allocation Service (RAS, §C.7) by conducting allocation reviews of smaller-scale Research requests and all Educational requests. ECSS staff review hundreds of requests each year. For XCI, ECSS will provide use cases and participate in technical reviews. Roskies and Wilkins-Diehr will direct overall ECSS activity, oversee the process for identifying and filling needed expertise, interact with the User Advisory Committee, and supervise the ECSS management team. This team will comprise L3 managers for each area and two half-time project managers. The management team will be responsible for the selection and execution of all ECSS projects as well as project reporting and planning. The management team will assign staff members to projects by mapping available skills to needs and will manage the ECSS resource planning process. Roskies or Wilkins-Diehr will interview all PIs who received project-based ECSS support from ESRT, ESCC, and ESSGW after their projects conclude. From these interviews, we will obtain the satisfaction rating and the impact rating that go into our KPIs. PIs provide invaluable feedback on how we can improve ECSS operations and on other aspects of the XSEDE and SP programs; we relay this feedback to the relevant people to improve overall XSEDE and SP activities. To address continuity beyond XSEDE, we will be leaving a cadre of experts invaluable to the national open science community. We have demonstrated that they can be managed remotely. All of our processes, templates, meeting minutes and topics are captured in the wiki and will be made available to the next team managing the XSEDE cyberinfrastructure and services.
C.4.1 Extended Support for Research Teams (ESRT) ESRT will be led by L3 manager Lonnie Crosby (NICS). An ESRT project, requested by a research team, is a collaborative effort between the team and ECSS staff to enhance the group’s ability to use XD resources and related technologies. Since 2011, ESRT has had 147 projects with work plans. ESRT projects can require many different types of expertise. For example, in Douglas Spearot’s project (§C.2.3), ECSS staff streamlined workflows, automated visualization techniques, and improved the parallelization and scalability of Spearot’s code. More generally, ECSS staff will contribute expertise in performance analysis, petascale optimization, effective use of accelerators, I/O optimization, data analytics, and visualization, in addition to domain knowledge. Expertise in ESRT, as in all areas of ECSS, will adapt and expand to new areas reflecting the needs of the user community. Area metrics will include the annual number of projects initiated, the number discontinued, and the number with work plans. C.4.2 Novel & Innovative Projects (NIP) The NIP initiative, launched in 2011 and extremely fruitful under the leadership of L3 manager Sergiu Sanielevici (PSC), proactively develops projects in areas of science and scholarship that have traditionally not used advanced CI. The team collaborates with researchers and educators from underrepresented groups. The program has been especially successful generating and mentoring advanced computing projects in bioinformatics; machine learning; image, text and social network analysis; and cross-disciplinary studies. Some are featured in science highlights (§C.2.3). NIP developed the XSEDE Domain Champions program, in which community leaders increase awareness of XSEDE and provide ideas and feedback. NIP has brought 46 projects to the startup stage, and 20 of those have received XRAC awards. 
NIP will focus on efficiently exploiting the capabilities of new SPs and complementary components of the national advanced computing and data ecosystem. NIP experts will work closely with SPs, recruit and steer appropriate user groups to the most suitable resources, and mentor them to ensure the success of their projects. In particular, the efficient use of the science gateway, virtual environments, and data hosting and analysis support offered by the new SPs should significantly boost the return on NIP effort. These environments promise to greatly reduce the barriers between end-users and the advanced computing ecosystem, especially for people in non-STEM fields and at under-resourced institutions. NIP will expand its efforts to additional disciplines, such as computational mathematics, applications of geographical information systems, and the arts. Suggestions will be sought from advisory bodies, NSF

program directors, and XSEDE internal sources. To improve its impact on underserved minorities, NIP will further strengthen its collaboration with CEE, paying special attention to the development and mentoring of projects that improve the quality and efficiency of teaching at under-resourced institutions. We will use the contract hiring and Domain Champion recruitment processes, as well as the Campus and Regional Champion programs, to ensure active participation by underrepresented groups in the work of NIP. Area metrics will include the number of new XSEDE projects from target communities that NIP helps to generate each year, and the number of such projects that successfully use XSEDE services (tracked by allocation usage and publications). C.4.3 Extended Support for Community Codes (ESCC) Led by L3 manager John Cazes (TACC), ESCC efforts will be aimed at deploying, hardening, and optimizing software, with the focus on disciplinary software used by extensive research communities rather than software used by individual research teams. ESCC projects include collaboration with the developers of widely used community applications and models and may include industry partners. Optimizations made here affect large numbers of users, both within ECSS and elsewhere. For example, an ESCC project to optimize the astrophysics code GADGET resulted in improvements that were included in the open-source code base and are now available to anyone downloading the code. Work plans for projects are developed as described for all ECSS projects. Since 2011, ESCC has had 48 projects with work plans. ESCC projects can be proposed by the developers of community codes, the ESCC manager or suggested by staff, XSEDE leadership, and advisory boards. Priority will be given to helping projects funded by NSF programs (e.g., PetaApps, SDCI, STCI, SI2, MREFC) to generate robust, sustainable, and maintainable community applications. 
XSEDE also supports user-controlled Community Software Areas (CSAs) where any developer can get an account and install and publicize their software. The ability to request CSAs will be featured more prominently in XSEDE2. Area metrics will include the annual number of projects initiated, the number discontinued, the number with work plans, and the number with adequate documentation.

C.4.4 Extended Support for Science Gateways (ESSGW)

ESSGW will be led by L3 manager Marlon Pierce (IU). Science Gateways are community-designed, web-based interfaces that build on XSEDE (and other) resources to provide services to their communities. Gateways play a critical role in expanding XSEDE’s user base and account for 40% of all XSEDE users. But the needs of gateway developers can be significantly different from those of researchers requesting other types of ECSS assistance. Gateways require well-defined, secure, web-accessible programming interfaces which are used for remote job submission, monitoring, and management; remote file and data management and transfer; and information services describing the state of hardware and networks, available software, queuing system wait times, and similar information. ESSGW staff can often use lessons learned working with one user team to advise another. Best practices will continue to be captured through activities like the gateway cookbook [17]. ESSGW staff members also bring in expertise in areas such as workflows, data analytics and digital humanities and often recruit new gateways through their connections in the community. Since 2011, ESSGW has completed 25 projects with work plans. ESSGW support will increase in importance with the introduction of new resources such as SDSC Comet, PSC Bridges, TACC Wrangler, and Indiana’s Jetstream. These resources are designed to support gateway usage and novel user groups. Better integration with ESTEO and CEE will also be a priority for ESSGW in order to introduce new users to gateways.
Area metrics include the annual number of projects initiated, the number with work plans, the number of Gateway users, and the number of architectural use cases derived from engaging with the larger science gateway community. C.4.5 Extended Support for Education, Outreach & Training (ESTEO) ESTEO, led by L3 manager Jay Alameda (NCSA), coordinates bringing technical expertise of ECSS staff members to support CEE efforts. Staff deliver training in many venues—at XSEDE sites, on campuses, at conferences and offered virtually. One R training class included 30 live participants and 420 online, who shared experiences and helped one another via an active Twitter chat throughout the event. Such events attract around 30,000 registrations each quarter. ESTEO staff serve on committees within XSEDE’s CEE area to jointly plan and support these activities. ESTEO experts develop, review, and present technical

content in all areas of ECSS expertise. Researchers can also request ESTEO support for resource use in support of education. ESTEO staff review education allocation requests and suggest new topics for training. They are often on the front lines of engagement with current and prospective users and refer information and leads to ECSS management for follow-up action. The Campus Champions Fellows program pairs XSEDE Campus Champions (§C.3.3) with ECSS staff members to work together on ECSS projects for one year. Fellows commit 400 hours per year and receive a stipend and travel support in order to participate. For ECSS staff, acting as a mentor to a Fellow counts as an additional ECSS project, allowing time to participate substantially in the mentoring exercise. The goal is to enhance the effectiveness of Fellows on their campus. Area metrics include the number of presentations, the number of attendees, the number of staff training events, and the average assessment of staff training impact. C.4.6 ECSS KPIs KPIs for ECSS are derived from our project-based activities. We have goals for the number of projects we complete annually and ratings from PIs both for satisfaction and impact.

KPIs | Annual Target | Sub-goal Supported (§C.1.3)
# of completed projects (ESRT + ESCC + ESSGW) | 50 | Deepening use
Average satisfaction with ECSS support from PI follow-up interviews | 4.5 of 5 | Deepening use
Average impact rating from PI follow-up interviews | 4 of 5 | Deepening use

Table C-3: KPIs for ECSS.

C.5 XSEDE Community Infrastructure: Adding value to the CI ecosystem

It is important that XSEDE operates in a coherent national CI ecosystem composed of resources funded by NSF as well as a growing number of resources funded by colleges and universities and a growing dependence on commercial cloud providers. Led by L2 director David Lifka (Cornell), the XSEDE Community Infrastructure (XCI) team is a new unit of the XSEDE organization, focusing on interactions between XSEDE and the national community of CI providers and users. We envision enabling users targeting services allocated by XSEDE (including OSG resources), campus-based CI facilities, commercial cloud providers, CI software services such as Science Gateways and Globus Online, and even the individual researcher who wants to interact effectively with the national CI via her or his own personal system. Through XCI, XSEDE2 will serve an aligning function within the nation not by rigorously defining a particular architecture, but rather by assembling a technical architecture that facilitates interaction and interoperability across the national CI community. The suite of interoperable and compatible software tools that XSEDE2 will make available to the community will be based on those already in use by XSEDE, such as Globus, but will add services that address emerging needs including data and computational services. The software and tools distributed by XSEDE2 will adhere to widely held community standards, providing the base for a high degree of interoperability and compatibility among the CI community partners.
XCI is responsible for understanding the community infrastructure requirements in the form of use cases gathered by the XSEDE User Advisory Committee (UAC), XSEDE users via CEE, XD SPs, and commercial cloud service providers. XCI uses those requirements to identify existing tools and services that meet those requirements or identifies and evaluates new tools from the community that do so. After testing those tools to ensure proper security and integration with existing XSEDE services and tools, they will be tested with the stakeholders that requested them to ensure they address the expressed needs. The tools and services will then be made available in the XCSR along with instructions on how to deploy them. XCI will work with CEE to promote the availability of these new capabilities and hold regular workshops and training to assist the community in their deployment. Feedback will be requested regularly on how well these capabilities are meeting or can be extended to better meet the requirements of the community. XCI will create the XSEDE Community Software Repository (XCSR), a service and tool catalog available to the national community via the XSEDE website. This catalog will list all services and tools, which SPs have them installed, and links to the source code and/or installation packages along with documentation

necessary to install and configure them. A list of all use cases, their stakeholders, and current status will also be cross-referenced with each service or tool. This information will inform discussions of priority and importance with stakeholders and the national community. All this information will be stored in the XCSR, a core deliverable and vehicle for our handoff strategy to the XSEDE successor(s) at the end of PY10. XSEDE is moving from a direct support model to a subscription model. XCI will identify opportunities to leverage cloud providers for select elements of service delivery in order to provide a sustainable and scalable approach for integrating critical services and tools into the ecosystem. For example, XSEDE will enter into a subscription service with Globus Transfer as opposed to funding Globus staff directly. This agreement will allow closer monitoring and metrics for transfers, Globus Plus services, and data publication options. We will also provide gateway-hosting services as part of the XSEDE organizational infrastructure—hosted within XSEDE on a server to be called XGH (XSEDE Gateway Hosting), based on a refresh of the existing Quarry Gateway Hosting. We expect that over time XSEDE will adopt more cloud-hosted services for its technical infrastructure. Rather than treating these services as part of XSEDE operations, we will take a peer-to-peer approach where XSEDE will interact, contract, and report on the value of such cloud-like infrastructure services as part of our community interaction activities. Because return-on-investment will be a priority, we will also work with outside developers and software providers to instrument their tools so we can measure usage in a consistent way and ultimately feed into XDMoD—the portal by which we share resource and service usage information [18]. Software as a Service (SaaS) providers such as Globus and Science Gateways will be required to provide usage data as well. 
It will not be possible or even practical to instrument all codes for usage tracking, but anything that requires a significant financial or personnel investment by XSEDE will be an important target. Working with the community, we will communicate this aspect as a critical part of developing community code. We will provide examples and workshops where possible to assist the community in this effort, or a request can be made for ECSS support. XCI will maintain its website, associated software repository, and user testimonials in accordance with the overall XSEDE transition plan to ensure continuity beyond XSEDE2. In particular XSEDE partners at University of Chicago, Indiana University, and Cornell University take this responsibility on as part of their institutional commitments in participating in XSEDE2. C.5.1 Requirements Analysis & Capability Delivery (RACD) Led by L3 manager John-Paul Navarro (Argonne), the RACD team will facilitate the timely delivery of new capabilities by: identifying technical requirements implicit in use cases, identifying functionality gaps and needed capabilities to enable use cases, working with software providers to fill those gaps, and coordinating the engineering work necessary to deliver capabilities to CI operators and user communities in the US. The technical services that RACD provides in support of engineering collaboration are: requirements analysis and capability delivery planning; documentation of the infrastructure design and component interfaces (APIs); organization of stakeholder design and security reviews; and the preparation of software and associated documentation for dissemination and use. Initial tool suites will focus on these XSEDE1 identified high priority needs: Infrastructure Discovery Services (resource, software, and service); User and Group Discovery Services; and Engineering Discovery Services (use cases, deployable components, and engineering activities). 
All of the software implementation, dissemination, and support will be carried out with a sense of “enabled by XSEDE,” rather than “created and branded XSEDE.” Software will be distributed to the XSEDE community through the XCSR. CI operators and the national user community will be enabled and encouraged to treat this repository much like a large menu—where people who manage a CI resource can select those tools that are relevant to the needs of their resource users and the purpose of their CI resource. XSEDE will recommend tools that are appropriate for use in a particular circumstance, or the sets of tools for particular CI provider groups (e.g., Level 1 SPs or campus resources). XCI will create and manage use case capability delivery plans. These plans may be put on hold if constraints limit XSEDE’s ability to deliver a capability, or there is insufficient ROI. RACD area metrics include the number of capability delivery plans prepared for prioritized use cases, the number of CI integration assistance engagements, the number of components delivered to testing by Capabilities Evaluation and Testing, the customer rating of components delivered to testing, and responsiveness to defect and support requests.
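As an illustration only, the XCSR catalog described in §C.5 (tools and services, the SPs that have them installed, links to source code, cross-referenced use cases, and recommendations for particular provider groups) could be modeled along the following lines. This is a minimal Python sketch under stated assumptions, not the XCSR design: the class, field, and function names, the tool, SP, and use case identifiers, and the example.org URLs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One tool or service in a hypothetical XCSR-style catalog."""
    name: str
    source_url: str                                     # link to source or packages
    installed_at: set = field(default_factory=set)      # SPs with the tool installed
    use_cases: list = field(default_factory=list)       # cross-referenced use case IDs
    recommended_for: set = field(default_factory=set)   # provider groups


def tools_for(catalog, provider_group):
    """Return the names of tools recommended for one provider group."""
    return sorted(e.name for e in catalog if provider_group in e.recommended_for)


def use_case_index(catalog):
    """Map each use case ID to the tools that address it."""
    index = {}
    for entry in catalog:
        for use_case in entry.use_cases:
            index.setdefault(use_case, []).append(entry.name)
    return index


# Hypothetical entries; none of these names come from the proposal.
catalog = [
    CatalogEntry("transfer-client", "https://example.org/transfer-client",
                 installed_at={"sp-a", "sp-b"}, use_cases=["UC-1"],
                 recommended_for={"level-1-sp", "campus"}),
    CatalogEntry("resource-discovery", "https://example.org/resource-discovery",
                 installed_at={"sp-a"}, use_cases=["UC-1", "UC-2"],
                 recommended_for={"level-1-sp"}),
]

print(tools_for(catalog, "campus"))
print(use_case_index(catalog))
```

A campus operator would call `tools_for(catalog, "campus")` to see the subset of the "menu" relevant to a campus resource, while `use_case_index` gives stakeholders the cross-reference from use cases to the tools that address them.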

C.5.2 Capabilities Evaluation & Testing (CET) Led by L3 manager J. Ray Scott (PSC), the CET team will facilitate the delivery of capability components to XSEDE by working with the RACD team to identify and perform evaluations on candidate capability components to deliver to XSEDE users; performing acceptance testing on capability components the RACD team has engineered to integrate with XSEDE; and assisting the Capability and Resource Integration (CRI) team for production deployment. A key function of this area is in identifying and testing technologies that are candidates for adoption by XSEDE and the national CI community. This will include operating the XCSR catalog, performing initial functionality and security analysis on test bed resources, and conducting trials at SPs and engaging stakeholders that requested the capabilities to ensure their needs are met. In keeping with NSF guidance, when RACD and CET have identified a tool that meets a particular community need, we will strive to make that tool available to the community as quickly as possible. Our goal will be to go from the initial identification of a community need to delivery of a tool implementation ready for use by Level 1 SPs in twelve months. The CRI engagement with SPs will include presentations, workshops, and training on the use, installation, and configuration of new capabilities and services. CET area metrics include the number of new technologies evaluated and the percentage of capability identification, evaluation, and testing requests from the RACD team completed. C.5.3 Capability & Resource Integration (CRI) Led by L3 manager Richard Knepper (IU), the CRI team will manage and coordinate working with SPs and campuses to maximize the aggregate utility of national cyberinfrastructure. For SP integration, CRI will have an SP coordinator who will focus on XSEDE interactions with SPs and the SP Forum. 
CRI will also engage with other national CI organizations such as ACI-REF, EDUCAUSE, SURA, CASC, the Open Science Grid, and campus CI providers. These interactions will play a strong role in cost/benefit analyses and priority setting. They will inform and help all CI providers serving the U.S. research community understand each other’s needs and the needs faced by their users, promote best practices, and synergize provided services. By working with CRI, CI providers will gain a clear understanding of the costs and benefits of interoperability and interaction with XSEDE. CRI will extend and complement Campus Bridging activities in XSEDE1 by establishing closer links with organizations that make use of these technologies and soliciting input and greater participation from these stakeholder organizations. CRI will help CI providers of particular system types by creating “toolkits” within the XCSR that correspond to common usage modalities. The first of these toolkits will be the XSEDE National Integration Toolkit (XNIT). XNIT will include tools that can be installed on a campus cluster to promote interoperability with the national cyberinfrastructure, including XSEDE. XNIT will largely replace what is now called the XSEDE Compatible Basic Cluster (XCBC); however, we will maintain a Rocks distribution of the XCBC for those interested in new cluster installations. XNIT will include a “laptop suite” of tools that can be installed on a workstation or laptop computer at any site, with infrastructure and scientific software to enable researchers to interact effectively with the national cyberinfrastructure from their own personal system. CRI area metrics include the number of repository subscribers to XCRI cluster and laptop toolkits, the aggregate number of TeraFLOPS of cluster systems using XCRI toolkits, and the number of partnership interactions between XCRI and SPs, national CI organizations, and campus CI providers. 
C.5.4 XSEDE Community Infrastructure KPIs

KPIs for XCI focus on integrating new capabilities and developing relationships with new stakeholder communities. XCI goals include overall satisfaction with activities introduced by XCI, the ratio of delivered capabilities to those planned, and the utilization of toolkits and implementation of recommended tools, reflecting both stakeholder satisfaction and impact of XCI activities.

KPIs | Annual Target | Sub-goal Supported (§C.1.3)
Satisfaction rating on XCI Services overall and those introduced in the last program year | 4 out of 5 | Create an open and evolving e-infrastructure
# of capabilities delivered / # planned | 100% | Create an open and evolving e-infrastructure
Total number of systems that use one or more CRI provided toolkits | 10 | Create an open and evolving e-infrastructure
% of Level 1 total systems that fully incorporate all of the recommended tools from the XSEDE Community Repository | 100% | Create an open and evolving e-infrastructure

Table C-4: KPIs for XSEDE Community Infrastructure.

C.6 XSEDE Operations Led by L2 director Dr. Gregory Peterson (NICS), XSEDE Operations will maintain and evolve an integrated CI capability of national scale, incorporating a wide range of digital capabilities to support the diverse national scientific and engineering research effort. To this end, XSEDE will provide first-class facilities, support, and services for existing and emerging research and educational users via improved technical capabilities and services, coordinated operation of distributed resources, a 24x7 operations center, and highly accessible documentation. XSEDE Operations will increase interactions with other organizations operating computational resources such as university campuses, national computational resource providers such as Blue Waters, NCAR, the Open Science Grid (OSG), and other large NSF projects, with the goal of sharing information, saving costs, and reducing duplication of effort. XSEDE Operations will innovate through expanded trend tracking and monitoring, and the analysis and visualization of operational data related to the ticket system, data transfers, and system information that Operations collects. These efforts complement the RAS data analytics efforts pertaining to SP resource usage. New visualization capabilities will include additional public and internal dashboards on the XSEDE website, wiki, portal, and mobile apps. These capabilities will provide new insight and business intelligence for guiding decision making by NSF, XSEDE Operations, SPs, and users. XSEDE Operations will build upon the current operational successes with continued improvement based on XSEDE management guidance, advisory inputs, and NSF review panel recommendations. The function of two groups formerly in Operations, Software Testing and Deployment (ST&D) and Accounting and Account Management (A&AM), will now be part of XCI (§C.5) and RAS (§C.7), respectively. 
To streamline coordination and eliminate duplication of effort, XSEDEnet and Data Services will merge to create Data Transfer Services. Thus XSEDE Operations will encompass four areas that handle day-to-day operational tasks impacting the entire project: Cybersecurity, Data Transfer Services, XSEDE Operations Center, and Systems Operational Support. The Operations L2 director and management team will regularly assess progress toward goals, address cross-group challenges, and plan collaboratively; progress will be recorded and shared on the XSEDE staff site.

C.6.1 Cybersecurity

Led by L3 managers Randy Butler (NCSA) and Jim Marstellar (PSC), the Cybersecurity group will continue its excellent track record in protecting the confidentiality, integrity, and availability of XSEDE resources and services. This includes monitoring to detect and respond to any security incidents and providing guidance for security policies. During PY1-4, there were no XSEDE security incidents due to the effective security approaches employed by the security team. Given the breadth and number of security threats, this accomplishment attests to our cybersecurity program effectiveness and coordination. The Cybersecurity group will continue to expand and improve XSEDE security while minimizing impact on users and their productivity. To further increase awareness and rapid response to threats, knowledge from individual sites will be aggregated and applied across all of XSEDE. A real-time intelligence sharing service for SPs will be deployed that will leverage the Research and Education Networking Information Sharing and Analysis Center (REN-ISAC) Collective Intelligence Framework (CIF) for exchanging attack intelligence. The CIF is an NSF-funded project to improve local protection against cyber threats by sharing security event information in near-real time [19].
This real-time intelligence will feed into a system that will shunt traffic related to the IP addresses of bad actors such as password attackers or network scanners into a black hole network, thereby eliminating the threat. We will extend this service beyond XSEDE sites to include campus participants and operators of Science DMZs [20]. We will further extend our intelligence system to perform cross-site analysis to look for scans, account attacks, and other suspicious activities that do not reach thresholds at any one site but do trigger an alert when the same action is identified across multiple sites. This derived intelligence will then be shared with all participating sites. This approach promises to transform our ability to monitor and respond to a broad spectrum of attacks, with concomitant potential impact on the entire national CI ecosystem. The widespread deployment of virtualization technologies including Docker and OpenStack highlights the critical need to understand cybersecurity best practices for these environments. Similar issues result from the adoption of public, private, and hybrid cloud services. We will develop best practices around these topics, document them, and aggressively work to disseminate this information. Outreach to campus CIOs and IT staff through the CEE Campus Engagements program (§C.3.3) and to Science DMZ operators will further include a collaborative effort with ESnet, the Bro Center of Excellence, and the new Cybersecurity Center of Excellence, with the goal of further documenting security best practices and training campus operators in them. Finally, we will also develop specific training for security staff at XSEDE SPs to cover policy, process, controls, and best practices within XSEDE. Area metrics include the availability and communication of security best practices to SPs and beyond, staff security awareness program participation, adoption of the REN-ISAC CIF, and adoption of two-factor authentication for XSEDE central servers.

C.6.2 Data Transfer Services

Led by L3 manager Tim Boerner (NCSA), Data Transfer Services (DTS) will refocus on end-to-end data transfer performance, functionality, and efficiency in user workflows. DTS will continue efforts to improve users’ ability to move data between instruments, resources, centers, and campuses. Where applicable, DTS will leverage emerging analytics capabilities and software defined networking (SDN) tools to improve performance, provide quality of service capabilities, and monitor network health and efficiency.
These efforts will include: (1) migration of XSEDEnet Internet2 connections from 10 Gbps to 100 Gbps; (2) modifications to the XSEDEnet routing policy over Internet2 to create a de facto best path to XSEDE resources for all Internet2-connected sites, thereby broadening the community served by XSEDEnet connectivity; (3) improvements to end-to-end data transfer throughput along with documenting and sharing best practices for large-scale data transport with the community; (4) deepening of engagement with the national Research & Education networking community through direct participation (e.g., panels, papers, posters) in the Internet2 Global Summit, Internet2 Technology Exchange, and other conferences; and (5) investigating and deploying new methods for instrumenting the entire data-transport path to provide greater insight into performance issues and to enable research into new data transport techniques, such as scheduled bandwidth reservations and instrumented kernels. Area metrics include data transfer volume, end-to-end performance for data transfers of a sufficient size (e.g., GridFTP and SCP), adoption of GridFTP data transfer servers by all SPs at all levels, and Science DMZ awareness by all SPs.

C.6.3 XSEDE Operations Center (XOC)

Led by L3 manager Mike Pingleton (NCSA), the XOC is a 24x7 support resolution center providing critical services monitoring and frontline support with a seamless, single source of user assistance. Effective user support is crucial to users making the most of the XD ecosystem. Users can connect via a toll-free telephone number, a single point of contact email address, or through a web interface to the ticket system in the XSEDE User Portal. The XOC will manage first-line user contacts, resolve common issues, and coordinate problem reporting and resolution for central services as well as SPs. The XOC will provide a helpful response within 24 hours for every ticket submitted.
During PY1-3, the XOC fielded and routed over 35,000 tickets while exceeding targets for XOC response time. Building on this excellent track record, the XOC will coordinate and monitor all tickets routed to non-SP-specific XSEDE ticket categories to ensure timely response and escalate stale tickets. Also, the XOC will work with the XCI SP coordination team and the XD Metrics Service to develop new analytics capabilities to address SP ticket reporting and escalation of stale SP tickets. Area metrics include ticket information best practice compliance by XSEDE staff and SPs (to ensure the minimum required information to route and report on a ticket), time to resolution for XSEDE tickets and SP tickets, and XOC training activities.

C.6.4 Systems Operational Support (SysOps)

Led by L3 manager Gary Rogers (NICS), the SysOps group provides system administration and monitoring for all of the approximately 50 XSEDE centralized services spread across 7 sites. SysOps provides 24x7 monitoring and high availability for critical services for XSEDE, including geographically distributed backup and failover capabilities for enterprise services. This includes support for the XSEDE Central Database (XDCDB), which is critical for user account creation and deletion, usage tracking, the ticket system and help desk, user news and mailing lists, the XSEDE User Portal (XUP), the XSEDE website, and documentation hosting. Other services include identity, data, and job management in support of XSEDE’s data and grid infrastructure. For XSEDE2, SysOps staff will continue to employ server virtualization to control costs without sacrificing high availability. Area metrics include availability of the XDCDB, XUP, and account management services, availability of Nagios monitoring of central services, implementation of two-factor authentication for privilege escalation on servers that provide central services, and usage tracking for the XSEDE single sign-on service.

C.6.5 XSEDE Operations KPIs

XSEDE Operations supports the core services and capabilities that underpin all of XSEDE. To measure effectiveness of its services, XSEDE Operations has defined the following set of KPIs.

KPI | Annual Target | Sub-goal Supported (§C.1.3)
Average composite availability (geometric mean of critical services) | 99% | Provide reliable and secure infrastructure
Availability reduction resulting from security incidents (%) | 0% | Provide reliable and secure infrastructure
Mean time to ticket resolution (hrs.) | < 24 | Provide excellent user support

Table C-5: KPIs for XSEDE Operations.
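
As an illustration of the first KPI in Table C-5, a composite availability can be computed as the geometric mean of the per-service availabilities. The Python sketch below is not part of XSEDE's tooling: the function name and the sample availability figures are hypothetical, and it assumes each critical service reports its availability as a fraction between 0 and 1 over the same reporting period.

```python
import math


def composite_availability(availabilities):
    """Return the geometric mean of per-service availability fractions.

    Each value is assumed to be a fraction in [0, 1] measured over the
    same reporting period (an assumption of this sketch).
    """
    values = list(availabilities)
    if not values:
        raise ValueError("at least one service availability is required")
    if any(v < 0.0 or v > 1.0 for v in values):
        raise ValueError("availabilities must lie between 0 and 1")
    if any(v == 0.0 for v in values):
        # A service that was never available drives the geometric mean to zero.
        return 0.0
    # Average the logarithms, then exponentiate: exp(mean(log(a_i))).
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical availabilities for three central services (illustrative only).
sample = {"XDCDB": 0.999, "XUP": 0.995, "ticket system": 0.980}
composite = composite_availability(sample.values())
print(f"composite availability: {composite:.4f}")
print("meets 99% target:", composite >= 0.99)
```

Because a geometric mean is never larger than the arithmetic mean of the same values, a single poorly performing service lowers the composite more than a simple average would.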

C.7 Resource Allocations Service (RAS): Stewarding the National Investments

RAS, led by L2 director David Hart (NCAR), will build on XSEDE’s current allocation processes and evolve to meet the challenges presented by new types of resources to be allocated via XSEDE, new computing and data modalities to support increasingly diverse research needs, and large-scale demands from the user community for limited XSEDE-allocated resources. RAS will accomplish its objectives through three activities: managing the XSEDE allocations process (§C.7.1) in coordination with the XD service providers, enhancing and maintaining the RAS infrastructure and services, and anticipating changing community needs. The RAS team will collaborate with XSEDE Community Engagement & Enrichment (§C.3), Extended Collaborative Support Service (§C.4), and External Relations (§C.8.2) on targeting outreach to help guide users toward available resources. Supporting the XSEDE allocation and SP activities, RAS will also increase its analytics focus and mine the XSEDE Central Database (XDCDB) to document and project user demand for high-end CI resources. Such efforts will help SPs meet their award deliverables and provide NSF with data it can use to guide the direction of national CI investments. Coordinated within the RAS office of the director and working closely with the XD Metrics Service (providers of XDMoD) and XSEDE Evaluation Team (§C.8.5), this analytics effort will investigate metrics-driven approaches to improving allocations processes and policies (including NSF policies) to better meet user needs and steward national investments in CI. For example, we will conduct analyses leveraging user survey data, XDMoD usage reports, allocation requests and awards, and the XSEDE publications database to understand how users have adapted to the severe resource constraints over the past several years and to identify possible responses from XSEDE, the SPs, or NSF.
C.7.1 XSEDE Allocations Process & Policies

As XSEDE allocation L3 manager, Ken Hackworth (PSC) will oversee the central XSEDE allocation process, a major focus of RAS. The most visible aspect will be support for the quarterly review by the XSEDE Resource Allocation Committee (XRAC) of larger-scale research requests. The XRAC also serves as a key advisory board for RAS and the allocations process. The XSEDE allocation manager will oversee and direct the peer-review process for the largest research proposals (currently nearly 200 submissions per quarter are reviewed by XRAC members) along with the review by XSEDE staff experts of dozens more smaller-scale requests. The RAS team recruits new XRAC members as terms expire, makes review assignments, ensures reviews are completed, manages the logistics of the quarterly meetings, and initiates the resulting awards. The RAS team will support the review and handling of Startup, Education, and smaller-scale allocation requests, which currently represent nearly 200 new and renewal requests per quarter. Furthermore, the RAS team will process requests such as extensions, transfers, and, to a lesser degree, supplements and appeals. It will coordinate the XSEDE resource guidance process, which leverages ECSS staff to help users identify the best resources for their projects prior to allocation request submission.

To ensure that the allocations process meets the needs of XSEDE stakeholders, the RAS team will engage regularly with the SP Forum, SP allocations representatives, the XSEDE User Advisory Committee, the NSF, and the XRAC on improvements or changes needed to the allocation policies. They will review processes to manage the growing volume of requests and to allocate effectively the evolving and diversifying resource portfolio, such as human resources for ECSS projects, non-traditional computational resources (e.g., PSC’s Bridges and Indiana’s Jetstream), and, we anticipate, network bandwidth or quality of service with software-defined networking. The XSEDE allocation manager will implement any agreed-upon changes to the XRAC review and meeting format. As part of these expanding allocations approaches, the XSEDE allocation manager will work closely with other groups in RAS and across XSEDE to ensure that the infrastructure evolves in concert with allocation changes. Area metrics include user satisfaction with the allocations process, allocation requests processed, average time to process Startup requests, and percent of XRAC-recommended SUs allocated.

C.7.2 Allocations CI Enhancement & Maintenance

The RAS team will augment efforts to support the XSEDE Resource Allocation Service (XRAS) allocation management software, which has been greeted with unanimous appreciation by XRAC members at their meetings since August 2014 and is provided as software-as-a-service to the broader community. Given the intense demand for XSEDE-allocated resources, the RAS team will update XSEDE infrastructure components to better support researchers and educators in identifying the appropriate and available services across the ecosystem that can support their objectives, a need identified both by existing XSEDE use cases and by an NSF workshop [21]. RAS efforts encompass improvement, maintenance, and operation of critical services that enable and enhance the allocations process.
Led by L3 manager Amy Schuele (NCSA), the XRAS activities not only will facilitate the allocation of resources but will also enable discovery of and access to other services, such as the resources operated by Level 2 and Level 3 SPs but not allocated via XSEDE—better informing users about the range of advanced digital services available to them. XRAS efforts will follow and leverage the processes and tools, including JIRA for feature tracking, defined by XCI (§C.5). XRAS will be operated as a robust and stable service in support of XSEDE allocations processes and for client organizations. Support efforts will include maintenance updates to XRAS as XSEDE allocations policies evolve. Additional efforts will focus on enhancing the performance and reliability of the XRAS service and enhancing the reporting and metrics capabilities of the system. Because of the central and critical nature of XRAS to XSEDE, and in view of its anticipated value to the broader CI ecosystem, new XRAS features will be prioritized based on the collective inputs and feedback from users, XSEDE, the SPs, and other stakeholders. Finally, RAS will refine the sustainability and cost model for XRAS as a service. Building upon initial collaborations with NCAR and NCSA’s CADENS project (www.ncsa.illinois.edu/enabling/vis/cadens), the team will work with other organizations that have expressed interest in the XRAS technology. The cost of work to support other organizations is not included in the XSEDE2 budget and will be covered by those organizations or through separate grants. The Resource Description Repository (RDR), part of XSEDE’s Infrastructure Discovery Services, will support XRAS and serve as a cornerstone for an enhanced RAS infrastructure. The RDR will serve as the foundation for a “Resource Selector” service that will guide users to CI resources and services that are relevant and available to them for their research needs. 
RAS will also formally define and complete the integration of the XRAS, accounting, and XSEDE User Portal components with the enhanced RDR. The Accounting & Account Management (A&AM) Services provide centralized mechanisms that integrate usage across XSEDE-allocated SPs and support allocation review and management. To accommodate emerging resources with novel usage modalities, RAS will collaborate with XCI to define accounting use cases and implement enhancements to the A&AM environment. For example, this activity will continue to investigate appropriate usage data collection for gateways, other resource types integrated in the RDR, and other XSEDE integrative services. The Account Management Information Exchange (AMIE) messaging service is a critical integration mechanism for the A&AM Services. In collaboration with XCI, RAS will conduct a tradeoff analysis to investigate the impacts of migrating the Accounting Service from the legacy AMIE transport system to a modern, open-source messaging service. RAS, XSEDE Operations (§C.6.4), and XCI (§C.5) will work together to evaluate the current infrastructure and status of the XDCDB and, if necessary, migrate the XDCDB to an even more robust, high-availability and high-performance configuration and platform, with appropriate backup and continuity plans and processes. The XSEDE user publications database (§C.3.2) is essential to the XSEDE allocations process, to understanding XSEDE’s scientific impact, and to downstream XSEDE services such as gateways. This service will evolve to support RAS requirements in collaboration with CEE team efforts. The RAS infrastructure will adapt and evolve as other components of the XSEDE infrastructure evolve, i.e., Identity Management and Infrastructure Discovery Services. Because the XDCDB is a primary data source for XDMoD, RAS will ensure that XDMoD remains integrated with the allocations and accounting infrastructure. Through its interactions with the SP Forum and other stakeholders, RAS will also assess interest in building a community around CI management tools that support XSEDE and RAS integration. The RAS team will engage closely in the sustainability and transition plan from XSEDE2 to any successor program. The XDCDB encompasses a significant portion of the primary data products being generated and collected by XSEDE. These products will include allocation request, review, and award information spanning more than two decades by the end of the XSEDE2 period, more than a decade of system usage records spread over more than 50 resources, and user profile data including a database of tens of thousands of publications acknowledging XSEDE and SP support. Software maintained and enhanced by RAS, including XRAS, will be transitioned to a successor program according to procedures defined by XSEDE2 and the successor awardee(s). Area metrics include user satisfaction with the XRAS system, availability of the XDCDB and XRAS systems, number of XRAS client organizations, operational metrics from the software engineering processes, and percent of approved feature requests implemented on schedule.
C.7.3 Resource Allocations Service KPIs

RAS contributes to all of XSEDE’s high-level goals, and we have identified metrics supportive of many of the sub-goals. These KPIs reflect active efforts by RAS to deliver value to the ecosystem via XRAS, provide excellent user support in the allocations process, and provide innovations matching researchers and educators with the most appropriate resources and services to accomplish their objectives.

KPI | Annual Target | Sub-goal Supported (§C.1.3)
User satisfaction with XRAS system & allocations process | 4 of 5 | Provide excellent user support
Percent of approved allocation and XRAS change requests implemented on schedule | 100% | Effective and productive virtual organization

Table C-6: KPIs for Resource Allocations Service.

C.8 Managing the Program

The XSEDE project is managed by the University of Illinois’ NCSA with key partnerships instantiated via sub-awards. Illinois will ensure that an efficient and effective project governing structure is in place throughout the award period to support all significant project activities and ensure efficient and effective performance of all project responsibilities. In general, the existing governance and management mechanisms, which have served the program well to date, will continue under XSEDE2. The XSEDE2 team has focused on integrating the organizational structure with XSEDE’s strategic goals. By tightly aligning organizational units (by L2 WBS) with strategic goals, the team will simplify accountability and link effort and budget to important outcomes. This approach is consistent with best practices in the management of virtual organizations, where many traditional managerial practices do not apply directly due to the distributed and knowledge-intensive nature of the work [22]. Clearly delineated responsibilities and interfaces reduce uncertainty and enable the autonomy and discretion required for scalability and success [23]. Therefore, XSEDE has adopted an approach strongly reliant on KPIs to maximize accountability for performance in a way that aligns with the project’s goals and objectives. For sustainability, the goal of XSEDE management is to structure the organization to allow for reasonable decoupling of its major components along with plans for transition to any potential future awardees (§C.8.6). XSEDE’s modular organizational structure readies the project for future potential separation of units into independent projects [24]. The XSEDE team will engage in annual strategic planning efforts to assess and realign the goals and associated organizational structures consistent with feedback from reviews and changing contingencies.

In XSEDE2, to more effectively support evolving innovative ideas, we have created the XSEDE Innovation Fund, explicitly for supporting unforeseen projects of significant value to XSEDE2’s mission and goals. The absence of such a fund in XSEDE1 forced painful re-budgeting exercises to accommodate new priorities. We will establish a process based on the well-established stage-gate approach [25]. The stage-gate process is used widely in industry to bring early stage inventions, requirements, and ideas through a series of steps to ensure that promising innovations get the resources they require and are neither abandoned prematurely nor pursued beyond their viability. Currently, a number of practices are being promoted to make stage-gate processes more agile, iterative, and lean. In PY6, the XSEDE2 team will define the approach that aligns with our goals and define the criteria by which we will judge innovative opportunities. We will revisit these criteria annually.

C.8.1 Project Governance

XSEDE governance delegates decision-making authority to the greatest extent possible, allowing for timely decisions and greater agility in response to opportunities. Making use of the Work Breakdown Structure (WBS, §I.2), which aligns with the organizational structure (§I.1), each manager of a WBS area has decision-making authority within the scope, schedule, and budget of that WBS area. Decisions are escalated where other WBS areas or budget changes between partner institutions are involved. XSEDE’s governance model emphasizes documenting activities and decisions and responding to stakeholder needs. XSEDE2 will continue to use its advisory boards and other input mechanisms, including outreach activities, user engagement efforts, and help requests, to assess stakeholder needs, to prioritize and define impact for these requests, and to ensure that they are implemented within the framework of existing XSEDE best practices.
Governance and decision making within XSEDE are made public through the XSEDE Quarterly Reports, and to expose all detailed activity, XSEDE2 will move to using a completely public project wiki. As Project Director (PD), Towns will oversee the management of the project as a whole and will direct activities in the XSEDE Program Office. The Co-PIs and Senior Personnel will direct the activities of the Level 2 WBS areas. The XSEDE PI (Towns) and co-PIs (Gaither, Roskies, and Wilkins-Diehr) hold ultimate authority and responsibility for successful program execution.

C.8.1.1 XSEDE Senior Management Team (SMT)

The XSEDE Senior Management Team (SMT), the highest-level management body, will meet bi-weekly to assess project status, plans, and issues. It will be chaired by the PD (Towns) and will include the directors and deputy directors of CEE (Gaither and Lathrop), ECSS (Roskies and Wilkins-Diehr), XCI (Lifka and Stewart), XSEDE Operations (Peterson and Hazlewood), RAS (Hart and Hackworth), the Program Office (TBD), and the Senior Project Manager (Gendler). To be responsive to the user community and the SPs, the chairs of the User Advisory Committee (UAC) and the XD Service Providers Forum (SPF), both described below, are members of the SMT. The cognizant NSF Program Officer (Eigenmann) will be an ex officio member of the SMT.

C.8.1.2 Advisory Bodies

Stakeholders will have input through three distinct advisory committees that have proved beneficial during XSEDE1 and will provide guidance on strategy, service, and support priorities for the community.

The XSEDE Advisory Board (XAB) meets semi-annually, either in person or by teleconference, to help ensure that XSEDE is designed to impact a broad range of disciplines, enable both research and education, have broader impacts to society, and have a user community that is diverse (gender, ethnic background, etc.) and includes representation from all types of colleges and universities.
The XAB advises in the annual planning process, reviews the annual plans, and recommends strategic directions. While primarily strategic, the XAB may make tactical recommendations that help XSEDE. The XAB consists of five scientific leaders (selected by the XSEDE management team) from different communities who use XSEDE along with the chair of the UAC and three representatives from the SPF (the SPF chair and two others, self-selected by the Forum). These members are complemented by an additional five senior members of the broader community selected by the XSEDE management team. Members serve two-year staggered terms.

The User Advisory Committee (UAC) will meet three times per year by teleconference and consists of 20 active users of XSEDE-allocated resources and services representing the needs and concerns of the community. The committee presents recommendations regarding emerging needs and will review plans and suggested developments. XSEDE2 will seek input from NSF directorates to include researchers representing each NSF directorate or major division. Members serve two-year staggered terms. The chair, selected by UAC members, will participate in SMT meetings and is a member of the XAB. In addition, the chair will serve as User Ombudsperson: the person to whom any user of any XSEDE-allocated or XSEDE-supported resource or service can turn if XSEDE is not addressing their issues.

The XD Service Providers Forum (SPF) meets bi-weekly and provides a means by which all Service Providers can voice concerns, make recommendations, and provide feedback on proposed changes to the XSEDE environment, policies, and services. The chair, elected by SPF members, will serve a one-year term. The SPF is more fully defined in the XD Service Providers Forum Charter [26] and the Requesting Membership in the XSEDE Federation as a Service Provider [27] documents, both available online (www.xsede.org/project-documents).

Based on feedback from staff in XSEDE1, we will also form an Internal Advisory Committee to give advice on internal matters such as professional development, reporting, recognition, policies, etc. This committee will be defined in conjunction with the staff to establish a committee responsive to staff needs.

C.8.2 External Relations (ER)

ER, led by Tricia Barker (NCSA), will promote the resources and services provided by XSEDE and examples of its successful support for science, engineering, and education to internal and external stakeholders. The ER team will communicate upcoming events, project milestones and achievements, science successes, services and resources, etc.
via the XSEDE website and other channels; research and write “science success stories” and media releases; distribute a monthly external newsletter and a monthly internal newsletter; coordinate and staff an XSEDE exhibit at each year’s SC Conference; promote the annual XSEDE Conference and create supporting materials; and grow the XSEDE social media presence. ER metrics include science success stories and announcements appearing in media outlets, newsletter reach (monthly open and click-through rates), unique visitors to the website, and social media engagement (e.g., likes and shares on Facebook, total Twitter followers).

C.8.3 Project Management, Reporting & Risk Management (PM)

The Project Management, Reporting and Risk Management (PM) team members, led by L3 manager Karla Gendler (TACC), have extensive experience applying project management principles to large, complex, distributed projects, including projects in the private sector, government, and XSEDE1. As a focal point for XSEDE2 project management efforts, the PM team will develop and maintain an online Project Execution Plan (PEP) on the staff wiki. The PEP describes the standard operating procedures for the project and is a living document that evolves with the project.

Risk Management. Risk management is incorporated into the project at all WBS levels. The NCSA risk tool, originally developed for the Blue Waters project, will be used to register and monitor risks. Risk reviews will be conducted quarterly; high-risk items and mitigation strategies are included in the PEP.

Project Change Management. As part of the XSEDE annual planning process, the project will define the schedule, milestones, budget, and scope. The PM team will ensure that changes to these baselines will be managed through a change management process.

Project Reporting and Communications. The project will provide NSF with regular updates via teleconference and written quarterly and annual reports.
The PM team will develop a Communication Plan that links all project groups and describes communication methods and frequencies to maximize the effectiveness and efficiency of project communications. Area metrics include total process improvements implemented, the number of days between close of a quarter and delivery of relevant reporting, the percentage of risks reviewed each quarter, and the average number of days to process project change requests.

C.8.4 Business Operations

Working with the Associate Project Director (TBD), the project will fund a full-time person (Lacey Holman, NCSA) to help oversee budgetary issues, manage sub-awards, and assure timely processing of sub-award amendments and invoices. They will work closely with staff in the University of Illinois’ Sponsored Programs Office to facilitate these processes. Area metrics include processing times for the stages of processing sub-award amendments and invoices, and total process improvements implemented.

C.8.5 Strategy, Planning, Policy, Evaluation & Organizational Improvement

Addressing a lesson learned during XSEDE1, XSEDE2 will dedicate effort to project-wide strategic planning, policy development, evaluation and assessment, and organizational improvement. While strategic planning is a responsibility of the project as a whole, and in particular of the PIs and Senior Personnel, we will also engage Nick Berente (University of Georgia), who brings significant expertise with organizational and management issues, and Ian Foster (University of Chicago), who brings significant experience with community needs, to assist in these efforts. XSEDE2 will engage an independent Evaluation Team designed to provide XSEDE2 with information to guide program improvement and assess the impact of XSEDE services. The evaluators will report directly to the Associate Project Director and will attend SMT meetings as requested. Evaluations will be based on five primary data sources: (1) an Annual User Survey that will be part of the XSEDE annual report and program plan; (2) an Enhanced Longitudinal Study encompassing additional target groups (e.g., faculty, institutions, disciplines, etc.) and additional measures (e.g., publications, citations, research funding, promotion and tenure, etc.); (3) an Annual XSEDE Staff Climate Study; (4) XSEDE KPIs, Area Metrics, and Organizational Improvement efforts, including ensuring that procedures are in place to assess these data; and (5) Specialized Studies as contracted by Level 2 directors and the Program Office. The Evaluation Team will create a database to support an Area Metrics/KPI Dashboard and to store the results of any specialized studies.
Area metrics will assist in measuring progress toward the “effective and productive virtual organization” sub-goal by determining whether evaluation data is affecting program planning and decision making to facilitate organizational effectiveness and productivity. Since not all recommendations may be actionable, the metrics assess whether an L2 or L3 area addresses recommendations.

C.8.6 Sustainability Planning

The organizational structure of XSEDE2 makes the organization more modular and goal-oriented, which is expected to aid in the organization’s potential transition to one or more successor organizations. The specific hand-off procedures will be developed in collaboration with any successor(s), once NSF has made any successor awards. The transition plan will ensure continuity of services while minimizing disruption of the community of stakeholders that rely on XSEDE for access to computational resources, services, support, training, and documentation. At minimum, the following critical items will be available for transfer to the successor organization(s):

• resource and service usage accounting, accounts infrastructure, and data
• operations infrastructure and data, including ticket database and central database
• policies, including security and access controls
• portal and documentation information, including training materials
• a cadre of ECSS experts that would be valuable to whatever follows XSEDE, along with the management approach, with processes and templates available from the wiki

Once developed, the transition plan will be recorded in the XSEDE document repository.

C.8.7 XSEDE Program Office KPIs

KPIs for the Program Office focus on organizational excellence and improvement in order to continuously improve the organizational capabilities of XSEDE2.

KPI | Annual Target | Sub-goal Supported (§C.1.3)
Social Media engagement (# of impressions) | 190,000 | Raise awareness
Public Relations (# of media hits) | 140 | Raise awareness
Percentage of L2 areas implementing climate study and evaluation recommendations | 90% per year | Effective and productive virtual organization
Number of strategic or innovative improvements | 5 | Innovative virtual organization
Ratio of proactive to reactive improvements | 9 | Innovative virtual organization

Table C-7: KPIs for the Program Office.

C.9 Broader Impacts of the Proposed Work

XSEDE will remain a crucial factor in broadening the practice of U.S. science and engineering research through cyberinfrastructure by providing access to resources and services to researchers and educators nationwide. XSEDE has conducted and will continue to conduct numerous outreach activities, including organizing and contributing to conferences, workshops, training events, and student programs, and pulling together offerings from many NSF-funded activities. As the conduct and culture of scientific endeavors become increasingly collaborative and demanding of a greater diversity of capabilities, XSEDE is there to lead, advise, and contribute to this transformation. Through the NIP program, XSEDE will reach out to disciplines such as literature, history, film studies, and machine learning, which are beginning to exploit advanced computing capabilities. XSEDE will work with applications teams from the earth and environmental sciences, astronomy, materials science, biology, and other communities in concert with a broader range of providers of resources, services, and tools. XSEDE will adapt best practices for incorporating digital services into professional and curriculum development programs to prepare a larger and more diverse community of science, technology, engineering, mathematics, arts, humanities, and social science researchers, educators, and practitioners who will advance the frontiers of science in new ways that will benefit the nation far beyond the life of XSEDE. XSEDE constituencies will develop and diversify the U.S. workforce in collaboration with educational institutions across the nation by establishing certificate programs to prepare researchers, educators, and practitioners for the effective use of advanced digital technologies, with programs designed to engage early career researchers, and in particular under-represented groups, to expand this workforce and improve digital science literacy.
Through collaborations with external organizations, campuses, HPC centers, and groups including Software Carpentry and Data Carpentry, we will be able to prepare a large and diverse community able to transform scientific discovery in ways we cannot predict. The XSEDE team is committed to fostering full participation of women, persons with disabilities, and underrepresented minorities. The project will benefit from the leadership of several accomplished women in STEM fields, including Gaither and Wilkins-Diehr (co-PIs); Linda Akli, Trish Barker, Dana Brunson, Maytal Dahan, Karla Gendler, and Amy Schuele (all Level 3 managers). The project will also benefit from leadership by people of color, including Linda Akli and Kevin Franklin (one of HPCwire’s People to Watch 2010). The XSEDE team recognizes that a key technical barrier to the use of advanced digital services by persons with disabilities (such as visual or interaction impairment) is user interface design that is not accessible. Team members investigating and developing front-end and online XSEDE components will incorporate accessibility considerations in design and development activities. XSEDE strives to provide access to researchers, educators, and students at all academic institutions, from traditional PhD-granting institutions and two- and four-year colleges and universities, to minority-serving institutions and institutions in EPSCoR jurisdictions. The Campus Engagement effort proactively seeks membership among these diverse academic institutions. XSEDE2 will create an organization and expose an infrastructure of diverse resources and services to be made available in support of science and will make a long-term impact by broadening the use of cyberinfrastructure in the practice of U.S. science and engineering research.

C.10 The XSEDE2 Team

The XSEDE2 PIs represent four of the most successful academic high-performance computing centers in the world: NCSA, PSC, SDSC, and TACC.
These centers account for the majority of staff in XSEDE and operate the majority of SP resources. With a long history of collaboration, the four co-lead institutions have built the relationships and trust that underpin a cohesive team. The senior management team further encompasses three additional leading computing centers: Cornell University, NCAR, and NICS. The full complement of XSEDE staff encompasses carefully chosen leading individuals and institutions in CI research, development, education, project management, evaluation, and systems engineering. As with any great team, the whole is significantly greater than the sum of its parts.

C.10.1 XSEDE2 Leadership

Experienced, effective leadership is needed to ensure the collective talents of a distributed team are organized to execute complex plans and to change the plans as needed based on user requirements, technology evolution, and scientific opportunities. The XSEDE2 team is led by seven of the most experienced CI leaders in the world. Two team members with significant leadership experience are new
to senior management in XSEDE2. Each leader has responsibility for significant activities spanning operations, support, education, and R&D within XSEDE2. They have substantial budgetary commitments to XSEDE and will be spending upwards of 50% of their time leading XSEDE2.

John Towns is the PI and Project Director for XSEDE and the Executive Director for Science and Technology at the National Center for Supercomputing Applications (NCSA) at the University of Illinois. Towns is also the director for the Illinois Campus Cluster Program and plays a significant role in the deployment and operation of high-end resources and services and distributed computing projects. He has gained a broad view of the computational needs of researchers through his key role in the policy development and implementation of the resource allocations processes of XSEDE and preceding NSF-funded programs. His background is in computational astrophysics, making use of a variety of computational architectures with a focus on application performance analysis. At NCSA, he provides leadership and direction in the support of an array of computational science and engineering research projects that depend upon advanced computing resources and services.

Kelly Gaither, co-PI, will direct Community Engagement & Enrichment. She is the director of Visualization, the interim director of Education and Outreach, and a Senior Research Scientist at the Texas Advanced Computing Center (TACC). Dr. Gaither received her doctoral degree in Computational Engineering from Mississippi State University in May 2000. Gaither has refereed publications in computational mechanics, supercomputing applications, and visualization. Over the past 11 years, she has participated in the IEEE Visualization conference and served as the conference general chair in 2004. She is the co-chair of the 5th IEEE Symposium on Large Data Analysis and Visualization.

Ralph Roskies, co-PI, co-directs the Extended Collaborative Support Service.
He is a professor of physics at the University of Pittsburgh as well as a founder and co-scientific director of the Pittsburgh Supercomputing Center (PSC). Dr. Roskies is the author of over 60 papers in theoretical elementary particle physics. Roskies oversees operations, plans PSC’s future course, and concerns himself with its scientific impact. He has served as advisor to, and as reviewer of, a large number of U.S. and international supercomputing centers. Roskies' pivotal role in developing and implementing the NSF allocation process has given him a very broad overview of leading computational science and close ties to its most prominent practitioners.

Nancy Wilkins-Diehr, co-PI, co-directs the Extended Collaborative Support Service. Wilkins-Diehr is associate director at the San Diego Supercomputer Center (SDSC), a co-PI on the Comet NSF Track 2 award and the CyberGIS project, and PI on the Science Gateways Institute S2I2 conceptualization award, a part of NSF’s SI2 software program. She led the science gateways program in the TeraGrid from its inception in 2004 through 2011. She has been at SDSC since 1993 and served in a number of roles, including project manager for the $35M NPACI program and associate director of Scientific Computing. She participates on a variety of national and international advisory committees on cyberinfrastructure.

David Hart is Senior Personnel for XSEDE2 and directs the Resource Allocation Service (RAS). He has contributed to XSEDE’s ongoing efforts on reporting, metrics, and allocations, including the coordination of the XRAS development effort. Hart is the User Services Manager in the Computational and Information Systems Laboratory (CISL) at the National Center for Atmospheric Research (NCAR) and leads NCAR’s activities in allocations and accounting. Hart has more than a dozen papers on HPC metrics, usage patterns, allocations, and cyberinfrastructure services.
Prior to joining NCAR, Hart spent 15 years at SDSC, participating in the TeraGrid program as allocations coordinator and later as Area Director for User-Facing Projects and Core Services.

David Lifka is Senior Personnel for XSEDE2 and directs the XSEDE Community Infrastructure effort. He is director of the Center for Advanced Computing and Associate CIO for Cornell University. Dr. Lifka is an HPC industry veteran with over 20 years of experience in management and technology leadership positions at Cornell and Argonne National Laboratory. He is an active member of the Coalition for Academic Scientific Computation, serving as chair in 2013 and 2014. His areas of expertise include sustainable models for academic research computing facilities, parallel job scheduling and resource management systems, data management, high-throughput systems, Web services, and cloud computing. Lifka has a Ph.D. in Computer Science from the Illinois Institute of Technology. His scheduling technologies have been commercially licensed and he has received a ComputerWorld/Smithsonian award for innovations in IT.

Gregory D. Peterson is Senior Personnel and director of XSEDE2 Operations. Dr. Peterson is director of the National Institute for Computational Sciences (NICS) and professor of electrical engineering and computer science at the University of Tennessee. He received his BS, MS, and DSc in Electrical Engineering and BS and MS in Computer Science from Washington University in St. Louis. Dr. Peterson has performed research in industry, government, and academia, exploring emerging architectures for computational science and engineering. He has been involved in the design, deployment, operations, and use of parallel computing systems for over 25 years. He has over 150 refereed publications in areas of computer engineering and computational science and engineering. He is general chair for the XSEDE 2015 conference.

This leadership team is complemented by staff from their institutions and our partner institutions: Indiana University, Oklahoma State, Ohio Supercomputer Center, Purdue, Shodor, SURA, University of Arkansas, University of Georgia, University of Oklahoma, University of Chicago/Argonne National Laboratory, and University of Southern California.

C.10.2 Prior Support

All members of the XSEDE2 Senior Management Team have prior support from NSF ACI-1053575, “XSEDE: eXtreme Science and Engineering Discovery Environment.”

Intellectual merit: To date, the XSEDE project has created and operated the most advanced digital cyberinfrastructure in the world, supported with an expert and experienced team of CI professionals, which has enabled researchers across the nation to conduct transformational research efforts in science, engineering, and the humanities.

Broader impact: XSEDE has carried out a Training, Education, and Outreach program to raise the competency of the present and future scientific community, reaching thousands of students and faculty at hundreds of institutions.
Publications resulting from the XSEDE project to date include more than 14,000 publications from users supported by XSEDE and dozens of staff publications (listed in XSEDE’s quarterly reports), including [27]. XSEDE reports are available from the XSEDE website (www.xsede.org/web/guest/project-documents-archive), and more recent user and staff publications are available from the user portal (portal.xsede.org/publications#/). The XSEDE2 proposal represents a continuation of many efforts conducted during the ACI-1053575 award period, modified to address NSF requirements, reviewer recommendations, community and advisory feedback, and lessons learned.

C.11 Conclusion

For the past four years, XSEDE has delivered an advanced digital services infrastructure, distributed across multiple partner institutions and hundreds of participating campuses, and in doing so has established a long-term platform that empowers modern science and engineering research and education. Driven by the needs of the open research community, XSEDE has enhanced the productivity of a growing community of scholars, researchers, and engineers; federated with other high-end facilities as well as campus-based resources; and served as the foundation for a national e-science infrastructure. For the next five years, the XSEDE project proposes an ambitious plan to accelerate the integration of community-developed tools and services into the national ecosystem. To accomplish this objective and fulfill its mission, XSEDE will reorganize itself into goal-driven focus areas that will deliver new capabilities for users and provide a more agile and responsive organization. While making common as well as complex tasks easier, more reliable, and more productive, XSEDE will advance and sustain an environment that facilitates scientific discovery, extends the use of the ecosystem through innovative educational programs, and deepens the abilities of users to produce transformational science and engineering results.

C.12 References Addressing the “Expectations of a renewal proposal”

As requested, we provide here a reference to sections of the proposal addressing the “Expectations of a renewal proposal” as delineated in the solicitation.

Successor Project Component | Section
Establish a Governance Model | Project Governance (§C.8.1); XSEDE2 Organization and Budget (Appendix I); List of Partners and Roles (Appendix II)
Operate Primary Components | Community Engagement and Enrichment (§C.3); Extended Collaborative Support Service (§C.4); XSEDE Community Infrastructure (§C.5); XSEDE Operations (§C.6); Resource Allocations Service (§C.7); Managing the Program (§C.8)
Elevate Resource Allocation Service | Resource Allocations Service (§C.7)
Increase Project Scalability | Managing the Program (§C.8); Business Operations (§C.8.4)
Obtain Institutional Commitment | XSEDE2 Leadership (§C.10.1); List of Partners and Roles (Appendix II); Letters of Collaboration (Appendix IV)

D. References Cited

[1] National Science Foundation. Investing in Science, Engineering, and Education for the Nation's Future. NSF 14-043, NSF, March 2014.
[2] National Science Foundation. "Cyberinfrastructure Vision for 21st Century Discovery." March 2007.
[3] National Science Foundation. "Advanced Computing Infrastructure: Vision and Strategic Plan." NSF 12-051, February 2012.
[4] Apon, A.; Ahalt, S.; Dantuluri, V.; Gurdgiev, C.; Limayem, M.; Ngo, L. & Stealey, M. "High performance computing instrumentation and research productivity in US universities." Journal of Information Technology Impact 10, no. 2 (2010): 87-98.
[5] Apon, Amy W, Linh B Ngo, Michael E Payne, and Paul W Wilson. "Assessing the effect of high performance computing capabilities on academic research output." Empirical Economics (Springer) 48, no. 1 (2014): 283-312.
[6] Joseph, E.C., C. Dekate, and S. Conway. "Real-World Examples of Supercomputers Used For Economic and Societal Benefits: A Prelude to What the Exascale Era Can Provide (Special Study)." International Data Corporation (IDC). 2014. https://www.cac.cornell.edu/about/pubs/IDCReportRealWorldExamplesOfBenefitsOfSupercomputers.pdf.
[7] Tabor Griffin Communications. "High-Performance Computing Contributions to Society." 1998. http://www.tgc.com/hpcbook.
[8] Stewart, C.A., R. Roskies, R. Knepper, R.L. Moore, J. Whitt, and T.M. Cockerill. "XSEDE Value Added, Cost Avoidance, and Return on Investment." To appear in Proceedings of XSEDE15 Conference. St. Louis, Mo: IEEE, 2015.
[9] Wang, Fugang, Gregor von Laszewski, Geoffrey C. Fox, Thomas R. Furlani, Robert L. DeLeon, and Steven M. Gallo. "Towards a Scientific Impact Measuring Framework for Large Computing Facilities - a Case Study on XSEDE." Proceedings of the 2014 Annual Conference on Extreme Science and Engineering Discovery Environment - XSEDE14 (ACM Press), 2014.
[10] O'Hara, Maureen, Chen Yao, and Mao Ye. "What’s not there: The odd-lot bias in TAQ data." Johnson School Research Paper Series, no. 16-2012 (2012).
[11] Gai, Jiading, Dong Ju Choi, David O'Neal, Mao Ye, and Robert S. Sinkovits. "Fast construction of nanosecond level snapshots of financial markets." Concurrency and Computation: Practice and Experience (Wiley-Blackwell) 26, no. 13 (Feb 2014): 2149-2156.
[12] Payne, Christina M.; Resch, M. G.; Chen, L.; Crowley, M. F.; Himmel, M. E.; Taylor, L. E.; Sandgren, M.; Ståhlberg, J.; Stals, I.; Tan, Z. "Glycosylated linkers in multimodular lignocellulose-degrading enzymes dynamically bind to cellulose." Proceedings of the National Academy of Sciences (National Acad Sciences) 110, no. 36 (2013): 14646-14651.
[13] MacManes, Matthew D., and Eileen A. Lacey. "The Social Brain: Transcriptome Assembly and Characterization of the Hippocampus from a Social Subterranean Rodent, the Colonial Tuco-Tuco (Ctenomys sociabilis)." Edited by Gonzalo G. de Polavieja. PLoS ONE (Public Library of Science (PLoS)) 7, no. 9 (Sep 2012): e45524.
[14] Jockers, Matthew L. Macroanalysis: Digital methods and literary history. University of Illinois Press, 2013.
[15] Manning, Patrick. Big data in history. Palgrave Macmillan, 2013.
[16] —. "Authoring a Science Gateway Cookbook." Cluster Computing (CLUSTER), 2013 IEEE International Conference on. 2013. 1-3.
[17] —. "Using XDMoD to facilitate XSEDE operations, planning and analysis." Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery. San Diego, California: ACM, 2013. 46.
[18] Pearson, Doug, Minaxi Gupta, Wes Young, and Gabriel Iovino. "Security Event System Version 3 Federated Security Intelligence." Research and Education Networking Information Sharing and Analysis Center. http://www.ren-isac.net/docs/NSF_SDCI_Sec_SESv3_PD_wo.pdf (accessed June 13, 2015).
[19] Dart, Eli, Lauren Rotman, Brian Tierney, Mary Hester, and Jason Zurawski. "The science dmz: A network design pattern for data-intensive science." Scientific Programming (Hindawi Publishing Corporation) 22, no. 2 (2014): 173-185.
[20] Hazlewood, V., R. Ricci, and M. Berman. "Workshop Report: Workshop on the Development of a Next-Generation Cyberinfrastructure." October 2014. http://www.nics.utk.edu/~victor/nextgencipapers/NextGenCIWorkshop-Report-Final.pdf.
[21] Desanctis, Gerardine, and Peter Monge. "Communication processes for virtual organizations." Journal of Computer-Mediated Communication (Wiley Online Library) 3, no. 4 (1998): 0-0.
[22] Fiol, C Marlene, and Edward J O'Connor. "Identification in face-to-face, hybrid, and pure virtual teams: Untangling the contradictions." Organization Science (INFORMS) 16, no. 1 (2005): 19-32.
[23] Garud, Raghu, Arun Kumaraswamy, and Richard Langlois. Managing in the modular age: architectures, networks, and organizations. John Wiley & Sons, 2009.
[24] Sommer, Anita Friis, Christian Hedegaard, Iskra Dukovska-Popovska, and Kenn Steger-Jensen. "Improved Product Development Performance through Agile/Stage-Gate Hybrids: The Next-Generation Stage-Gate Process?" Research-Technology Management (Industrial Research Institute, Inc) 58, no. 1 (2015): 34-45.
[25] XSEDE. "Service Provider Forum: Charter, Membership, and Governance." XSEDE. http://hdl.handle.net/2142/49980.
[26] XSEDE. "Requesting Membership in the XSEDE Federation as a Service Provider." XSEDE.org. July 10, 2014. https://www.ideals.illinois.edu/handle/2142/49981.
[27] Towns, John, Timothy Cockerill, Maytal Dahan, Ian Foster, Kelly Gaither, Andrew Grimshaw, Victor Hazlewood, Scott Lathrop, Dave Lifka, Gregory D. Peterson, Ralph Roskies, J. Ray Scott, and Nancy Wilkins-Diehr. "XSEDE: Accelerating Scientific Discovery." Computing in Science & Engineering 16, no. 5 (2014): 62-74.

Appendix I XSEDE 2.0 Organization and Budget

I.1 XSEDE2 Organizational Structure

The XSEDE2 organizational structure (Figure I-1) is aligned with XSEDE2’s strategic goals. By tightly coupling organizational units (correlated with L2 areas) to strategic goals (Figure I-2), accountability is strengthened and effort and budget are clearly linked to important outcomes for the project. This approach to organizational structure is consistent with best practices in the management of virtual organizations, where principles of modularity are particularly important. According to research on virtual organizations, due to the distributed and knowledge-intensive nature of virtual work, many traditional managerial practices don’t apply directly [22]. Clearly delineated responsibilities and interfaces reduce uncertainty and enable the autonomy and discretion required in virtual knowledge-work [23]. This also implies that well-specified outcome controls at the interfaces of units need to be put in place. Therefore, XSEDE has adopted an approach strongly reliant on key performance indicators at the interface of units to maximize accountability and incentives for performance in a way that aligns with the project’s goals and objectives. This organizational structure supports coordination and alignment of the work of 207 individuals expending 111.9 FTE of effort and participating from 18 institutions.

Figure I-1: XSEDE2 Organizational Chart.

Figure I-2: Mapping of XSEDE2 Strategic Goals to Organizational Structure.

I.2 XSEDE2 Work Breakdown Structure

XSEDE2 governance (§C.8.1) delegates decision-making authority to the greatest extent possible, allowing for timely decisions and greater agility in response to opportunities. Making use of the XSEDE Work Breakdown Structure (WBS, Figure I-3), which aligns with the organizational structure (Figure I-1), each manager of an L3 WBS element has decision-making authority within the scope, schedule, and budget of that WBS area. Decisions are escalated where other WBS areas or budget changes between partner institutions are involved.

Figure I-3: XSEDE2 Work Breakdown Structure (WBS).

I.3 XSEDE2 Budget Detail

Here we provide two sets of information: detailed budget information and the budget distribution across the strategic goals. In Table I-1, we provide the detailed five-year budget down to the WBS Level 3 groups.

Table I-1: XSEDE2 Detailed Five Year Budget to WBS Level 3 (in thousands of dollars).

WBS | PY6 | PY7 | PY8 | PY9 | PY10 | Total
2 Innovation Fund | $389 | $815 | $815 | $815 | $815 | $3,636
2 PY6 Hardware refresh | $439 | $- | $- | $- | $- | $439
2.1 Community Engagement & Enrichment | $3,697 | $3,749 | $3,847 | $3,947 | $4,050 | $19,290
2.1.1 Community Engagement, Office of the Director | $337 | $346 | $356 | $366 | $376 | $1,781
2.1.2 Workforce Development | $1,325 | $1,359 | $1,393 | $1,429 | $1,465 | $6,972
2.1.3 User Engagement | $407 | $374 | $384 | $394 | $405 | $1,964
2.1.4 Broadening Participation | $281 | $289 | $297 | $305 | $314 | $1,487
2.1.5 User Interfaces and Online Information | $837 | $860 | $883 | $908 | $933 | $4,420
2.1.6 Campus Engagement | $510 | $521 | $533 | $545 | $557 | $2,666
2.2 Extended Collaborative Support Service (ECSS) | $6,793 | $6,977 | $7,166 | $7,360 | $7,559 | $35,854
2.2.1 ECSS, Office of the Directors | $568 | $583 | $599 | $616 | $633 | $2,999
2.2.2 Extended Support for Research Teams | $2,263 | $2,325 | $2,389 | $2,455 | $2,522 | $11,953
2.2.3 Novel and Innovative Projects | $1,053 | $1,082 | $1,111 | $1,141 | $1,172 | $5,559
2.2.4 Extended Support for Community Codes | $1,029 | $1,057 | $1,086 | $1,116 | $1,147 | $5,435
2.2.5 Extended Support for Science Gateways | $945 | $971 | $998 | $1,025 | $1,053 | $4,992
2.2.6 Extended Support for Education, Outreach and Training | $936 | $959 | $983 | $1,007 | $1,032 | $4,916
2.3 XSEDE Community Infrastructure (XCI) | $3,632 | $3,732 | $3,834 | $3,940 | $4,048 | $19,185
2.3.1 XSEDE Community Infrastructure, Office of the Director | $1,123 | $1,154 | $1,186 | $1,218 | $1,252 | $5,932
2.3.2 Requirements Analysis and Capability Delivery | $899 | $923 | $949 | $975 | $1,002 | $4,747
2.3.3 Capabilities Evaluation & Testing | $804 | $826 | $849 | $873 | $897 | $4,249
2.3.4 XSEDE Capability and Resource Integration | $806 | $828 | $851 | $874 | $898 | $4,257
2.4 XSEDE Operations | $3,405 | $3,492 | $3,586 | $3,683 | $3,783 | $17,948
2.4.1 XSEDE Operations, Office of the Director | $438 | $445 | $457 | $469 | $482 | $2,291
2.4.2 Security | $883 | $907 | $932 | $958 | $984 | $4,663
2.4.3 Data Transfer Services | $809 | $831 | $854 | $877 | $901 | $4,271
2.4.4 XSEDE Operations Center (w/ Consulting Coordination) | $560 | $576 | $592 | $608 | $625 | $2,961
2.4.5 Systems Operational Support | $715 | $733 | $752 | $771 | $791 | $3,762
2.5 Resource Allocations Service (RAS) | $1,839 | $1,884 | $1,932 | $1,980 | $2,030 | $9,664
2.5.1 RAS, Office of the Director | $238 | $245 | $252 | $259 | $266 | $1,260
2.5.2 XSEDE Allocations Process and Policies | $510 | $519 | $529 | $539 | $549 | $2,647
2.5.3 Allocations CI Maintenance and Development | $1,090 | $1,120 | $1,151 | $1,182 | $1,215 | $5,758
2.6 Program Office | $2,651 | $2,721 | $2,792 | $2,865 | $2,940 | $13,970
2.6.1 Project Office | $616 | $632 | $650 | $668 | $686 | $3,252
2.6.2 External Relations | $578 | $593 | $607 | $623 | $638 | $3,039
2.6.3 Project Management, Reporting & Risk Management | $702 | $741 | $762 | $783 | $15 | $3,709
2.6.4 Business Operations (Contracts, Budget, Finance, Accounting) | $196 | $213 | $219 | $- | $- | $1,037
2.6.5 Strategic Planning, Policy, Evaluation and Organizational Improvement | $559 | $600 | $614 | $60 | $- | $2,933
Total | $22,845 | $23,369 | $23,971 | $24,590 | $25,225 | $120,000
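
As a quick consistency check on Table I-1, the following minimal sketch sums the PY6 column across the top-level WBS rows and compares the result with the PY6 total reported in the last row of the table. The variable names and dictionary labels are illustrative assumptions and are not part of the proposal; the amounts are copied from the table, in thousands of dollars.

```python
# PY6 amounts (in thousands of dollars) for the top-level rows of Table I-1.
# The dictionary name and its keys are illustrative labels only.
py6_by_area = {
    "Innovation Fund": 389,
    "PY6 Hardware refresh": 439,
    "2.1 Community Engagement & Enrichment": 3697,
    "2.2 Extended Collaborative Support Service": 6793,
    "2.3 XSEDE Community Infrastructure": 3632,
    "2.4 XSEDE Operations": 3405,
    "2.5 Resource Allocations Service": 1839,
    "2.6 Program Office": 2651,
}

# PY6 entry of the final (total) row of Table I-1.
reported_py6_total = 22845

# Sum the top-level rows and compare with the reported column total.
computed_py6_total = sum(py6_by_area.values())
difference = computed_py6_total - reported_py6_total
print(f"computed: {computed_py6_total}, reported: {reported_py6_total}, difference: {difference}")
```

Because the table entries are rounded to the nearest thousand dollars, the same check applied to other columns may differ from the reported totals by a small rounding amount.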

As noted previously, we have aligned the organizational units of XSEDE2 with the strategic goals of the program. This alignment also allows us to easily express the level of investment XSEDE is making to further each of the strategic goals and is shown in Figure I-4.

Figure I-4: Allocation of XSEDE2's Total Budget to Each of the Strategic Goals.

Appendix II List of Partners and Roles

Institution Role Level of Funding Cornell Provide Senior Personnel and L2 XSEDE Community Infrastructure $3,773,517 leadership, overseeing its four L3 areas. Provide Training leadership and staff for campus community software repository efforts. Indiana Provide deputy L2 XSEDE Community Infrastructure leadership. Provide $6,434,022 staff for Requirements Analysis and Capability Delivery, Capabilities Evaluation and Testing, XSEDE Capability and Resource Integration (including L3 Manager), Operations, Systems Operational Support, Extended Collaborative Support Service, Novel and Innovative Projects, Extended Support for Science Gateways (including L3 Manager), and Extended Support for Education, Outreach, and Training, Program Office, Strategic Planning, Policy, Evaluation and Organizational Improvement. PSC Provide co-PI and co-L2 Extended Collaborative Support Services $17,820,435 leadership overseeing its six L3 areas. Provide L3 leadership for Capabilities Evaluation, Novel and Innovative Projects, Allocations Process and Policies and co-leadership for Cybersecurity. Provide staff for Workforce Development, User Engagement, User Interfaces and Online Information, Extended Support for Research Teams, Novel and Innovative Projects, Extended Support for Community Codes, Extended Support for Science Gateways, Extended Support for Education, Outreach, and Training, XSEDE Community Infrastructure, Requirements Analysis and Capability Delivery, Capabilities Evaluation and Testing, Cybersecurity, Data Transfer Services, Systems Operational Support, Resource Allocation Services, External Relations, and Project Management, Reporting, and Risk Management. Illinois/NCSA Provide project administration and PI, L2, and L3 leadership. 
Provide staff $29,530,289 for Community Engagement & Enrichment, User Engagement, Extended Collaborative Support Services (including L3 Manager), Extended Support for Research teams, Extended Support for Community Codes, Extended Support for Science Gateways, Extended Support for Education, Outreach, and Training, XSEDE Community Infrastructure, Requirements Analysis and Capability Delivery, Capabilities Evaluation and Testing, XSEDE Operations, Cybersecurity (including L3 Manager co-lead), Data Transfer Services (including L3 Manager), XSEDE Operations Center (Consulting Coordination; including L3 Manager), Systems Operational Support, Resource Allocations Service (including L3 Manager), Allocations CI Maintenance and Development (including L3 Manger), Program Office (including L2 Director), Project Office (including L3 Manager), External Relations (including L3 Manager), Project Management, Reporting, and Risk Management, Business Operations (Contracts, Budget, Finance, Accounting; including L3 Manager), and Strategic Planning, Policy, Evaluation and Organizational Improvement (including L3 Manager).

1

Institution Role Level of Funding UTK/NICS Provide project Senior Personnel and Operations L2 leadership (L2 $9,758,387 Director and Deputy,) overseeing its five L3 areas. Provide staff for Community Engagement and Enrichment, User Engagement, Campus Engagement, Extended Collaborative Support Services, Extended Support for Research Teams (including L3 Manager), Extended Support for Community Codes, Extended Support for Education, Outreach, and Training, XSEDE Community Infrastructure, XSEDE Capability and Resource Integration, XSEDE Operations, Cybersecurity, Data Transfer Services, Systems and Operational Support (including L3 Manager), Project Office, Project and Operations Project Management, Reporting and Risk Management. Oklahoma State Provide L3 co-leadership for Campus Engagement in Community $853,528 Engagement and Enrichment, including Champion activities. OSC Provide staff for Community Engagement and Enrichment and assist $382,993 faculty with the development of new curricula that integrate computational science and/or create certificate and degree programs. Purdue Provide deputy L3 leadership for Campus Engagement and staff for $1,918,136 Community Engagement and Enrichment, Campus Engagement, Extended Collaborative Support Services, Extended Support for Community Codes, and Extended Support for Science Gateways. Shodor Provide deputy L2 and L3 leadership to expand and develop Education $2,804,550 and provide staff for Community Engagement and Enrichment, Workforce Development, Novel and Innovative Projects, and Extended Support for Training, Education, and Outreach. SURA Provide L3 leadership for Community Engagement and Enrichment $1,215,015 Broadening Participation to engage people from Minority Serving Institutions, programs for Under-Represented Minority researchers, and domains that are not traditional users of HPC. TACC Provide co-PI and L2 Community Engagement and Enrichment leadership, $17,819,029 overseeing its six areas. 
Provide staff for Community Engagement, Workforce Development, User Engagement (including L3 Manager), Broadening Participation, User Interfaces and Online Services (including L3 Manager), Extended Collaborative Support Services, Novel and Innovative Projects, Extended Support for Community Codes (including L3 Manager), Extended Support for Science Gateways, Extended Support for Education, Outreach, and Training, Requirements Analysis and Capability, Cybersecurity, Data Transfer Services, Systems and Operational Support, Allocations CI Maintenance and Development, External Relations, Project Management, Reporting, and Risk Management. UCAR/NCAR Provide Senior Personnel and L2 Resource Allocation Service leadership, $2,501,151 overseeing its three L3 areas. Provide staff for Extended Collaborative Support Services, Extended Support for Research Teams, Extended Support for Community Codes, Resource Allocations Service, and Allocations CI Maintenance and Development.


UCSD/SDSC | Provide co-PI and L2 Extended Collaborative Support Services co-leadership, overseeing its six areas. Provide staff for Workforce Development, User Engagement, User Interfaces and Online Information, Extended Support for Research Teams, Novel and Innovative Projects, Extended Support for Community Codes, Extended Support for Science Gateways, Extended Support for Education, Outreach, and Training, Requirements Analysis and Capability Delivery, Capabilities Evaluation and Testing, Cybersecurity, Data Transfer Services, Systems Operational Support, Resource Allocation Services, External Relations, and Project Management, Reporting, and Risk Management. | $15,763,833
University of Arkansas | Provide staff for Campus Engagement in Community Engagement and Enrichment. | $330,204
University of Georgia | Provide staff for Program Office, Project Office, and Strategic Planning, Policy, Evaluation and Organizational Improvement. | $353,568
University of Oklahoma | Provide L3 co-leadership for Campus Engagement in Community Engagement and Enrichment and foster emerging regional, national, and international interactions. | $345,782
University of Chicago/Argonne National Laboratory | Provide staff for Extended Collaborative Support Services, Extended Support for Science Gateways, XSEDE Community Infrastructure, Requirements Analysis and Capability Delivery (including L3 Manager), and Strategic Planning, Policy, Evaluation and Organizational Improvement. | $7,986,613
University of Southern California-Information Sciences Institute | Provide staff for Extended Collaborative Support Services through Extended Support for Science Gateways. | $422,613


Appendix III Conflict of Interests List

COI Last Name | COI First Name | COI Institution | PI/SP Last Name | PI/SP First Name
Ahern | Sean | Computational Engineering International | Peterson | Greg
Apon | Amy | Clemson University | Peterson | Greg
Avery | Paul | University of Florida | Roskies | Ralph
Baugher | Kirk | Honeywell | Peterson | Greg
Belgin | Mehmet | GaTech | Peterson | Greg
Bethel | Wes | Lawrence Berkeley Labs | Gaither | Kelly
Bhaskaran | Venkatesh | Intel | Peterson | Greg
Bi | Yu | Qualcomm | Peterson | Greg
Blatecky | Alan | RTI International | Towns | John
Bleile | Ryan | University of Oregon | Gaither | Kelly
Blum | Jennifer | VHA Home Healthcare | Towns | John
Boisseau | John R | Vizias | Towns | John
Boisseau | John R | Vizias | Gaither | Kelly
Borgman | Christine | UCLA | Towns | John
Bottom | James | Clemson University | Peterson | Greg
Bottom | James | Clemson University | Towns | John
Briley | Roger | UT Chattanooga | Gaither | Kelly
Brown | Maxine | University of Illinois-Chicago | Towns | John
Campbell | H E A | University of New Brunswick | Towns | John
Chamberlain | Roger | Washington University | Peterson | Greg
Cheatham | Tom | Utah | Towns | John
Childs | Hank | UC Davis/LBL | Gaither | Kelly
Chun | CS | Current Affiliation Unknown | Wilkins-Diehr | Nancy
Coady | Siobhan | Novocom Inc. | Towns | John
Collins | Will | Ametek | Peterson | Greg
de Waard | Anita | Elsevier Publishing | Towns | John
Donzis | Diego | Texas A&M University | Gaither | Kelly
Dorai | Mahesh | Cavium Networks | Peterson | Greg
Dou | Yong | National University of Defense Technology (China) | Peterson | Greg
Du | Hongtau | Moog Crossbow | Peterson | Greg
Duan | Yong | UC Davis | Roskies | Ralph
Duncan | Anthony | University of Pittsburgh | Roskies | Ralph
Dunning | Thomas | University of Washington | Towns | John
Emeneker | Wesley | GaTech | Peterson | Greg
Ertl | Thomas | University of Stuttgart | Gaither | Kelly
Ezell | Matt | ORNL | Peterson | Greg
Fisher | Brian | Simon Fraser University | Gaither | Kelly


Foxall | Roger | Life Science Strategies Inc. | Towns | John
Furlani | Thomas | SUNY-Buffalo | Hart | David
Furlani | Thomas | SUNY-Buffalo | Lifka | David
Furlani | Thomas | SUNY-Buffalo | Towns | John
Gao | Susan | Nvidia | Peterson | Greg
Garofalo | Francesca | CINECA | Hart | David
Garth | Christoph | University of Kaiserslautern | Gaither | Kelly
Geist | Al | ORNL | Peterson | Greg
Gilpin | Chris | University of Texas Southwestern Medical Center | Gaither | Kelly
Glotzer | Sharon | University of Michigan | Towns | John
Gordon | Mark | Iowa State | Peterson | Greg
Grimshaw | Andrew | University of Virginia | Peterson | Greg
Grimshaw | Andrew | University of Virginia | Towns | John
Grimshaw | Andrew | University of Virginia | Lifka | David
Habermann | Ted | The HDF Group | Towns | John
Hannisch | Robert | National Institute of Standards and Technology | Towns | John
Hansen | Chuck | University of Utah | Gaither | Kelly
Harrison | Cyrus | Lawrence Livermore National Labs | Gaither | Kelly
Harrison | Robert | Stony Brook University | Peterson | Greg
Hathaway | Donald | Governance DNA | Towns | John
Hayden | Linda | Elizabeth City State University | Wilkins-Diehr | Nancy
Hazlewood | Victor | The National Institute for Computational Science | Lifka | David
He | Yuan | HY Financial | Peterson | Greg
Henry | Nathan | Univ Tennessee Health Science Center | Peterson | Greg
Hepburn | John | University of British Columbia | Towns | John
Hunter | Bryan | USAF | Peterson | Greg
Jacobs | Cliff | Jacobs Consulting | Towns | John
Jang | Yun | University of Korea | Gaither | Kelly
Janicke | Heike | Heidelberg University | Gaither | Kelly
Jenkins | David | NSA | Peterson | Greg
Johnson | Greg | Intel | Gaither | Kelly
Jones | Matthew | SUNY-Buffalo | Hart | David
Kafyeke | Fassi | Bombardier | Towns | John
Kalimbi | Supriya | Intel | Peterson | Greg
Katz | Dan | NSF | Roskies | Ralph
Keselman | Joanne | University of Manitoba | Towns | John
Klasky | Scott | ORNL | Peterson | Greg
Knoll | Aaron | University of Utah | Gaither | Kelly


Kothandaraman | Sampath | Qualcomm | Peterson | Greg
Kovatch | Patricia | The Mount Sinai Hospital | Towns | John
Krumm | Jeff | TVA | Peterson | Greg
Lange | Tom | Procter & Gamble | Towns | John
Lawrence | Katherine | University of Michigan | Wilkins-Diehr | Nancy
Lazzarini | Albert | Caltech | Towns | John
Lee | JunKyu | Univ of Sydney | Peterson | Greg
Leigh | Jason | University of Hawaii | Gaither | Kelly
Leland | Robert | Sandia Nat Labs | Towns | John
Livny | Miron | Wisconsin-Madison | Towns | John
MacCabe | Barney | ORNL | Peterson | Greg
Maciejewski | Ross | Arizona State University | Gaither | Kelly
Maechling | Phil | SCEC | Towns | John
Majumdar | Aditi | UC Irvine | Gaither | Kelly
Mambretti | Joe | Northwestern | Towns | John
Manno | Paul | GaTech | Peterson | Greg
Marcum | David | Mississippi State University | Gaither | Kelly
Mark | William | Intel | Gaither | Kelly
McCollum | J. Michael | Google | Peterson | Greg
Melek | Adel | Global Enterprise Risk Services, Deloitte | Towns | John
Merchant | Saumil | IBM | Peterson | Greg
Mesirov | Jill | Broad Institute | Towns | John
Moore | Shirley | Univ Texas El Paso | Peterson | Greg
Moorhead | Robert | Mississippi State University | Gaither | Kelly
Nichols | Jeff | ORNL | Peterson | Greg
Nyerges | Timothy | University of Washington | Wilkins-Diehr | Nancy
Ostriker | Jerry | Princeton | Towns | John
Pakala | Swetha | PMC-Sierra | Peterson | Greg
Palmer | Carole | University of Washington | Towns | John
Paris | Joe | Northwestern | Towns | John
Park | Hae Yung | Korea Advanced Institute for Science & Technology | Roskies | Ralph
Pascucci | Valerio | University of Utah | Gaither | Kelly
Patlalla | Dilip | Siemens | Peterson | Greg
Patra | Abani | SUNY-Buffalo | Hart | David
Paul | Nate | ORNL | Peterson | Greg
Petersen | Andrew | University of Pittsburgh | Roskies | Ralph
Petersen | Nils | University of Alberta | Towns | John
Peyton | Jonathan | Intel | Peterson | Greg
Pierucci | Mauro | San Diego State University | Wilkins-Diehr | Nancy


Rathgeb | Chris | Apptova | Peterson | Greg
Rekapalli | Bhanu | BioTeam | Peterson | Greg
Remington | Karin | Aruna Systems | Towns | John
Ribarsky | William | University of North Carolina Charlotte | Gaither | Kelly
Riedel | Morris | Julich Supercomputing Centre | Lifka | David
Rogers | Brandon | Eaton | Peterson | Greg
Rossant | Janet | The Hospital for Sick Children | Towns | John
Sackett | Penny | Australian National University | Roskies | Ralph
Schneider | Barry | National Institute of Standards and Technology | Towns | John
Schneider | Barry | National Institute of Standards and Technology | Hart | David
Schulz | Karl | Intel | Gaither | Kelly
Sénéchal | David | Université de Sherbrooke | Towns | John
Sharp | Brian | USAF | Peterson | Greg
Smith | Melissa | Clemson University | Peterson | Greg
Sosonkina | Masha | Iowa State | Peterson | Greg
Sun | Junqing | Marvel | Peterson | Greg
Svakhine | Nicholai | Adobe | Gaither | Kelly
Taylor | Valerie | Texas A&M | Towns | John
Teller | Pat | Univ Texas El Paso | Peterson | Greg
Thurmon | Brandon | Exegy | Peterson | Greg
Tolone | William | University of North Carolina Charlotte | Gaither | Kelly
Tomita | Masaru | Keio University (Japan) | Hart | David
Turknett | Rob | IBM | Gaither | Kelly
Uppala | Murali | IBM | Peterson | Greg
Vanguri | Phani | Siemens | Peterson | Greg
Vetter | Jeff | ORNL | Peterson | Greg
von Oehsen | Barr | Clemson University | Peterson | Greg
Wald | Ingo | Intel | Gaither | Kelly
Wang | Xiaoyu | University of North Carolina Charlotte | Gaither | Kelly
Ward | Christine | Siemens | Peterson | Greg
Weber | Rick | Microsoft | Peterson | Greg
Weiler | Jordan | University of Oregon | Gaither | Kelly
Westing | Brandt | Microsoft | Gaither | Kelly
Whitfield | David | University of Tennessee Chattanooga | Gaither | Kelly
Windus | Theresa | Iowa State University | Peterson | Greg
Windus | Theresa | Iowa State University | Towns | John
Wolski | Richard | University of California Santa Barbara | Lifka | David
Worley | Brian | ORNL | Peterson | Greg


Wu | Guiming | National University of Defense Technology (China) | Peterson | Greg
Wu | Jin Chu | National Institute of Standards and Technology | Roskies | Ralph
Yang | Depeng | Qualcomm | Peterson | Greg
Yeung | P.-K. | Georgia Tech | Gaither | Kelly


Appendix IV Letters of Collaboration

Here we provide the letters of collaboration we have received from our collaborators.


Appendix V Data Management Plan

Data produced by XSEDE fall into two categories: primary data products and secondary data products. Broadly speaking, primary data products include software and records, while secondary data products include documents, materials, and documentation derived from project operations and primary data products. Both data types are discussed in detail later in this section.

We do not address user data stored on XSEDE resources in this document. XSEDE will provide facilities to assist in the execution of user-defined data management plans, and XSEDE-allocated resources are expected to be used for data management by a number of projects and users, but discussion of user-level data management plans is outside the scope of this document.

The measures documented in this data management plan provide for the basic distribution of user and usage information to the research community, and include, but do not detail, plans to develop more robust, all-encompassing repository capabilities to handle secondary data products in particular. As XSEDE itself evolves, it is expected that the data products of interest, the sources of such data products, and the potential uses of these data products will all continue to increase and evolve. Further planning and efforts regarding long-term preservation and access for both primary and secondary data products of XSEDE will be undertaken in concert with NSF-funded data infrastructure providers. The long-term goal for XSEDE data management planning and execution will be to minimize the proliferation of data sources and to provide simple and widely available access, both to document the usefulness of XSEDE infrastructure and to enable research on the use of XSEDE.
V.1 Primary Data Products

Primary data products of XSEDE will include: records of all users requesting allocations on XSEDE resources; records of the allocations granted and the usage of those allocations; records of resource usage, including specific users, job names and characteristics, software used (if identifiable), and the allocated projects associated with each job; records of resources and services registered using the XSEDE information services, including in some cases availability information for specific resources and services; records of historical monitoring results; and, of particular importance, user needs and requirements data gathered in the course of XSEDE development, including user surveys. Since user data may include personal information, access to some of this data must be restricted. Finally, there are software artifacts created by XSEDE staff, which may be either alterations to existing code or XSEDE-specific code, for example code created for the XSEDE user portal.

XSEDE recognizes the potential value of these data sets for current and future research and is committed to preserving and providing this data as appropriate. As most primary data products resulting from the allocations and accounting framework are also critical to the normal functioning of XSEDE, all resource and usage records are currently stored in replicated databases, with at least two full copies of all data maintained at all times. Full backups are performed on a weekly basis, ensuring the preservation of both the data and the dependent service. These procedures will be continued into XSEDE2. Further, XSEDE plans to build on the existing software infrastructure and user database, while preserving all existing data from XSEDE, the preceding TeraGrid effort, and prior NSF programs when available.
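Because the usage records described in this section name specific users and jobs, any form released more widely would need those fields removed. The following is a minimal illustrative sketch of one way per-job records could be reduced to per-project, per-resource totals. The record fields (`user`, `project`, `resource`, `job_name`, `core_hours`), the sample values, and the function name `summarize_usage` are hypothetical and are not the XSEDE accounting schema. The sketch also does not address re-identification risk from very small groups.

```python
from collections import defaultdict


def summarize_usage(job_records):
    """Aggregate per-job records into per-(project, resource) totals.

    User names and job names are never copied into the output, so the
    summary carries no per-user fields. Field names are hypothetical.
    """
    totals = defaultdict(lambda: {"jobs": 0, "core_hours": 0.0})
    for rec in job_records:
        key = (rec["project"], rec["resource"])
        totals[key]["jobs"] += 1
        totals[key]["core_hours"] += rec["core_hours"]
    # Emit rows in a stable order so repeated reports are comparable.
    return [
        {"project": project, "resource": resource,
         "jobs": t["jobs"], "core_hours": t["core_hours"]}
        for (project, resource), t in sorted(totals.items())
    ]


# Hypothetical sample records, for illustration only.
records = [
    {"user": "alice", "project": "TG-ABC123", "resource": "cluster-a",
     "job_name": "md_run", "core_hours": 128.0},
    {"user": "bob", "project": "TG-ABC123", "resource": "cluster-a",
     "job_name": "post", "core_hours": 64.0},
    {"user": "carol", "project": "TG-XYZ789", "resource": "cluster-b",
     "job_name": "cfd", "core_hours": 32.0},
]

summary = summarize_usage(records)
for row in summary:
    print(row)
```

A summary of this kind keeps the restricted per-user records in the protected store and exposes only aggregate rows, which is the general shape of the public summary reporting this plan refers to.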
The XSEDE infrastructure itself includes facilities to support preservation of data, such as metadata, replication, and other data management infrastructure. This infrastructure will be used within the project itself to provide a high-reliability storage layer for this primary data.

Providing access to the primary usage data produced by XSEDE is dependent upon implementing appropriate access controls that ensure privacy for personal information while also making non-personal information widely available. For example, previous efforts have provided summary reporting on allocations and resource usage for public consumption. The XD Metrics Service (previously known as the XD Technology Audit Service for XSEDE) has provided new tools for examining usage and other information on Service Provider operations, using simple web interfaces for querying past usage data. Future efforts in XSEDE will focus on cooperation with the XD Metrics Service to expand the range of data available via these web interfaces, and to provide tools enabling the use of XSEDE primary data in research settings.

A crucial component of the XSEDE engineering and development process involves the gathering of user needs and requirements information, largely through user surveys. These surveys will be performed by Indiana University under the guidance of an Institutional Review Board (IRB). Anonymized, de-identified versions of the raw user survey data will be made available, subject to IRB approval. Derived data on user needs, comprising the inputs to the ongoing systems engineering and management process, will be made openly accessible (subject to privacy and IRB constraints) as it is generated from user surveys and other requirements-gathering processes. This approach will ensure the transparency of XSEDE engineering practice and provide an important window into user perceptions of cyberinfrastructure.

Software products generated by XSEDE will be sufficiently diverse to require a range of data management strategies. The most basic form of software product developed by XSEDE staff will be improvements or alterations to existing open source toolkits. In these situations, patches will be submitted to the software developers for maintenance by the open source development projects. For purpose-built software developed by XSEDE for XSEDE staff or users, such as web portal code, software deployment scripts, and other XSEDE-specific software products, the existing XSEDE software repository will be maintained and integrated into other XSEDE web interfaces for data and software. Software developed by XSEDE with potential for reuse will be made available via open repositories for others to download. Other products, for instance code developed for the XSEDE user portal, are unlikely to have significant potential for reuse and as such may not be made available in source form. If interest is shown in such software products, however, these decisions may be revisited. In all respects, XSEDE will attempt to provide open access to software code and other data generated by the project.

V.2 Secondary Data Products and Related Projects

In addition to the primary documentation of user activity, XSEDE creates a number of secondary data products with potential value.
These include documentation of project activities such as quarterly and annual reports; ECSS work plans; publications written by project staff describing XSEDE activities or infrastructure; meeting agendas, minutes, and decisions; planning documents; source code developed for user interfaces; training materials developed for live, webcast, and self-paced tutorials; materials developed by faculty for public dissemination and re-use in other undergraduate and graduate classrooms; and reports of advisory meetings. This information will continue to be available via the User Portal, including publicly accessible project wiki pages. The self-paced tutorials are accessible via the Cornell Virtual Workshop website housed at Cornell University and the CI-Tutor website hosted by NCSA. Long-term data preservation is handled through the XSEDE digital object repository based on IDEALS, utilizing handles with DOIs and persistent identifiers.

Finally, documents and data will be jointly created by the XSEDE project team and the XD Metrics Service. Since most of these latter data are generated explicitly for outside entities, in most cases XSEDE cannot be viewed as the provider of the data in question. However, we recognize the potential value of these secondary data products and the potential for integration and use of such data in novel ways. For these reasons, the XSEDE Program Office will continue interactions with the XD Metrics Service on the development of an integrated repository for these diverse data products. This may initially be as simple as a common, web-accessible location for deposit and retrieval of data. As with the primary data products, providing external researchers with access to this repository will be dependent upon privacy and confidentiality concerns, but the goal will be to provide the maximum degree of access possible to all primary and secondary data. These repositories, web services, and diverse data products will be discoverable by and accessible to the XD community.
