
E‐Infrastructure Roadmap Executive Summary

E‐Infrastructure Roadmap

Engineering and Physical Sciences Research Council (EPSRC) Research Infrastructure Team: Dr Louise Tillman, Dr Susan Morrell, Dr Tracy Hanlon, Dr Daniel Emmerson, Dr Michele Erat, Dr Edward Clark, Mr Timothy Erskine

Version 1.1 August 2014

Contents

Executive Summary
Methodology
Capability
Connections
Software Development
Data Infrastructure
Hardware and Compute
Integration and Coordination
Pathways to Impact
Glossary

Executive Summary

Purpose

A coherent strategy for developing and delivering the UK’s future e‐infrastructure needs is essential in driving forward the continued development of a globally competitive research base within the UK. Since 2009 there have been a number of reports and reviews on e‐infrastructure, outlining our future requirements and how e‐infrastructure should be funded and supported.

As shown in the figure below, the EPSRC, along with its sister Research Councils, the Funding Councils, the Technology Strategy Board and BIS, plays a key role in developing the strategy as well as delivering the funding to support e‐infrastructure in the UK. The development of a sustainable and cutting‐edge e‐infrastructure eco‐system is vital in allowing EPSRC to deliver its Strategic Goals and support excellent and innovative physical sciences and engineering research.

Figure 1: E‐infrastructure pipeline taken from the E‐Infrastructure Leadership Council: One year On report

The EPSRC Research Infrastructure team, with the help of its Strategic Advisory Team, Research Council colleagues and key members of the EPS community, has formulated an EPSRC e‐infrastructure roadmap to begin to develop a clear strategy and action plan for EPSRC. In the roadmap EPSRC aims to:

• Understand the whole UK e‐infrastructure landscape, view it holistically and consider it within an international context.
• Understand the requirements of the EPS research community that makes use of e‐infrastructure, ensuring there are no gaps or duplication.
• Identify where EPSRC, and more specifically the EPSRC Research Infrastructure team, can add the most value.
• Provide a framework for spending reviews and business cases for funding opportunities from government.
• Provide a discussion tool for use with other stakeholders and Research Councils.


Scope

The EPSRC e‐infrastructure roadmap will not be an RCUK roadmap and focuses on the actions that EPSRC can lead on or work in close partnership with others to deliver. However, with the establishment of the RCUK E‐infrastructure group (a group that involves representatives from EPSRC, ESRC, BBSRC, MRC, STFC, NERC, AHRC, JANET, TSB, , JISC), the EPSRC e‐infrastructure roadmap provides an important tool in helping to develop a longer‐term cross‐council view, as well as providing strategic input to BIS’s E‐infrastructure Leadership Council. Ultimately this will lead to co‐ordinated strategies for developing the science base and making an impact on industry, from SMEs to large industrial primes.

Due to the fast‐moving nature of developments and investments in this area, we propose that the roadmap be treated as a living document that will be updated on a regular basis (six‐monthly initially) after further feedback from the broader research community and other key stakeholders.

E‐infrastructure Eco‐system:

The e‐infrastructure eco‐system is a complex and multi‐dimensional entity with many different strands, requirements and stakeholders. For the purposes of the EPSRC e‐infrastructure roadmap, the landscape has been divided into a number of interrelated themes represented in the diagram below:

[Diagram of interrelated themes: Management and Governance; Co‐ordination; Pathways to Impact; People Development; Computing and Data Skills; Research and Sector Domain Knowledge; Networks; Security and Authentication; Data Infrastructure; Hardware and Compute; Software Development]

Figure 2: E‐Infrastructure Eco‐system

Theme Key Points

Presented below is a summary of key points from each theme, focusing on the Vision for the Future, i.e. what we want the UK landscape to look like in 10 years’ time. Further information on each theme is available in the body of the report.


Capability:

By Capability, we mean People Development, Computing and Data Skills and Research and Sector Domain Knowledge. These underpin the whole roadmap: having skilled people is both an input to and output of a healthy e‐infrastructure ecosystem. Training in both tools (such as programming and software engineering, and basic data analysis) and research methods (applying computational techniques and data analytics as research tools) is required at all career stages.

The key challenges are around ensuring that researchers in software/computational techniques get recognition, and have access to sustainable academic career paths. This is a particular problem for research software engineers and research technologists. There are also benefits to be gained from co‐ordination and integration of training, such as sharing best practice and resources, and community building.

In the capability theme our vision for the future is as follows:
• UK researchers will be classed as among the best in the world.
• Investments in people and skills will lead to a flow of talented people who will help the UK to capitalise on the information revolution and drive the economy forward.
• World class researchers who are domain experts will have sufficient software engineering expertise to develop their codes to be ‘right first time’ but also reusable.
• A strong cadre of software engineers will exist, who can take on longer‐term maintenance and development of codes, working closely with the domain experts to add functionality as needed.
• Graduates will have the required analytical skills.
• A smooth training pathway will exist between the various career stages within universities (e.g. effective linkage from Centres for Doctoral Training/Postgraduate training to postdoctoral and first academic post), leading to skills that are also recognised in industry and facilitating the movement of staff between universities and industry.
• A new breed of innovators and entrepreneurs will exist in ‘computational and data science’ in an open, interdisciplinary environment.
• Research data management will be embedded in all research projects, thanks to a good understanding of the issues involved and sufficient technologists to provide the necessary expertise.
• Sufficient numbers of researchers will exist with the data visualisation, analytics and interpretation skills needed to help capitalise on the information economy.

Connections:

As the volume of data being generated through scientific research rises rapidly, and the scale of international collaboration increases, it is essential that the research community has access to high speed, high capacity infrastructure that can be shared in an open and secure manner.

With the launch of JANET6 in autumn 2013, it is important that the community and EPSRC work closely with JANET to ensure that maximum benefit is derived from the services offered both through the new network and authentication and security programmes such as Moonshot.

In the connections theme our vision for the future is as follows:
• Connections will enable a safe and secure research environment, allowing researchers to collaborate and use flexible e‐infrastructure, moving seamlessly from and between desk‐top computing, mobile technologies, and high performance compute and storage resources.
• The UK will have a highly reliable, secure and robust research network, with the flexibility to deal with changes in demand and usage. HEI internal network infrastructure will mirror the external WAN connectivity where appropriate.
• Researchers will be able to easily access the bandwidth that they require to facilitate their research, i.e. bandwidth on demand.
• National facilities will have high capacity, high density connectivity available.
• A Dark Fibre Network will be available for researchers to use ‘on demand’ and federated test‐beds will enable international research.
• Flexible and secure network access for industry partners will be available, facilitating data exchange and collaborative research.
• An engaged researcher and funder community will work with JANET to specify future networking requirements and service provision.
• E‐infrastructure will be underpinned with standardised, low maintenance security infrastructures, with the continued promotion and ubiquitous adoption of authorisation and authentication measures, underpinned by ‘single sign‐on’ approaches.

Software Development:

The importance of software development at all levels of the software stack has been highlighted in a number of high‐profile reports. Software is where much intellectual property, knowledge and understanding resides and this is why software has such longevity. Software and algorithm development also represents major investments in skilled scientists and engineers and the large suite of codes used in research therefore needs to be regarded as a research infrastructure in its own right, requiring support and maintenance along the innovation chain, and throughout its lifecycle.

There is a continued requirement to invest in people and training in software development, with a growing need to recognise the role and value of the research software engineer. The 2012 EPSRC Software as an Infrastructure strategy made a long‐term commitment to support software development.

In the software development theme our vision for the future is as follows:
• The UK will continue to support a thriving community of computational scientists who are recognised internationally.
• Basic scientific research that underpins software development will be supported, leading to the development of new methods and algorithms.
• Sustainable and robust software will be available to support the current and future needs of the EPS community, both academic and industrial.
• Strong multi‐disciplinary software development teams will exist, comprising experts from industry, the mathematical and physical sciences, informatics and computational science, together with the domain experts and hardware developers.
• Expertise will be focused on enhancing existing software capabilities, allowing community codes to run on high end machines for simulation and data‐intensive computing, transferring important commercial codes to high end machines and the development of new codes based on new ideas for academic and industrial communities.
• The community will be working on a number of agreed Grand Challenges with their international counterparts, continuing to lead and participate in European and global projects, for example the development of important exascale codes.
• The value of software development expertise will be recognised by funders, HEIs and academic researchers, and a clearer career path for the research software engineer will be developed. A pool of researchers with the required skills and experiences to be the code and software developers of the future will be developed.
• Students and researchers will have the required training to take full advantage of the available computational resources.
• Open innovation will be a key enabler, allowing collaborative development of software projects.
• The impact of software investments will be maximised by continued software engineering and community code development support, leading to robust, reliable and sustainable software.
• Industry will be more engaged with existing UK E‐infrastructure, due to the development of application driven software.
• Software and code will be increasingly portable as a result of established standards, validations and usable interfaces.
• Best practice on software provenance, testing and security will be developed and disseminated.
• Researchers will share software through resources such as Opensource.

Data Infrastructure:

It is clear that data science is becoming more important. There is growing awareness amongst our research community that data is a valuable asset. To stay competitive, the UK needs to invest in the development of cutting‐edge skills in data analytics and software development and provide a clear career path for data management professionals.

In addition to traditional structured data, research will increasingly use unstructured data coming from non‐traditional sources such as crowd sourcing initiatives and social media. We need to ensure that the skills and infrastructure are present in the UK to support these new challenges.

RCUK encourages the open sharing of research data and code. Already open access publication is becoming the norm. At the same time, data needs to be stored securely and actively managed and curated to be usefully accessible for future generations. Ideally a standardised unifying identity and access management structure across the UK research landscape will ensure data security and accessibility.

In the data infrastructure theme our vision for the future is as follows:
• Data intensive research will be supported by the requisite infrastructure and will play a major role in the UK continuing to be at the forefront of Physical Sciences and Engineering research.
• The UK will be at the forefront of research and development in data science and analytics.
• The UK’s data infrastructure will be well managed and curated, offering a treasure trove of valuable information to generate new knowledge.
• The Research Data Facility at Edinburgh will be expanded and will provide a storage capacity for the ever increasing data intensive research outcomes from HPC‐related research.
• Enhanced dialogue between industry and academia will lead to a better understanding of industrial requirements with respect to data security and long term integrity, facilitating collaboration.
• Citizens will interactively use and provide data for new communal services in Smart Cities.
• Data will increasingly come from non‐traditional sources, including crowd sourcing and social media.
• The Cloud may take on an increasingly integrative role in allowing data to be used in an agile and flexible way.
• New techniques will be developed to extract knowledge from data and the infrastructure will be configured to promote and support this. New software to manage, curate and analyse data will be developed and supported.
• Students and researchers will have the appropriate training and skills to exploit available data sets.
• Researchers will openly share and reuse data, and open access publication will be the norm (online publishing in an open access archive ‐ see for example astrophysics).
• Crowdsourcing will be ever more important in obtaining and dealing with large amounts of data – the “internet of things” will gather sensor data from connected devices across the globe.
• Existing long term preservation and integrity issues will be resolved.


Hardware and Compute:

Computing hardware used to carry out modelling, simulation, data analysis and visualisation ranges from desk‐top machines, to university and regional systems to the national HPC service and access to international machines and new services such as Cloud.

Integration across the tiers of the eco‐system is still immature, but increased co‐ordination and integration of systems and services will allow the UK to maximise the impact of capital investment in this area plus provide users with easy access to the type of e‐infrastructure they require.

In the hardware and compute theme our vision for the future is as follows:
• The UK will have a computing ecosystem that is:
  - Appropriately populated.
  - Balanced and integrated across the tiers.
  - Regarded as amongst the best in the world amidst stiff competition.
  - Responsive, to meet the needs of internationally competitive science.
• The UK will have strong involvement in and links with international e‐infrastructure.
• A long‐term investment plan for both capital and recurrent running costs will have been developed.
• Increased numbers and range of users (both academic and industrial) will be doing cutting‐edge computational science and engineering research.
• Researchers will be engaged appropriately with infrastructure development. This may mean that new models for user engagement need to be employed in order to ensure that the users of today and tomorrow are involved in this process.

Key Messages and Actions:

A number of key messages have been highlighted by EPSRC’s advisory team and the key stakeholders who were consulted during the development of the roadmap:

• People support is essential: The support of people in the e‐infrastructure eco‐system is as important as, if not more important than, the capital investment that has been and is being made. We need to ensure that support for people and training is central to any future strategic investments as this will ensure that existing and future e‐infrastructure is used, developed and applied successfully.

• Engagement: Due to the fast‐moving nature of the technological developments in the field, it is essential that a stronger working relationship is developed with the broader community, including industrial users and collaborators, and non‐expert users.

• Pathways to Impact: In order to make the case for future e‐infrastructure investments, it is important that we develop a strong set of key performance indicators. It is essential that the benefits are felt by the broader research community, and routes for working with industry of all scales are developed.

• Integration is key: Support for computational science and engineering is essential in ensuring that the research we support remains world‐leading. To provide a sustainable and cutting‐edge e‐infrastructure eco‐system it is vital that we integrate vertically and horizontally across the eco‐system and that different communities and stakeholders engage and work together.

Action Plan

An integrated action plan has been created that draws together and consolidates the EPSRC actions from each roadmap theme.

The integrated action plan is structured around a set of Strategic goals that reflects both EPSRC’s and the communities’ priorities: Shaping Capability, Developing Leaders and Skilled People, Delivering Impact and User Support, Ensuring Trust and Planning for the Future.

Strategic Goal: Shaping Capability: Building the capability to deliver high quality and important research.

Short term Actions:
• Maintain and develop links with other elements of the e‐infrastructure eco‐system e.g. HPC‐SIG, regional centres, DiRAC etc.
• Continue to fund activities that support integration and collaboration within the research community; for instance, CCPs, consortia, networks, international interactions.
• Support more effective multi‐disciplinary networking and collaboration in software development projects by working more closely with our current software investments.
• Enhance the Research Data Facilities functionality and make it more user‐friendly, attracting a wider user base and increasing capacity.
• Encourage HPC regional centres to share storage facilities, as well as skills and expertise in analysing and interpreting data.
• Support the Dark Fibre mid‐range facility.
• Work with other EPSRC themes (e.g. Digital Economy, ICT) and the RCUK e‐infrastructure group to co‐ordinate data infrastructure requirements, including security, accessibility and management of data.

Medium / Long term Actions:
• The e‐Infrastructure Strategic Advisory Team will continue to provide advice and input on all aspects of e‐infrastructure to the Research Infrastructure Theme at EPSRC, and through us to the rest of the organisation.
• Develop closer links with the US, to explore the potential for greater collaboration and access for UK researchers, e.g. establish links with Xsede and INCITE, and the funding bodies NSF and DoE in the US with the aim of collaborating with them.
• Develop further joint funding models with EU partners, for example the next G8 call.
• Follow up on the outputs of the 2009 Applications/Algorithm roadmapping activity.
• Issue further software development calls.
• Refresh the CCP and consortia portfolio.
• Review infrastructure requirement for Digital Economy’s interest in MTurk (crowd sourcing), e.g. Amazon.


Strategic Goal: Developing Leaders and Skilled People: Building on current expertise, supporting the careers of researchers and developers, providing training and actively encouraging collaborative working and inter‐disciplinarity.

Short term Actions:
• Work on UK training marketplace: work with ELC, SSI and others on the implementation.
• Ensure that appropriate training and mechanisms for disseminating software best practice are embedded in the new EPSRC Centres for Doctoral Training.
• Report on the Research Software Engineer with SSI.
• Continue to support early career fellowships in software development, advertising the opportunity more widely.
• Emphasise training aspects in current and future national HPC service provision.
• Continue to try to exert influence on policy development, e.g. BIS and ELC.

Medium / Long term Actions:
• Ensure that appropriate training is given priority in future government investments (e.g. National Network of Data Analytics Centres).
• Influence doctoral training in universities via Doctoral Training Partnership awards.
• Provide training possibilities (e.g. as part of Centres for Doctoral Training) to ensure that the software development and data interpretation skills base in the UK remains cutting‐edge.
• Encourage Universities to use doctoral prize funding to support career development when awarding an additional year to computationally‐intensive domain‐based PhDs.
• Consider mechanisms to encourage postdoctoral mobility with the aim of blurring domain and technology boundaries; Network and Mobility awards could be used for short, intensive e‐skills training.
• Engage with cohort of EPSRC fellows to understand their career path.
• Long term commitment to fellowships in software development.
• Support the development of a clear career path for software developers and researchers, providing appropriate support for potential and established leaders.
• Continue to provide CSE support as part of the national service.
• Publicise long term RC support – this will lead to universities appointing staff in the area.
• Continue to engage with other stakeholders to be “joined up” in fostering skills.
• Support locally delivered training, e.g. software carpentry events.


Strategic Goal: Delivering Impact and User Support: Maximising the academic, societal and economic impact of current and future investments by providing appropriate user support, developing business models and promoting collaboration with industrial partners.

Short term Actions:
• Work with the community and the national service to develop and populate an appropriate impact framework and put in place a process for gathering the evidence.
• Ensure that EPSRC funded CSE support is providing value for money.
• Plan workshops looking at the impact and evolution of specific codes. Identify and disseminate best practice for software development and exploitation.
• Provide support to Regional HPC centres, to facilitate relationship building with the TSB catapults, SMEs and industry.
• Provide JANET with links to EPSRC Strategic Partners to assist in the development of industrial pilot studies and regulatory framework.
• Explore how best to fund the development of case studies.

Medium / Long term Actions:
• Increase the industrial membership of the SAT to ensure we get an industrial user perspective.
• Catalyse links between TSB and EPSRC’s ICT programme to ensure UK companies and researchers are benefiting from any ETP opportunities in co‐design and other R&D activities.
• Develop joint funding models with industry and TSB.
• Continue to provide CSE support as part of the national HPC service.
• Support further university CSE calls.
• Work with JANET and the EPS research community to ensure that there are clear, well understood guidelines for industrial use of the JANET network.


Strategic Goal: Ensuring Trust: Providing a robust, reliable and secure shared e‐infrastructure that enables high quality research.

Short term Actions:
• On‐going management of the national service to high standards, ensuring a wide range of users and maximum usage.
• Gather user views on and requirements of the national service at regular user meetings, and also through the governance structure, where a Scientific Advisory Committee or equivalent will provide the user perspective on the service.
• Improve peer review of software development, embedding guidance in the EPSRC Pathways to Impact guidance.
• Actively support open data policies and open access publishing.
• Participate in dialogue on data security vs. accessibility with user base.
• Encourage key EPSRC investments e.g. HPC Regional Clusters to participate in Moonshot pilots.

Medium / Long term Actions:
• Improve remote access to data.
• Develop plans to incentivise researchers to share software outputs.
• Support EPSRC Cyber‐security research as part of the Global Uncertainties theme.
• Work with proposed Cyber‐security working group led by JANET to identify areas of common interest (reporting to RCUK e‐Infrastructure group).


Strategic Goal: Planning for the Future: Developing a longer term strategy and sustainable funding streams for the e‐infrastructure eco‐system.

Short term Actions:
• Continue to engage constructively with European colleagues in PRACE on the provision of a European e‐infrastructure, particularly in developing and pushing forward the consortium model and also ensuring a coherent development of Tier‐1 services.
• Continue to engage with a range of providers, for instance through the HPC‐SIG and through regional centres.
• Develop closer links with JANET and cross‐Council colleagues through the RCUK e‐Infrastructure group.

Medium / Long term Actions:
• Explore a range of means of gathering user views and requirements, with a particular focus on reaching out to new communities and young researchers, as well as maintaining our interactions with current expert users.
• Develop a specification for, and then fund, a ‘future architectures’ study which includes getting access to appropriate hardware pilots/test beds.
• Ensure UK has an awareness of the technology roadmap into the future.
• Discuss potential networking requirements for EPSRC facilities and research communities.
• Discuss and agree funding models for bandwidth on demand.
• Support JANET in the development of business case for future research networking provision.
• Keep a watching brief on the potential of cloud services for research, and fund further studies to supplement those already completed, if required.
• Depending on the outcome of the next CSR, look at ways to increase investment in software development.
• Work with other stakeholders in the eco‐system to develop a business case for the whole ecosystem in an integrated way.

Methodology

Thematic Structure

The roadmap is structured around a number of interrelated e‐infrastructure themes:

• Capability: People Development, Computing and Data Skills and Research and Sector Domain Knowledge
• Connections: Networks and Security and Authentication
• Software Development
• Data Infrastructure
• Hardware and Compute

Key Questions

Within each theme a number of key questions have been asked that have allowed us to understand the current UK landscape, within an international context; to develop a vision for the next 10 years; and to produce a number of actions and recommendations for EPSRC, the research community and other stakeholders and funders in the domain:

Defining the current UK landscape
• What is the international context? What are the key challenges being addressed globally at this time?
• What is the UK’s current role and position?
• How does EPSRC currently contribute to this position?
• How do other key stakeholders currently contribute to this, e.g. HEIs, other funding bodies etc.?

Vision for the future
• What do we want the UK landscape to look like in 10 years’ time?
• What do we need to do to reach this goal?
• What are the barriers to achieving this vision?

Actions and recommendations
• What is EPSRC’s role in the short, medium and long term?
• What is the role of other stakeholders in contributing to this vision?
• What is the relationship with the other e‐infrastructure themes?

Integration and Impact

The last section of the roadmap is dedicated to the underpinning need for integration within the e‐infrastructure eco‐system and the importance of demonstrating and capturing the impact of the investments made.

Consultation and Publication Process

The roadmap structure and initial content were produced in consultation with the EPSRC Strategic Advisory Team, EPSRC colleagues and JANET, and draw heavily on relevant reports and reviews in the area, which are referenced throughout.

Subsequently the draft roadmap was circulated to key EPSRC stakeholders, JISC, TSB and RCUK colleagues for comment and input prior to the on‐line release of the roadmap in December 2013.

EPSRC now welcomes further feedback and input from the wider community. This will help the roadmap to evolve and will help to inform EPSRC plans and activities in the coming years.

Minor updates and revisions were made in August 2014. The next proposed revision date will be in early 2015.

Acknowledgements

Throughout the development of the EPSRC E‐infrastructure roadmap, EPSRC has been deeply indebted to the E‐infrastructure Strategic Advisory Team for their advice and guidance:

• Professor Richard Kenway, University of Edinburgh
• Professor Spencer Sherwin, Imperial College
• Professor Mike Payne, University of Cambridge
• Dr Clare Gryce, University College London
• Professor , University of Southampton
• Professor Paul Watson, Newcastle University
• Professor Bryan Lawrence, University of Reading
• Professor Jean‐Christophe Desplat, ICHEC
• Professor Simon McIntosh‐Smith, University of Bristol

EPSRC would also like to thank Neil Chue‐Hong and Simon Hettrick from SSI and Bob Day, Jeremy Sharp, David Salmon and Henry Hughes from JANET for their input and contribution in the early developmental stages of the roadmap.

Capability

Defining the current UK landscape

Definition: The Capability theme includes People Development, Computing and Data Skills and Research and Sector Domain Knowledge. It derives from all the other sections of this Roadmap and is both an input and output of a healthy e‐infrastructure ecosystem.

The skills focus for this roadmap includes both programming (tools) and research (methods): computing; programming and software engineering skills; data analysis; data curation and management; numerical analysis and algorithm development; applying computational techniques and data analytics as research tools; application of standard codes; matching problems with hardware and scale‐up beyond the desktop.

Figure 3: Capability Theme

All career stages are referenced and several distinct research career paths are encompassed: Researchers who use software; Researcher‐Developers; Research Software Engineers; Research Software Support and Research Systems Providers. There are many other skills which are relevant to e‐infrastructure, e.g. electronics, phhotonics, data security, which support the various other aspects discussed in this roadmap such as hardware and networks. EPSRC supports research & training in these areas but software and data are the main focus of this section of the roadmap.

What is the international context?

What are the key challenges being addressed globally at this time?
. Information technology (especially big data problems) is prevalent across all of the engineering and physical sciences. Progress crucially relies on the development of cutting‐edge skills.
. In the current economic climate strategic and tactical funding decisions are needed to secure skills development in mathematics, statistics, programming and software engineering, data analysis, numerical analysis and algorithms, applied computational techniques and data analytics for research.1
. Integration of domain knowledge with e‐infrastructure knowledge.
. The challenge of demonstrating the impact of training – assessment is needed some time after the training has taken place and expertise is needed to do this. Successful promotion of e‐infrastructure skills, showcasing good examples, is required.
. Training at all career stages is a challenge: undergraduate, Masters, PhD, postdoctoral & throughout an academic career, because the e‐infrastructure ecosystem is complex for the “long tail” of training required. Online training in addition to traditional/existing training is becoming increasingly important.
. Clear need for researchers in software/computational techniques; however, traditional metrics such as papers have not always worked well, with issues concerning inclusion in REF. Sustainable academic career paths are a challenge, requiring an understanding of what that means and how it may be achieved.
. IT support to run the infrastructure is essential, but retention is a longstanding issue due to lack of career structure and prevalence of short term posts. This problem also extends to research software engineers and research technologists in general.


What is the UK’s Current Role and Position?

Pluses
. Healthy research base in physical sciences and engineering.
. The UK hosts a number of internationally‐renowned science and engineering departments.
. Strong investment in capital infrastructure.
. Integration in the European HPC initiative PRACE, which also provides training.
. Software support works well when software engineers work closely with researchers to understand their requirements.
. Signs of changing HEI view of importance of software in research e.g. Research Software Development Team at UCL.

Potentials
. Show evidence for impact of software support on research productivity, through outcomes of initiatives such as distributed computational science and engineering support.
. Use of simulation, modelling and data analytics to maintain and enhance the UK’s research profile by the extension of the use of e‐infrastructure techniques into new research fields.
. Scope for more integrated e‐skills training across disciplines (how can I use e‐infrastructure for my research?) This is top priority for industry as enables mobility to other disciplines.

Concerns
. Global competition investing heavily to retain & exploit domestic talent.
. “Skills gap” in software development and data interpretation.
. Career progression in HEIs often means moving away from technical expertise and towards management; those with advanced technical skills not rewarded.2
. Often IT expertise in institutions is not well embedded with researchers.
. Lack of career prospects for research computing staff in academia; short term contracts lead to loss of expertise if a new contract is awarded.
. Insufficient numbers of academics are trained in software engineering or closely involved in the practical aspects of software development, and therefore are less able to articulate their support needs.
. Falling computer science graduate employability – could discourage next generation.
. Training in academia often too narrowly focused on specific software packages – need to train people more widely on how to use e‐infrastructure for their research.
. Training programmes & materials are expensive, value often not appreciated outside of own subject.
. Training often fragmented with providers focussing on their own subjects – little overview of training providers as a group.
. Software not being recognised as a valid output (e.g. in REF), such that measures such as usage of a code and its significance and role in a research field can be properly captured as impacts.
. BIS capital money overvalues “kit” versus manpower.

Opportunities
. Impact of software support on research productivity could be harnessed by research directors (pro‐VCs) to make a business case to retain critical mass of software expertise in their HEIs.
. Long‐term commitment from RCs and other stakeholders to funding skills ‐ this will lead to appointments from universities.
. CPD for academics at all career stages to avoid weaknesses being transferred to PhD students and RAs. This relies on effective identification of need and time (e.g. sabbaticals).
. Focus of new national curriculum on programming should increase market for computer science and software engineering courses.
. Appetite amongst e‐infrastructure training community to share resource and offer more uniform provision of training.3
. Summer Schools.
. Dissemination of good examples where skilled technical people were provided with a career path.
. Feedback from students who go to industry to foster knowledge transfer and collaboration.
. Use of internships and summer projects to encourage cross‐fertilisation between industrial users of academic codes.
. Exploring different cost models for training; free at the point of delivery is often over‐subscribed and subject to ‘no shows’.


How does EPSRC currently contribute to this position?

Studentship Training
. Support for training requested as part of Centres for Doctoral Training (CDTs).
. Industrial‐CASE (I‐CASE) accounts to improve industry integration with student training.
. Doctoral Training Partnerships (DTPs) – underpinning doctoral training provision in universities funded via flexible institutional allocations. This mechanism (coupled with I‐CASE and CDTs) is how EPSRC supports training of PhD students [Note: EPSRC does not support Masters training].
. Opportunity to request training as part of standard research grants.
. EPSRC has funded the High Performance Computing Short Course centre.

Fellowships
. Early Career Fellowships in Software Development for Novel Physical Sciences and Engineering Research are currently available. Two early career fellowships have recently been awarded in Engineering, with more applications currently being reviewed.

Researchers
. HECToR and ARCHER Service Provision and Computational Science and Engineering contracts include training support.
. Support for Collaborative Computational Projects (CCPs) and High End Computing (HEC) consortia who provide training and expertise through their knowledge of specific codes.
. EPSRC funds the Software Sustainability Institute (SSI). This provides a range of advice and services to researchers, ranging from on‐line help and resources (e.g. guides and evaluations), community engagement (workshops and Fellowships), and direct services (e.g. consultancy, training). It is the UK co‐ordinator for Software Carpentry.
. Software Carpentry aims to help scientists be more productive by teaching them basic computing skills. The approach combines short, intensive workshops (‘boot camps’) with self‐paced online instruction. The SSI has been leading on the organising of boot camps in the UK, and has been developing a community of helpers and instructors that can deliver the training (i.e. training the trainer).

Stakeholder Engagement
. Seeking to influence or engage with other stakeholders e.g. ELC, other RCs, PRACE, industry, charities.

How do other key stakeholders currently contribute to this?

International
. PRACE: Recently supported six Advanced HPC Training Centres, one of which is at EPCC in Edinburgh.
. ELIXIR: Managing and safeguarding Life Sciences data generated by publicly funded research.
. US National Centres: Visits to UK universities and national laboratories and vice versa.
. European Grid Infrastructure (EGI): Training marketplace to advertise training events and online courses.

UK
. Other Research Councils: e.g. DiRAC working with SSI on a “software driving license”; BBSRC support for Sysmic ‐ online course in interdisciplinary skills required for biological research.
. Funding Councils: Provision of Masters training.
. TSB: Catapult centres, knowledge transfer networks and partnerships.
. Hartree Centre at Daresbury.
. DCC: Training on digital curation.
. Hardware vendors and suppliers of commercial codes: Provide training.
. NAG Ltd: Numerical software.
. Universities: Undergraduate and Masters provision.
. Universities: Mathematics and computer sciences, but also data mining, management and curation courses in social sciences and humanities, use of simulation in biology.
. UK e‐infrastructure academic user community forum: Objective to maintain and grow the community of UK researchers who use computers of any shape or size in their research, regardless of discipline or domain.
. Information Economy Council: Looking at skills and capability to support growth of UK digital technology.
. Industry: Send employees to undertake MSc and training courses.

Vision for the Future

What do we want the UK landscape to look like in 10 years’ time?
. UK people classed as among the best in the world.
. Investments in people and skills will lead to a flow of talented people who will help the UK to capitalise on the information revolution and drive the economy forward, as well as providing a highly skilled workforce for industry.
. World class researchers who are domain experts will have sufficient software engineering expertise to develop their codes to be ‘right first time’ but also reusable.
. A strong cadre of software engineers will exist who can take on longer‐term maintenance and development of codes, working closely with the domain experts to add functionality as needed.
. Graduates will have the appropriate analytical skills.
. A smooth training pathway between the various career stages within universities (e.g. effective linkage from Centres of Doctoral Training/Postgraduate training to postdoctoral and first academic post) leading to skills that are also recognised in industry, facilitating the movement of staff between universities and industry.
. A new breed of innovators and entrepreneurs will exist in ‘computational and data science’ in an open, interdisciplinary environment.
. Research data management will be embedded in all research projects, thanks to a good understanding of the issues involved and sufficient technologists to provide the necessary expertise.
. Sufficient numbers of researchers will exist with the data visualisation, analytics and interpretation skills needed to help capitalise on the information economy.

What do we need to do to reach this goal?
. Translate current training focus on HPC to all levels of machines, including desktop.
. Involve technologists in automation of experiments, e.g. MRI, crystallography. Higher throughput has a huge impact.
. Learn lessons from the large US labs.
. Incentivise universities; foster culture change to recognise research software engineering as part of academic career progression.
. Stimulate multidisciplinary working to create new centres etc. in universities.
. Be aware of the teaching‐training continuum (a lot of training happens within teaching); improvements in teaching will impact training requirements.
. There is a need for undergraduate degrees to include aspects of computational science and engineering.
. Create a ‘market place’ for available training to provide a) a ‘single point’ contact for what’s available, b) a source of best practice that could be shared, and c) a means of national coordination. Overall, this could assist in improving standards.
. Up‐skilling of industry users via short courses delivered by distance learning.
. Consider different cost models for training to avoid oversubscription and “no‐shows”.
. Tailoring of training to the academic questions being addressed.
. Where intensive training is needed aim for local delivery but with national co‐ordination.
. Good training of the trainers so messages are conveyed well.
. Scale up of training to put out material efficiently; establish libraries of training materials; write once, deliver many times.
. Workshops with software developers and university Human Resources.
. Demonstrate impact of software development on research productivity.
. Publicise successful examples of career paths for skilled technical people, e.g. software engineers.
. Translation type fellowships.
. Liaise with industry and academia in order to develop a shared skill set that will facilitate the movement of trained individuals between industry and academia (secondments and cross‐placement could be valuable).

What are the barriers to achieving this vision?
. Existing skill gap.
. Under‐provision of e‐skills training ‐ according to the user community.4
. Lack of communication between providers of e‐Infrastructure training.
. Funding constraints.
. Different stakeholders responsible for different career stages.
. Lack of incentive for academics to make their code reliable and efficient; software is not recognised in the same way as grant funding and publications, which leads to a restricted career path if moving further towards technical software development.
. Lack of career path for research computing in academia (technologists, research software engineers, data scientists, library professionals, etc.).

Actions and Recommendations

What is EPSRC’s role in the short, medium and long‐term? How does this relate to the EPSRC Strategic Goals?

Short Term
. Ensure that appropriate training and mechanisms for disseminating software and data best practice are embedded in the new EPSRC Centres for Doctoral Training (CDTs).
. Emphasise training aspects in current and future HPC involvements (e.g. CSE and SP contracts).
. Work on UK training marketplace: work with ELC, SSI and others on the implementation.
. Report on the Research Software Engineer.5
. Continue to support early career fellowships in software development, advertising the opportunity more widely.
. Improve peer review of software development, embedding guidance in the EPSRC Pathways to Impact guidance.
. Continue to try to exert influence on policy development, e.g. BIS and ELC.

Medium/Long Term
. Continue to provide CSE support as part of the national service and expand the CSE programme.
. Support locally delivered training, e.g. software carpentry events.
. Support the development of a clear career path for developers and researchers providing appropriate support for potential and established academic leaders.
. Influence doctoral training in Universities via Doctoral Training Partnership awards to foster more generic computational skills.
. Encourage Universities to use doctoral prize funding (additional year of funding for the very best PhD students supported via DTPs) to support career development when awarding additional year to computationally‐intensive domain‐based PhDs.
. Ensure that appropriate training is given priority in future government investments (e.g. National Network of Data Analytics Centres).
. Consider mechanisms to encourage postdoctoral mobility with the aim of blurring domain and technology boundaries; Network and Mobility awards could be used for short, intensive e‐skills training.
. Publicise long‐term Research Council support – this will lead to universities appointing staff in the area.
. Engage with cohort of EPSRC fellows to understand their career path and commission a career path study, particularly for research technologists.
. Continue to engage with other stakeholders to be “joined up” in fostering skills.
. Long term commitment to fellowships in software development – consider follow‐through to next stage.
. Long term commitment to software development as a priority. Train software developers, embed into domains.

What is the role of other stakeholders in contributing to this vision?
. Funding Councils: Support for Masters programmes.
. Universities: Provide state of the art courses in physical sciences, engineering, computer science, mathematics, software engineering etc.; encourage inter‐disciplinarity.
. Universities: Making strategic staff appointments.
. Universities: Increasingly running MOOCs.
. Leading figures in academic community: Champions ‐ influencing universities to recognise importance of software within research such that career paths can be established for software engineers and researchers who invest in “getting their software right”.
. Industry: Training their workforce in state‐of‐the art e‐skills, allow for CPD.
. Other Research Councils: Strategy and funding.
. ‘Training the trainer’ important, to a) develop a cadre of skilled people locally b) allow training to be delivered and tailored locally and c) enable timely training.
. Peer reviewers: Understanding and acceptance of training requests and development of software within a research programme.
. Software centres of expertise: Providing guidance and training on the use of software in research and the issues facing the people who develop that software.

What is the relationship with the other e‐infrastructure themes?
. Capability relates to all other themes: Skilled people develop, implement and maintain Hardware and Software, analyse, interpret, manage and curate Data, construct and make use of Connections.

References

1 Department of Business Innovation and Skills. Ibid.
2 RCUK input to ELC Training Business Plan, Oct 2012.
3 E‐infrastructure Training Meeting, August 2013.
4 Yates, J and N.C Hong, H. Dhanoa and P. Lewis. “National E‐Infrastructure Survey Report.” Department of Business, Innovation and Skills. London, June 2012. http://www.bis.gov.uk/assets/biscore/science/docs/d/12-1246-developing-e-infrastructure-in-engineering-and-manufacturing-industries
5 Baxter, R., Chue Hong, N., Gorissen, D., Hetherington, J., Todorov, I. “The Research Software Engineer.” Digital Research Conference. Oxford, September 2012.


Connections

Defining the current UK landscape

Definition: Connections cover the provision of research networks and the security and authentication technologies that are essential in enabling users to utilise shared infrastructure. As the UK’s National Research and Education Network (NREN), JANET is a key player in developing the strategy and technology that drives this theme. EPSRC’s role is to ensure that EPS users have access to the high speed, high capacity infrastructures and advanced services they require and are able to use shared infrastructures in an open and secure manner, enabling collaboration.

What is the international context?

What are the key challenges being addressed globally at this time?
. The volume of data generated through scientific research is rising rapidly and research is more data‐intensive. Escalating demand from leading edge users, e.g. at the Large Hadron Collider, has brought the need to capture, store and process terabytes of data per day. Networking requirements are prompting the establishment of 100 Gbps connections.6
. There are growing requirements for good international connectivity due to the increasing collaborative nature of scientific research, support for large multi‐national projects and facilities, and the trend towards international campuses for many HEIs.7
. National Research and Education Networks (NRENs) exist in most countries, e.g. JANET in the UK, and are connected to continental networks (e.g. GÉANT) and other parts of the world. They utilise high‐speed, high‐capacity infrastructures and provide advanced services. They are also involved in establishing and operating experimental test beds.8
. Greater online collaboration has led to users needing to access computational resources inside and outside their organisations. A number of different security and authentication approaches have been developed to allow this:
  - Eduroam: International initiative that allows members to easily gain network access at other member sites through a single authentication scheme – JANET Roaming is part of this.
  - Shibboleth Approach: Access to web based resources via a single sign‐on.
  - Certification: Avoids use of passwords, high security environment, but overhead in terms of managing certification. Approach adopted by the Grid community; GridPP in the UK has helped to lead the way.
  - Moonshot: JANET‐led initiative, in partnership with GÉANT, to develop a single unifying technology for extending the benefits of federated identity to a broad range of non‐Web services, including Cloud, HPC and Grid infrastructures.


What is the UK’s Current Role and Position?

Pluses
. High performance, well‐managed JANET backbone, linked to GÉANT, one of the largest research and education networks.
. Importance of research networks highlighted in recent reports – led to £26M e‐infrastructure investment from BIS, plus scheduled capital refresh for JANET6.9, 10
. JANET6 launched in autumn 2013 providing flexible, agile, reliable, robust and secure network with latent capacity to accommodate increases in demand. 5‐10 years funding.
. Native capacity on core will be 100Gbit/s, 80 channels of 100Gbit/s per channel, roadmap for 400Gbit/s and 1Tbit/s.
. JANET runs security emergency response team – CIRST, to advise the academic community. UK is also a world leader in security R&D and a trusted place to do research.

Potentials
. JANET keen to understand and support requirements of UK research community.
. £12M provided for additional fibre access to key sites e.g. Hinxton Bioinformatics Cluster.
. JANET has capacity to set up Lightpath services e.g. between ISIS and ILL, ARCHER and MONSOON, the Research Data Facility and JASMIN, etc.
. JANET support test‐beds and platforms for R&D which do not affect the network.
. A new National Dark Fibre Infrastructure Service (NDFIS) has been supported by EPSRC and JANET with UCL as the prime contractor for a consortium comprising the Universities of Bristol, Cambridge and Southampton. It has been set up to enable researchers to develop the underpinning communications technologies for the future internet.
. Moonshot: JANET working on a general solution to federated trust infrastructure which could facilitate access to shared e‐infrastructures.

Concerns
. End‐to‐end performance issues – many users struggle to achieve the required end‐to‐end rates.
. Users uncertain about end‐to‐end security of applications that transfer confidential data over networks.
. Security and authentication schemes can put an administrative burden on non‐expert end‐users.11
. Difficulties in engaging with industry: Uncertainty over industry access to JANET (state aid issues, charging models etc.), information security and IP seen as major blocks to industrial collaboration and usage of shared infrastructure.12

Opportunities
. £4M of e‐infrastructure funding allocated to JANET to develop “Industry connectivity.”
. JANET in dialogue with EPSRC regional centres to understand networking requirements, plus links with industry.
. Discussing network requirements of the RDF and ARCHER.
. JANET looking to create bandwidth on demand services.
. 18 month Moonshot pilots run from April 2013, with European GN3+ pilot to follow.13
. Clear advice on end‐to‐end security could encourage innovation and sharing in UK research; JANET looking to take a more proactive approach.

How does EPSRC currently contribute to this position?
. EPSRC ICT theme supports fundamental research and innovation in Optical Communications and ICT Networks and Distributed Systems. ICT also have a priority theme “Towards an intelligent information infrastructure”.
. A new National Dark Fibre Infrastructure Service (NDFIS) has recently been supported by EPSRC and JANET with UCL as the prime contractor for a consortium comprising the Universities of Bristol, Cambridge and Southampton.
. EPSRC‐funded Regional HPC Centres are in dialogue with JANET about potential networking requirements and links to industrial partners.
. The ARCHER Service is working with JANET on the networking requirements for the upgrade of the Research Data Facility at Edinburgh.

How do other key stakeholders currently contribute to this?

International:
. GÉANT: GÉANT is the pan‐European research and education network that interconnects Europe’s NRENs. Together they connect over 50 million users at 10,000 institutions across Europe. GÉANT supports networking, service development and joint research activities.
. DANTE: DANTE is the managing partner, project co‐ordinator and operator of the GÉANT network.

UK:
. JANET: JANET is the key UK stakeholder in this area, supporting the current JANET network, providing the UK contribution to GÉANT and leading on the implementation of JANET6.

Figure 4: JANET6 multi‐service network architecture diagram

. JANET has the capacity to set up lightpath services (committed capacity point‐to‐point connections) where the case can be made e.g. between ISIS and ILL and between ARCHER and MONSOON.
. JANET has invested £12M for additional fibre access to key sites e.g. Norwich and Hinxton Bioinformatics Cluster.
. JANET also supports test‐beds and platforms for research and development which can be used without affecting the network, e.g. the AURORA dark fibre network used by the optical communications community.
. JANET offer 40+ services for its customer base, including the JANET Research Support Unit which engages with research communities to understand requirements and provide diagnostic assistance and performance advice on both network and end‐system issues.
. £4M of e‐infrastructure funding has been allocated to JANET to develop “Industry connectivity” and relations with the commercial sector. A clear, permissive legal position on the industry use of university networks could encourage collaboration, co‐location and start‐ups, therefore increasing impact. JANET are working with NAG Ltd on a proof of concept looking at the policy and regulatory framework.
. JANET also provide JANET roaming as part of Eduroam, allowing researchers to easily gain network access at other member sites through a single authentication scheme.
. JANET are leading the development of Moonshot, a general solution to federated trust infrastructure building on deployed, proven technology. Moonshot could facilitate wider access to shared infrastructures and collaboration, and provide an avenue for industrial access due to standardisation; a number of case studies in HPC, STFC, Diamond and Cancer Research are underway.
. NES: NES provided access to a broad range of computational and data based resources and helped to pioneer the certification security model.
. Technology Strategy Board
. Commercial providers

Vision for the Future

What do we want the UK landscape to look like in 10 years’ time?
. Connections will enable a safe and secure research environment, allowing researchers to collaborate and use flexible e‐infrastructure, moving seamlessly from and between desk‐top computing, mobile technologies, and high performance compute and storage resources.14, 15
. The UK will have a highly reliable, secure and robust research network, with the flexibility to deal with changes in demand and usage. HEI internal network infrastructure will mirror the external WAN connectivity where appropriate.
. Researchers will be able to easily access the bandwidth that they require to facilitate their research, i.e. bandwidth on demand.
. National facilities will have high capacity, high density connectivity available.
. Dark Fibre Network will be available for researchers to use ‘on demand’ and federated test‐beds will enable international research.
. Flexible and secure network access for industry partners will be available, facilitating data exchange and collaborative research.
. An engaged researcher and funder community will work with JANET to specify future networking requirements and service provision.
. E‐infrastructure will be underpinned with standardised, low maintenance security infrastructures, with the continued promotion and ubiquitous adoption of authorisation and authentication measures, underpinned by ‘single sign‐on’ approaches.

What do we need to do to reach this goal?
. An appropriate long‐term funding model to support the research network in the UK.
. Continue to work as a global partner, active in offering the most advanced networking services, providing links to other continents, contributing to major international research projects. Improved end‐to‐end connectivity and seamless multi‐domain networking, including flexible bandwidth‐on‐demand.
. Appropriate research support services that are available and widely used by the research community.
. Pro‐active involvement of Research Councils in working with JANET to understand and identify their research communities’ networking requirements. For example, identifying new areas of research that may lead to significant increases in bandwidth demand, facilities that require light paths and the need for test beds as a basis for innovative applied research.
. Information on the effect that moving to Cloud computing will have on networking requirements.
. An appropriate policy and regulatory framework to allow industrial access and collaboration over the research network.
. Enhanced connectivity between OEMs and SMEs within their supply chains, HPC hubs and academic institutions with improvements to reach, speed and bandwidth.
. Development of robust authentication and security systems to enable trusted users to utilise shared infrastructures in an open manner, e.g. further development and rollout of Moonshot – enabling single sign‐on federated access to a range of services.
. Development of internationally recognised and understood standards for security and authentication, leading to greater confidence.
. Greater understanding of the security risks identified by organisations (academic, industrial, third sector) and acceptable controls and measures.
. Networking and security case studies, using industrial language to allay fears and concerns.


What are the barriers to achieving this vision?
. End to end connectivity can be limited by local infrastructure and knowledge.
. Restrictions on industry use of networking could negatively affect university‐industry collaboration.
. Threat of cyber‐attacks discourages use of research networks for transferring valuable data.
. Lack of common vocabulary and cultural differences when talking about risk and information security.
. Implications around data security and Intellectual Property not fully understood.
. Lack of realistic risk management within e‐Infrastructure providers and lack of skills to provide the assurances required.
. Authentication not perceived as a priority by most users.

Actions and Recommendations What is EPSRC’s role in the short, medium and long‐term? How does this relate to the EPSRC Strategic Goals? Short term: . Discuss potential networking requirements for ARCHER, PRACE and the Research Data Facility. . Support the new National Dark Fibre Infrastructure Service (NDFIS). . Provide JANET with links to EPSRC Strategic Partners to assist in the development of industrial pilot studies and regulatory framework. . Encourage key EPSRC investments, e.g. the Regional Centres, to participate in Moonshot pilots. . Develop closer links with JANET and cross‐Council colleagues through the RCUK e‐Infrastructure group. . Work with the proposed Cyber‐security working group led by JANET to identify areas of common interest. . Identify how to efficiently use the networks to exploit commercial providers. Medium term: . Discuss potential networking requirements for EPSRC facilities and research communities with JANET. . Discuss funding models for bandwidth on demand with JANET. . Support EPSRC Cyber‐security research as part of the Global Uncertainties theme. . Work with JANET and the EPS research community to ensure that there are clear, well understood guidelines for industrial use of the JANET network. Long term: . Support JANET in the development of the business case for future research networking provision.

What is the role of other stakeholders in contributing to this vision? JANET . Be open and accessible to academia and industry. . Be ahead of demand. . Support a richer set of connectivity services on and off‐net. . Provide customer controlled access to bandwidth. . Support users getting the best performance. . Work closely with Research Councils to understand research communities’ requirements. . Continue linking with other international NRENs and commercial network providers to provide global networking services. . Continue to develop and deliver Moonshot. . Work on developing international standards that will build trust and confidence in authentication and security schemes. . More pro‐active approach for CSIRT.


. New Services being developed to assist e‐Infrastructure service providers in a) securing infrastructures b) providing policy guidance. GÉANT . Continue to work as a global partner. DANTE . Represent the common interests of European NRENs internationally by acting as co‐ordinator, facilitator and envoy in respect of matters concerning inter‐regional connectivity and inter‐regional collaborations. . Enable NRENs to expand their user base. . Leverage the collective purchasing power of the network community.16

What is the relationship with the other e‐infrastructure themes? . Data Infrastructure: Strongly connected, as increased levels of data put pressure on connectivity and bandwidth and lead to greater information security issues. . Integration: Connections can enable the use of shared infrastructure and scientific co‐ordination.

References

6 Trans‐European Research and Education Networking Association. Site Home. November 2013. http://www.terena.org/ [06 December 2013]
7 Trans‐European Research and Education Networking Association. Ibid.
8 GÉANT Expert Group. “Knowledge without Borders: GÉANT 2020 as the European Communications Commons.” European Commission. Brussels, October 2011.
9 Department for Business Innovation and Skills. “Report of the e‐Infrastructure Advisory Group.” London, June 2011. http://www.rcuk.ac.uk/documents/documents/e-IAGreport.pdf
10 Tildesley, Dominic. “A Strategic Vision for UK e‐Infrastructure: A roadmap for the development and use of advanced computing, data and networks.” Department for Business Innovation and Skills. London, July 2011. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/32499/12-517-strategic-vision-for-uk-e-infrastructure.pdf
11 Department for Business Innovation and Skills. “Report of the e‐Infrastructure Advisory Group.” Ibid.
12 E‐Science Leadership Council, Engineering and Manufacturing Working Group. “Developing E‐Infrastructure in the UK’s Engineering and Manufacturing Industries.” Department for Business Innovation and Skills. London, June 2012. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/34674/12-1246-developing-e-infrastructure-in-engineering-and-manufacturing-industries.pdf
13 Moonshot case studies
14 GÉANT Expert Group. “Knowledge without Borders: GÉANT 2020 as the European Communications Commons.” European Commission. Brussels, October 2011.
15 Department of Innovation, Industry, Science and Research. “2011 Strategic Roadmap for Australian Research Infrastructure: Exposure Draft.” Canberra, June 2011.
16 Delivery of Advanced Network Technology to Europe (DANTE). “Dante Strategy 2012‐2015: More for all.” European Commission, 2012. http://www.dante.net/About_Us/Vision_and_Strategy/Pages/Home.aspx


Software Development

Defining the current UK landscape Definition:

Software developed for experimental facilities and instrumentation, modelling and simulation and data‐analysis is a critical and valuable resource. Software and algorithm development represents a major investment in skilled scientists and engineers, and the large suite of codes and algorithms used in research should be regarded as a research infrastructure, requiring support and maintenance along the innovation chain and throughout its lifecycle. It is important for EPSRC to articulate its strategy for investing in software development, ensuring that funding continues to support leading scientific research and key codes used by the EPS community.

What is the international context? What are the key challenges being addressed globally at this time? . Scientific software development supports world‐leading scientific research in the EPS space. Community codes have emerged that allow pooled skills and effort to support cutting edge, reproducible, research. . Scientific software development is needed to maximise the benefit of current and next generation computer architectures. Expertise needs to focus on:

1. Developing appropriate codes for academic and industrial communities, based on new ideas. 2. Maintaining and enhancing existing software capabilities. 3. Allowing community codes to run on high end machines for simulation and data‐intensive computing and transferring important commercial codes to high end machines.

. The International Exascale Software Program (IESP) has developed a roadmap17 identifying the need to integrate, test, maintain, support and develop an integrated collection of software / X‐stack. IESP is also looking to provide a framework to enable the international community to work together. The European Exascale Software Initiative (EESI) has been funded by the EC to build a vision and roadmap to address the challenge of performing scientific computing on the next generation of computers.18 . Energy efficient computing is seen as a key challenge by IESP and EESI; the EESI roadmap suggests that standards to support energy‐aware algorithms should be agreed in 2014/2015, with energy‐aware libraries available in 2016.19 . Co‐design of hardware and software is a way to develop hardware, X‐stack software and applications in an efficient way – it is difficult to balance vendor and academic interests and plans for co‐design centres in USA, Japan and are on hold.20 . Making software more portable between different environments, for example the Cloud, and ensuring that software can be run and used by industry. . Social Programming is becoming a popular tool, enabling open and collaborative development of software resources.


What is the UK’s Current Role and Position?

Pluses
. The current portfolio of UK funding activities has supported a thriving, internationally recognised community of computational scientists and engineers.
. The importance of software development at all levels of the software stack has been highlighted as an area for action in high‐profile reports.21, 22 It needs to be given an equal footing with the purchase of hardware and networking.
. Strong UK involvement in both IESP and EESI.
. Leading involvement in a number of EU/EC funded projects e.g. CRESTA EU project looking at co‐design.
. CCPs and Consortia build networks of users and disseminate new methods and software, encouraging industrial involvement.

Potentials
. ELC activities could lead to new opportunities.
. EPSRC software development Fellowships on offer.
. CCP Forge, a software repository and collaboration tool, has been supported for 4 years – remit now expanded beyond the CCPs, to be more broadly used by EPS community.
. Guidance on software development to be included in EPSRC Pathways to Impact, encouraging planning and resources for software development at start of projects.
. HEC Consortia and CCPs are well positioned to tackle the Exascale Grand Challenges in Engineering and fundamental science identified by EESI.23

Concerns
. Complex and fragmented funding landscape.
. Issues around sustainability and robustness of existing codes and transfer to new computer architectures.
. Capital support is not enough; there is continued need to invest in people and training in software development.
. Lack of recognition of role and value of the software engineer in academia means it is difficult to retain expertise.
. Peer review of software development in projects can be / perceived to be problematic.
. Software provision presents a number of issues to the user base: Cost, complexity and usability of specialist software.24
. Need for open standards for parallel computing across new architectures.

Opportunities
. EPSRC made a long‐term commitment to supporting software in the Software as an Infrastructure Strategy.25
. RCUK E‐infrastructure group will lead to greater co‐ordination of activities.
. Appropriate CSE training is embedded in EPSRC’s new Centres for Doctoral Training.
. Plans to produce report on the Software Engineer with SSI.
. Connecting HPC providers with the TSB Catapult centres as a way of brokering new relationships with industry, including SMEs and providing access to HPC facilities.

How does EPSRC currently contribute to this position? Over the last five years EPSRC has invested approximately £9 million per annum in software development: Shaping Capability . Collaborative Computational Projects (CCPs): 9 CCPs currently supported, each tackling large‐scale scientific software development projects, maintaining and distributing code, and providing training and user support (£2.7M). New EPSRC funding for CCP Flagship software development projects and networking and core support has recently been announced. . HPC Consortia: 7 consortia provide members with allocation of computing resources on the national service to enable research in a certain area of science or engineering (~£3.5M). . Regular Software Development calls: 3 calls over 4 years (£13M), Software for the Future Call 2012/13: (£7M), Software for the Future II Call 2014/15: (£4M) with a further call planned in 2015/16.


. EPSRC network on numerical Algorithms and high performance computing: Providing a focus for new collaborations between numerical analysts, computer scientists and developers and users of software and HPC. . Extreme Computing sandpit: Novel approaches to the development of software to exploit next generation HPC hardware. 3 projects were funded (£2.4M). . EPSRC‐NSF software development projects: 4 proposals supported (£3M). . SeIUCCR: Network to promote wider take up and exploitation of UK e‐infrastructure (£250K). Developing Leaders and Skilled People . NAIS: S and I award investigating algorithms and software methodology underpinning HPC. (£4.5M). . HPC Short Course Centre: Training programme in HPC methods (£300K). . Fellowships: Fellowships in Software Development for Novel Physical Sciences and Engineering Research currently available. 2 early career fellows recently awarded in Engineering with more applications in these two priority areas currently being reviewed. Delivering Impact and User Support: . Computational Science and Engineering (CSE) support: Support to users of national service including embedded CSE (eCSE) and training. EPSRC also supported a university dCSE pilot in 2012/13. . Support for SLA with STFC: Daresbury, provides underpinning support to the CCPs and Consortia. . NSCCS – National service for computational chemistry software: Access to software, compute and training in chemistry. Ensuring Trust: . Software Sustainability Institute: Works with researchers to identify and shape the software considered to be important to research. 2010‐2015 (£4.4M). . CCP Forge: Software repository and collaboration tool ~£1M.

EPSRC has made an on‐going commitment to support software development in the EPSRC Software as an Infrastructure strategy25 and Action Plan.26 An audit and update of the strategy and action plan will be published in 2014. Software Engineering is also supported through the EPSRC ICT theme and a number of relevant CSE platform and responsive mode grants are funded through the other EPSRC themes.

How do other key stakeholders currently contribute to this? During the EPSRC Software for the Future workshop27 a map of current software activities in the UK was created. A diagram can be found in Annex 1 of the workshop report. Below is a summary: International: . G8 Research Councils Initiative on Multilateral Research Funding: Interdisciplinary Program on Application Software towards Exascale Computing for Global Scale Issues ‐ 6 projects funded including UK led NU‐FUSE. . International Exascale Software Program: Developing roadmap and co‐ordination activities. . PRACE: 6 Advanced training centres including Edinburgh. . Other EU/EC: European Exascale Software Initiative, FP7: e.g. CRESTA– project focused on co‐design of hardware and software – UK led, APOS‐EU, Virtual physiological human network of excellence. UK . STFC: Computational Science and Engineering Department at Daresbury develop and apply powerful simulation codes, Hartree Centre, DiRAC, Diamond: DAWN, GDA. . Other RCs: Each runs and supports own activities e.g. BBSRC tools and techniques calls. . RCUK E‐infrastructure Group: Brings together RCs, JISC, JANET, Met Office and TSB to share and co‐ordinate activities. . TSB: Energy Efficient Computing Emerging Technology Theme, KTPs for multicore and parallel processing.


. JISC: Research tools programme, Developing Community Supporting Innovation, Software Hub, support for Neurohub. . E‐Leadership Council: Developing a strategy to provide a world class e‐infrastructure. . HPC SIG: Work to demonstrate the value of HPC facilities, e‐IAUCF. . University level CSE support, Commercial providers, e.g. NAG. . Project Directors Group: Set up to co‐ordinate projects funded through recent e‐infrastructure investment.

Vision for the Future What do we want the UK landscape to look like in 10 years’ time? Shaping Capability . The UK will continue to support a thriving community of computational scientists who are recognised internationally. . Basic science research that underpins software development will be supported, leading to the development of new methods and algorithms. . Sustainable and robust software will be available to support the current and future needs of the EPS community, both academic and industrial. . Strong multi‐disciplinary software development teams will exist, comprising experts from industry, the mathematical and physical sciences, informatics and computational science, together with the domain experts and hardware developers. . Expertise will be focused on enhancing existing software capabilities, allowing community codes to run on high end machines for simulation and data‐intensive computing, transferring important commercial codes to high end machines and the development of new codes based on new ideas for academic and industrial communities. . The community will be working on a number of agreed Grand Challenges with their international counterparts, continuing to lead and participate in European and global projects, for example the development of important exascale codes. Developing Leaders and Skilled People . The value of software development expertise will be recognised by funders, HEIs and academic researchers, and a clearer career path for the software engineer will be developed. A pool of researchers with the required skills and experience to be the code and software developers of the future will be developed. . Students and researchers will have the required training to take full advantage of the available computational resources. Delivering Impact and User Support . Open innovation will be a key enabler, allowing collaborative development of software projects. . The impact of software investments will be maximised by continued software engineering and community code development support, leading to robust, reliable and sustainable software. . Industry will be more engaged with existing UK E‐infrastructure, due to the development of application driven software. Ensuring Trust . Software and code will be increasingly portable as a result of established standards, validations and usable interfaces. . Best practice on software provenance, testing and security will be developed and disseminated. . Researchers will share software through resources such as Opensource.


What do we need to do to reach this goal? Planning for the Future Analyse the impact of current and past software investments to demonstrate their value, making the case for increased levels of recurrent investment in development projects, CSE support and people and training. Shaping Capability . Develop and re‐engineer existing code in key areas for current and new architectures, encouraging code consolidation where appropriate. Develop novel code based on new ideas for academic and industrial communities in key areas. . Invest in tools that help software and hardware developers to work together to co‐design hardware and software. . Develop a number of cross‐cutting grand challenges with international partners that are relevant to both industry and academia. Provide support for collaborative and cross‐disciplinary team working in these areas. Developing Leaders and Skilled People . Encourage the continued provision of CSE training for researchers in academia and industry, enabling professional development of skills. . Promote best practice and embed CSE training in successful Doctoral Training Centres, encouraging wider usage. . Support the development of a clear career path for developers and researchers in this domain, from PhD onwards, developing measures and success features that Universities and Research Councils can use to recognise innovation in software provision. Delivering Impact and User Support: . Long‐term support for software maintenance and development. . Provide broad access to the infrastructure for industrial partners, suppliers and Independent Software Vendors (ISVs), as well as the academic community, potentially via on‐ramps. . Develop e‐Infrastructure Expertise Teams to allow businesses in the UK engineering and manufacturing sector access to critical software and applications in the short term. . Develop longer‐term engagement with industrial strategic partners, to understand their different e‐infrastructure needs and requirements. . Develop joint funding models with Industry to sustainably support software through “Follow‐on funding” type activities. Ensuring Trust . Develop standards and best practice. Provide peer review guidance on importance of software outputs. . Work with the international community to define standards for ‘scientific software.’ . Encourage software developers to understand the uncertainties in their models and in the experimental data that is used for validation. . Incentivise researchers to share software outputs.

What are the barriers to achieving this vision? . Lack of co‐ordination between funders. Lack of peer review guidance on importance of software outputs. . Lack of recurrent funding to support people as the development of expertise and training is key to this theme. . Lack of suitably skilled people. . Lack of involvement of industrial partners. . Licensing issues and implications with Independent Software Vendors.

Actions and Recommendations What is EPSRC’s role in the short, medium and long‐term? How does this relate to the EPSRC Strategic Goals? Short term: . Support more effective multi‐disciplinary networking and collaboration by working with current investments. . Work with ELC, SSI and others on the development of a training marketplace and a report on the Research Software Engineer. . Continue to support early career fellowships in software development, advertising the opportunity more widely. . Ensure that appropriate training and mechanisms for disseminating software best practice are embedded in the new EPSRC Centres for Doctoral Training. . Ensure that EPSRC funded CSE support is providing value for money. . Plan workshops looking at the impact and evolution of specific codes. Use this opportunity to ask “How should we use community codes, how do we evolve them and exploit them?” Identify and disseminate best practice for software development and exploitation (CCP case studies/SSI/dCSE outcomes). . Provide support to Regional HPC centres, to facilitate relationship building with SMEs and industry. . Improve peer review of software development, embedding guidance in the EPSRC Pathways to Impact guidance. . Issue further software development calls and refresh the CCP portfolio. Medium term: . Follow up on the outputs of the 2009 Applications/Algorithm road mapping activity. . Develop further joint funding models with international partners e.g. the US and EU partners. . Develop joint funding models with industry and TSB. . Support further University CSE provision. . Develop plans to incentivise researchers to share software outputs. Long term: . Depending on the outcome of the next CSR, look at ways to increase investment in software development. . Support the development of a clear career path for developers and researchers in this domain, providing appropriate support for potential and established leaders. . Provide CSE support as part of the national HPC service.

What is the role of other stakeholders in contributing to this vision? International . Horizon 2020 . IESP and EESI: Taking the roadmaps developed through these initiatives and developing next steps, for example action plan for the Grand Challenges identified by EESI. . PRACE: Developing the case for exascale facilities. . ESFRI: Roadmap for development of EU research infrastructure. UK . RCUK: Working to co‐ordinate activities where possible through the RCUK E‐Infrastructure group, look to develop business cases for both capital investment and recurrent government funding in e‐infrastructure. Organising cross‐council Strategic conference to understand cross cutting issues and challenges. . Software centres of expertise: Providing guidance and training on the use of software in research and the issues facing the people who develop that software. . E‐Leadership Council: Provide strategic advice and input to BIS. . Research Community: Supporting groups to promote and disseminate best practice and recommendations in software development areas such as training, research software engineer, AAAI, HPC and cloud computing.

What is the relationship with the other e‐infrastructure themes? . Capability: There is a very strong connection with people and skills due to the expertise and training requirements. . Hardware and Compute: There is a strong relationship with hardware theme. . Data Infrastructure: There is a relationship with data, in terms of producing new software for data intensive research.

References

17 Dongarra, Jack et al. “The International Exascale Software Project Roadmap.” The International Exascale Software Project. 2010. http://www.exascale.org/mediawiki/images/2/20/IESP-roadmap.pdf
18 Ricoux, Philippe. “European Exascale Software Initiative: Home Page.” 10 December 2013. http://www.eesi-project.eu/pages/menu/homepage.php [10 December 2013]
19 Michielse, Peter and Patrick Aerts. “Report on International Activities.” European Exascale Software Initiative Consortium. 30 November 2011. http://www.exascale.org/mediawiki/images/b/bb/EESI-D2_3-report-on-international-activities-16112011.pdf
20 Michielse, Peter and Patrick Aerts. Ibid.
21 Tildesley, Dominic. “A Strategic Vision for UK e‐Infrastructure: A roadmap for the development and use of advanced computing, data and networks.” Department for Business Innovation and Skills. London, July 2011. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/32499/12-517-strategic-vision-for-uk-e-infrastructure.pdf
22 Department for Business Innovation and Skills. “Report of the e‐Infrastructure Advisory Group.” London, June 2011. http://www.rcuk.ac.uk/documents/documents/e-IAGreport.pdf
23 2011 Exascale Grand Challenges in Engineering and fundamental science
24 E‐Science Leadership Council, Engineering and Manufacturing Working Group. “Developing E‐Infrastructure in the UK’s Engineering and Manufacturing Industries.” Department for Business Innovation and Skills. London, June 2012. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/34674/12-1246-developing-e-infrastructure-in-engineering-and-manufacturing-industries.pdf
25 Engineering and Physical Sciences Research Council (EPSRC). “Software as an Infrastructure Strategy.” EPSRC. Swindon, February 2012. http://www.epsrc.ac.uk/newsevents/news/2012/Pages/softwareinfrastructurestrategy.aspx
26 Engineering and Physical Sciences Research Council (EPSRC). “Software as an Infrastructure Strategy: Action Plan.” EPSRC. Swindon, February 2012. http://www.epsrc.ac.uk/SiteCollectionDocuments/other/SoftwareAsAnInfrastructureActionPlan.pdf
27 Engineering and Physical Sciences Research Council (EPSRC). “Software for the Future: 31 May 2012.” EPSRC. Swindon, 31 May 2012. http://www.epsrc.ac.uk/SiteCollectionDocuments/other/OutcomesofSoftwarefortheFutureworkshopMay2012.pdf


Data Infrastructure

Defining the current UK landscape Definition: The explosion of data production and availability is transforming all areas of society. All scientific disciplines face the enormous challenge of retaining data in a format that is usable and easily accessible, yet secure, for future use. EPSRC supports research in data science and big data analytics, leads cross‐council activities in digital economy and cyber security and provides infrastructure to store, manage and process data. Here we focus on EPSRC’s role in supporting the infrastructure required by the Engineering and Physical Sciences communities, including data storage capacity and accessibility, management and security. The research challenges, skills and capability requirements of big data analytics are currently being discussed by an EPSRC working group in connection with other research councils and TSB.

Figure 4: Definition of Data Infrastructure

What is the international context? What are the key challenges being addressed globally at this time? . Global digital data expanded from 130 exabytes in 2005 to 1,227 exabytes in 2010 with a predicted rise to 7,910 exabytes by 2015. . A new scientific methodology driven by data‐intensive problems is currently emerging.28 . Data sets are being constantly generated from experiments, computer simulation/modelling, the internet (including social media), and mobile devices. . Many research questions involve the analysis and interpretation of big data‐sets. Handling and analysing “Big Data” is the largest software challenge facing research consortia.29


. Important issues need to be addressed on ownership, curation and management, analysis and interpretation, standardisation and usability, anonymisation and data protection. . Digital data can be stored, shared, searched, combined and duplicated with extraordinary speed at comparatively low cost. Importantly, it is accompanied by large amounts of highly descriptive metadata. . The public is increasingly concerned about the release of personal data. . The E‐Infrastructure Advisory Group identified a pressing need for guidance and dialogue on “Big Data.”30
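
The global data volumes quoted in the international context above (130 exabytes in 2005, 1,227 in 2010 and a predicted 7,910 by 2015) can be turned into implied annual growth rates. The following is a minimal illustrative sketch, assuming smooth compound growth between the quoted years; the function name and the dictionary are hypothetical and are not part of the roadmap.

```python
def annual_growth_rate(start, end, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted in the text, in exabytes of global digital data.
# The 2015 value is a prediction, not a measurement.
volumes = {2005: 130, 2010: 1227, 2015: 7910}

for first, last in [(2005, 2010), (2010, 2015)]:
    rate = annual_growth_rate(volumes[first], volumes[last], last - first)
    print(f"{first}-{last}: roughly {rate:.0%} growth per year")
```

Under this assumption the quoted figures correspond to growth of roughly half as much again each year, which gives a sense of the scale of storage and networking demand behind the challenges listed above.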

What is the UK’s Current Role and Position?

Pluses
. The importance of data is recognised in the UK’s e‐infrastructure for Research and Innovation report.31 Data is core to the action plan.
. UK has some of the world’s most complete life sciences, healthcare, social, environmental and food security data depositories.32
. Strong UK skills base in data analytics, computer science and data management.
. Strong open data movement and track record of making public data available; more than 6000 data sets released through http://www.data.gov.uk.

Potentials
. Increasing availability of large datasets for research.33
. Open Access Agenda: Scope for innovative sharing mechanisms, e.g. government project: ‘Making Public Data Public’ – platform to develop new technologies and services.
. New ways of doing science emerging as computational and communications technologies enable massive datasets to be assembled, explored and shared.34
. New big data related businesses could create as many as 58,000 new jobs until 2017.35

Concerns
. UK behind US, Japan and some European countries in showing coherent national initiatives to respond and drive the “data‐revolution”.36
. Many research communities are ill‐prepared for management of large data volumes.
. Archives need to be more than just data stores if they are to serve a useful future purpose. Data has to be stored in a standardised format, and actively curated.
. Funding mechanism may not reflect real cost/importance of properly managed data repositories.37
. Protective attitude of researchers towards “their data”.
. Long‐term preservation and integrity: lack of format standardisation jeopardises future usage.
. Automatic generation and propagation of meta‐data.
. Access to data: need virtual links to avoid moving data.
. Lack of understanding of open access licensing options.38

Opportunities
. To optimise the value of research data, we need an infrastructure that enables access, interoperability, citation of providers; assures quality and provenance and has users’ confidence – i.e. a national data repository network with common standards and services.
. New software/skills required to realise full potential of Big Data.
. Scope for Research Councils to work together.
. Data, computation, software and capability need to be treated as interacting entities.
. Rapid growth of intelligent search capabilities and machine‐to‐machine services.
. Scope for an enhanced dialogue between academia and industry regarding each other's requirements and perceived tensions regarding confidential data and the RCUK open data policy.


How does EPSRC currently contribute to this position? . EPSRC Research Infrastructure Theme is the lead funder of the Research Data Facility (RDF) for HPC related data, co‐located with ARCHER. After the current expansion phase, the RDF will have a capacity of ~20 PB of disk space with backup on tape, pre‐/post‐processing clients, improved meta‐data storage and network connection, a user friendly management system and a disaster recovery facility. Part of the RDF will be available to HPC researchers from all scientific areas. . Databases is a research area supported through the ICT theme which EPSRC aims to maintain in terms of funding. . “Data, information and knowledge” and “cloud computing” are both sub‐themes in the RCUK Digital Economy programme – potential infrastructure needs: MTurk (crowd‐sourcing). . EPSRC supports the Academic Centres of Excellence in Cyber Security Research. . EPSRC has supported a number of relevant Centres for Doctoral Training in the recent call, e.g. the EPSRC Centre for Doctoral Training in My Life in Data. . EPSRC has recently published a policy framework on research data which is aligned with an agreed set of RCUK principles: “EPSRC‐funded research data is a public good produced in the public interest and should be made freely and openly available with as few restrictions as possible in a timely and responsible manner (guiding principles allow legal, ethical and commercial constraints on data release). Sufficient metadata should be recorded to allow understanding of further potential and re‐use of the data. It is appropriate to use public funds to support the preservation and management of publicly‐funded research data.”

How do other key stakeholders currently contribute to this?

International
. European Grid: National Grid Service is UK’s partner.
. PRACE: potential role for RDF in secure data storage.
. Social media, software and internet service companies (Google, Facebook, Microsoft, Apple).
. Publishing industry: Open access policies.
. Think tanks and non‐profit organisations: Big Innovation Institute, and Open Data Institute, campaign for open data.
. EU: FP7 Capacities Specific Programme focusing on scientific data infrastructures.
. International initiatives such as DataCite, Orcid, Research Gate, EUDAT, ELIXIR.

UK
. Government: The Eight Great Technologies, Shakespeare report calling for National Data Strategy39, Open Data White Paper,40 Data Strategy Board to advise ministers on release of public data, Information Economy Council data strategy.41
. TSB: Enabling communication and information technologies, funded CDEC (Connected Digital Economy Catapult).
. Industry: Specific requirements especially with respect to data security and long term integrity.
. HEIs: Several universities (for example UCL, Bristol and Southampton) have invested in petascale research data centres. Concerned about on‐going cost of data curation and look for regional alliances and partnerships as well as guidance from the research councils.42
. ESRC: Capital funding 2013 ‐ Business Datasafe (£14M), Understanding Populations (£14M), Administrative Data Research Network (£34M).
. BBSRC: ‐omics data; new Bioscience Facility in Cambridgeshire on bioinformatics data, co‐funded by EMBL‐EBI.
. MRC: 2010 Big Data Strategy Workshop; Research initiative E‐Health informatics research, UK Biobank.
. NERC: 2013 capital investment in Big Data and robotics ‐ £13M; stakeholder in RDF.


. NERC/Met Office: joint data analysis facility JASMIN for data from Earth systems modelling.
. STFC: Stakeholder in RDF, large datasets by astronomers, particle physicists, GRIDPP.
. HEFCE: Supporting the Li Ka Shing Centre for Health Information and Discovery at Oxford University.
. Charities et al.: e.g. Sanger Institute, JISC (digital repositories, shared services), UK Data Archive.

Vision for the Future

What do we want the UK landscape to look like in 10 years’ time?
. Data intensive research will be supported by the requisite infrastructure and will play a major role in the UK continuing to be at the forefront of Physical Sciences and Engineering research.
. The UK will be at the forefront of research and development in data science and analytics.
. The UK’s data infrastructure will be well managed and curated, offering a treasure trove of valuable information to generate new knowledge.
. Citizens will interactively use and provide data for new communal services in Smart Cities.
. Data will increasingly come from non‐traditional sources, including crowd sourcing and social media.
. The Cloud may take on an increasingly integrative role in allowing data to be used in an agile and flexible way.
. New techniques will be developed to extract knowledge from data, and the infrastructure will be configured to promote and support this. New software to manage, curate and analyse data will be developed and supported.
. Students and researchers will have the appropriate training and skills to exploit available data sets.
. Researchers will openly share and reuse data; open access publication will be the norm (online publishing in an open access archive ‐ see for example astrophysics).
. Crowdsourcing will be ever more important in obtaining and dealing with large amounts of data – the “internet of things” will gather sensor data from connected devices across the globe.
. Existing long term preservation and integrity issues will be resolved.

What do we need to do to reach this goal?
. Provide further storage capacity for ever increasing data intensive research outcomes (RDF, regional centres, cloud).
. Pre‐/post‐processing capability must be directly linked to storage capacity.
. Make sure the stored data is secure, well managed and accessible (e.g. invest in smart tertiary storage and cloud computing). Issues regarding long term data integrity and accessibility must be universally recognised and actively worked on.
. Ensure that networks (e.g. JANET) are able to handle the level of data required for universal access to data.
. Encourage standardisation of data formats to ensure that stored data can be accessed and reused.
. Actively support open data policy to incentivise researchers and businesses to share data – get publishers on board (open access).
. Invest in the development of cutting‐edge skills and capability in exploiting data and turning it into useful information.
. Provide data management professionals in academia with a better career structure.
. Incentivise closer cooperation between data experts and application specialists.
. Raise awareness of available data storage and expertise and foster potential synergies between academia and the private sector.

What are the barriers to achieving this vision?
. People’s attitudes towards data curation and data sharing – people need to appreciate the value of their data.


. Insufficient techniques and skills for extracting knowledge from data.
. Businesses fear loss of competitive advantage upon data sharing.
. Open access publication is very expensive.
. Public scandals undermine trust in data repositories.
. Anonymisation of data poses challenges.
. Data protection laws still relying on 1980 OECD guidelines.
. Data storage and compute capacity often scattered, limiting data transfer speed to the available bandwidth.

Actions and Recommendations

What is EPSRC’s role in the short, medium and long‐term? How does this relate to the EPSRC Strategic Goals?

Short Term
. Enhance the Research Data Facility functionality and make it more user‐friendly to attract a wider user base and increase capacity. The open usage policy should also encourage this.
. The extended Research Data Facility will allow faster access to data through increased bandwidth to clients and an improved external network via JANET. It will be directly linked to a pre‐/post‐processing cluster for primary data visualisation and analysis.
. Encourage the HPC Regional Centres to share storage facilities, skills and expertise in analysing and interpreting data.
. Actively support open data policies and open access publishing.
. Participate in dialogue on data infrastructure requirements and questions regarding security vs. accessibility with user base through regular town meetings.
. Work with other EPSRC themes (e.g. Digital Economy, ICT) and the RCUK e‐infrastructure group to co‐ordinate data infrastructure requirements, including security, accessibility and management of data.

Medium Term
. Provide training possibilities (e.g. as part of Centres for Doctoral Training) to ensure that the software development and data interpretation skills base in the UK remains cutting‐edge.
. Improve remote access to data.
. Create a national network of centres for Big Data Analytics.

Long Term
. Review EPSRC’s alignment with TSB priorities in enabling information and communication technologies.
. Work with the Digital Economy theme to review infrastructure requirement for MTurk (crowd sourcing), e.g. Amazon.
. Work with the community to ensure a healthy data infrastructure in the UK, including solutions for long term data integrity and accessibility.

What is the role of other stakeholders in contributing to this vision?

UK
. Government: Take forward a clear, predictable, accountable ‘National Data Strategy’39: twin‐track to release data early, but store final high quality data as publicly accessible National Core Reference Data (transparency AND quality).
. BIS capital funding allocation: £189M into ‘Big Data and energy efficient computing.’
. Other Research Councils: Investing in e‐infrastructure – e‐infrastructure for Biosciences, JASMIN for climate and earth system modelling, Administrative Data Research Centres, Business Data Safe, Digital Transformations in Arts and Humanities, Medical Bioinformatics.


. Due to the nature of their communities’ research, STFC/NERC/BBSRC might have an even stronger need for super‐large data handling abilities.
. Research Community: Raise awareness of importance of data research through school outreach programmes.

What is the relationship with the other e‐infrastructure themes?
. Hardware and Compute: Processing clusters need to be integrated with storage facilities.
. Software Development: Software is needed to analyse, manage and curate data.
. Capability: Identified skills gap for data interpreters.
. Connections: Fast and reliable access is a key requirement to make effective use of stored data.

References

28 Hey, Tony, Stewart Tansley and Kristin Tolle. The Fourth Paradigm: Data‐Intensive Scientific Discovery. Microsoft Corporation, 2009. http://research.microsoft.com/en‐us/collaboration/fourthparadigm/4th_paradigm_book_complete_lr.pdf

29 Yates, J., N.C. Hong, H. Dhanoa and P. Lewis. “National E‐Infrastructure Survey Report.” Department for Business, Innovation and Skills. London, June 2012. http://www.bis.gov.uk/assets/biscore/science/docs/d/12‐1246‐developing‐e‐infrastructure‐in‐engineering‐and‐manufacturing‐industries

30 Department for Business, Innovation and Skills. “Report of the e‐Infrastructure Advisory Group.” London, June 2011. http://www.rcuk.ac.uk/documents/documents/e‐IAGreport.pdf

31 Department for Business, Innovation and Skills. “Delivering the UK’s E‐Infrastructure for Research and Innovation.” London, July 2010. http://www.rcuk.ac.uk/documents/research/esci/e‐Infrastructurereviewreport.pdf

32 Tildesley, Dominic. “A Strategic Vision for UK e‐Infrastructure: A roadmap for the development and use of advanced computing, data and networks.” Department for Business, Innovation and Skills. London, July 2011. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/32499/12‐517‐strategic‐vision‐for‐uk‐e‐infrastructure.pdf

33 Department for Business, Innovation and Skills. “Report of the e‐Infrastructure Advisory Group.” London, June 2011. http://www.rcuk.ac.uk/documents/documents/e‐IAGreport.pdf

34 Boulton, Geoffrey et al. “Science as an open enterprise: open data for open science.” The Royal Society Centre. Report 02/12. London, June 2012. http://royalsociety.org/uploadedFiles/Royal_Society_Content/policy/projects/sape/2012‐06‐20‐SAOE.pdf

35 Mohamed, Shehan, Osman Ismail et al. “Data equity: Unlocking the value of big data.” Centre for Economics and Business Research Ltd. London, April 2012. http://www.sas.com/offices/europe/uk/downloads/data‐equity‐cebr.pdf

36 “Strategic Case for a National Network of Data Analytics Centres”, EPSRC

37 Backway, Prue, Office for Science and Innovation. “Developing the UK’s E‐infrastructure for Science and Innovation: Report of the OSI Working Group.” London, January 2006. http://www.nesc.ac.uk/documents/OSI/report.pdf

38 Creative Commons. “Creative Commons: About the Licenses.” http://creativecommons.org/licenses/ [06 December 2013]

39 Shakespeare, Stephan. “Shakespeare Review: An Independent Review of Public Sector Information.” Department for Business, Innovation and Skills. London, May 2013. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/198752/13‐744‐shakespeare‐review‐of‐public‐sector‐information.pdf

40 Maude, Francis. “Open Data White Paper: Unleashing the Potential.” London, June 2012. http://data.gov.uk/sites/default/files/Open_data_White_Paper.pdf

41 Department for Business, Innovation and Skills. “Information Economy Strategy.” London, June 2013. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/206944/13‐901‐information‐economy‐strategy.pdf

42 Yates, J., N.C. Hong, H. Dhanoa and P. Lewis. “National E‐Infrastructure Survey Report.” Department for Business, Innovation and Skills. London, June 2012. http://www.bis.gov.uk/assets/biscore/science/docs/d/12‐1246‐developing‐e‐infrastructure‐in‐engineering‐and‐manufacturing‐industries


Hardware and Compute

Defining the current UK landscape

Definition: Hardware and compute covers the computing hardware used to carry out research (modelling, simulation, data analysis and visualisation). It ranges from desk‐top machines, through lab/departmental machines, university and regional systems, to the national HPC service and international machines (such as PRACE or DoE machines in the US). It also includes Cloud services, both commercial and academic.

What is the international context? What are the key challenges being addressed globally at this time?
. US, Japan and China have made major investments in large machines for a variety of political/scientific reasons. This has had the by‐product that the hardware companies in these countries are well‐supported. Germany remains the key benchmark country in Europe in terms of power of computer systems.
. PRACE: Discussions on‐going about the next phase of PRACE. Possibility of creating a consortium of countries to purchase a shared Tier 0 machine.
. Main EC drivers are around strengthening/developing the supply‐side and business innovation, rather than HPC as a research tool per se: ETP4HPC roadmap and research agenda, and conclusions of EU Competitiveness Council.
. Architecture road‐map is as follows:
  . x86 architecture likely to remain the standard for CPUs. Multi‐core offers challenges for existing codes.
  . Accelerators: increasing use but cause significant challenges for most existing codes.
  . New entrants: heterogeneous (CPU/accelerator), ARM.
  . Co‐processors (such as Intel Xeon Phi) may well provide an alternative with better user adoption (ease of use and standard paradigm ‐ OpenMP).
  . ‘System on a chip’ (SOC) such as AMD’s APU, NVIDIA Tegra.
. Energy efficiency/power usage. In hardware, drawing on issues from mobile computing.
. Specialist/tailored machines, where the architecture ‘matches’ the application code requirements, versus general purpose machines.
. Exascale is driving hardware innovation by trying to deliver performance within a given timeframe and power envelope, such as new memory technologies, optical interconnects, scaling I/O subsystems. All these issues will have implications for software.
. Cloud is increasingly used by both ‘non‐HPC’ and HPC researchers.


What is the UK’s Current Role and Position?

Pluses
. UK has a well‐populated ecosystem across the levels of the Tier pyramid.
. National system at appropriate level internationally.
. UK active in development of future PRACE model (consortium).
. UK companies (both users and suppliers).
. Leadership position in computational science in some areas.

Potentials
. A subset of UK researchers need access to the largest machines.
. Most researchers will have their requirements met by national, regional and local machines.
. Increased innovation in the scale and complexity of modelling and simulation that are possible.
. Specialist/tailored machines.
. Use of Cloud in research.

Concerns
. Integration between levels across UK still immature. Role of new regional centres still developing.
. Sustainability and predictability of infrastructure not well‐addressed. Irregular bursts of capital, no long‐term investment plan.
. Need to understand the business models – selling cycles may not be sufficient.
. Ensuring we have the running costs and people with appropriate skills to run the systems.
. Not good at demonstrating impact of investment.

Opportunities
. Integration across the ecosystem: global ‘AU’ currency for users (see Connections).
. Partnerships with other countries (as well as PRACE).
. Co‐design activities with UK and international companies on new architectures, for example equivalent to Mont Blanc project.
. Development of good set of KPIs for science outputs and impact.
. Joint Government/Industry funded large scale HPC facilities dedicated to industrial research and exploitation of HPC, meeting the requirements of both SME and Large organisations.

How does EPSRC currently contribute to this position?
. EPSRC is the UK representative on PRACE and administers the peer review of UK applications to the Distributed European Computing Initiative (DECI).
. EPSRC co‐ordinates the business case and procurement of the UK national service (ARCHER) in partnership with NERC, and is the managing agent of the national service (ARCHER) on behalf of the partner Research Councils.
. EPSRC supports the costs of the usage of HPC facilities and Cloud on grants.
. EPSRC distributes access to the national HPC service for the EPS community using a variety of mechanisms, including the High End Computing Consortia and the Resource Allocation Panel (RAP).
. EPSRC provided the initial capital investment to support the 5 Regional HPC centres.
. Provides funding for small equipment which forms part of the Tier‐2 layer / base of ecosystem.
. In collaboration with JISC, EPSRC has supported a number of Cloud Pilot projects to explore and develop new cloud computing technologies. A cost analysis study has also been supported.
. EPSRC currently chairs the RCUK E‐infrastructure group.

Figure 5: Computing provision supported by EPSRC


. EPSRC provides direct input to the national strategy, holding an advisory role to government departments and making significant contributions to recent reports on the E‐infrastructure eco‐system.
. EPSRC defines and funds training and technology support programmes.

How do other key stakeholders currently contribute to this?

International
. PRACE: PRACE (Partnership for Advanced Computing in Europe) looks to enable high impact scientific discovery and engineering research and development across all disciplines to enhance European competitiveness for the benefit of society. They provide support for training and HPC access.
. DECI: As part of PRACE, DECI (Distributed European Computing Initiative) enables European researchers to obtain access to the most powerful national (Tier‐1) computing resources in Europe regardless of their country of origin or employment and to enhance the impact of European science and technology at the highest level.
. US: Individual researchers have links / access to US machines, for example through the INCITE programme.

UK
. Other RCs: Provide specialist machines to their communities, for example STFC supports DiRAC and NERC supports MONSOON.
. Other HPC providers: Universities (who work together through HPC‐SIG), regional centres, HPC Wales, Hartree centre. AWE and Met Office are major HPC users, with their own in‐house systems.
. TSB: key focus on supporting the UK supply industry. In addition, Catapult Centres are focused on developing links with SMEs and may be involved in providing links to existing HPC facilities.
. Industry users: Collaboration with academic groups. Some large companies also have their own in‐house computing resources.

Vision for the Future

What do we want the UK landscape to look like in 10 years’ time?
. The UK will have a computing ecosystem that is:
  . Appropriately populated.
  . Balanced and integrated across the tiers.
  . Regarded as amongst the best in the world amidst stiff competition.
  . Responsive, to meet the needs of internationally competitive science.
. Increased numbers and range of users (both academic and industrial) will be doing cutting‐edge, internationally competitive computational science and engineering research.
. The UK will have strong involvement in and links with international e‐infrastructure (as the country will not be able to meet the costs alone).
. A long‐term investment plan for both capital and recurrent running costs will have been developed.
. Researchers will be engaged appropriately with infrastructure development. This may mean that new models for user engagement need to be employed in order to ensure that the users of today and tomorrow are involved in this process.

What do we need to do to reach this goal?
. Continuous and pragmatic technology watch and user engagement.
. Ensure that UK researchers have access to new technologies and new architectures so that they can be prepared for future hardware changes.
. Ensure that the UK has an integrated business plan for hardware, which focuses on long‐term sustainability and covers recurrent spending and support.
. Ensure that the UK’s ecosystem is structured and managed to allow users to move between the levels easily.

. Ensure strong links with international partners and national centres, for example through the movement of people.

What are the barriers to achieving this vision?
. Cost.
. Cultural issues in a diverse constituency:
  . I need a different machine to you for my research.
  . I must have my own.
  . I can’t share it with anyone.
  . My machine is bigger than yours.
. Procurement processes are not very agile or flexible.
. Future hardware changing rapidly and radically. Diversity of hardware can be a positive driver for software development: need to be able to run the same software on a range of different environments. Implies software as a service, machine virtualisation.
. The development of better models for industrial engagement and usage, including the range of different requirements arising from collaborative research, pre‐competitive in‐house research and commercially sensitive research.

Actions and Recommendations

What is EPSRC’s role in the short, medium and long‐term? How does this relate to the EPSRC Strategic Goals?

Short Term
. Maintain links with other elements in the eco‐system (HPC‐SIG, regional centres, DiRAC etc.) using the existing forums.
. Continue UK involvement in PRACE at current levels, keeping a watching brief on future developments. Explore the potential of a consortium model and ensure a coherent development of Tier‐1 services.
. On‐going management of national service to high standards, ensuring a wide range of users and maximum usage.
. Develop and populate an impact framework for the national service, and put in place a process for gathering the evidence.

Medium Term
. Fund a ‘future architectures’ study which includes getting access to appropriate hardware pilots/test beds. The precise specification and requirement for such a study will be developed in consultation with the community to ensure that previous Architecture Comparison Exercises are built on.
. Develop closer links with the US, to explore potential for greater collaboration and access for UK researchers.
. Catalyse links between TSB and EPSRC’s ICT programme to ensure UK companies and researchers are benefiting from any ETP opportunities in co‐design and other R&D activities.

Long Term
. Ensure UK has an awareness of the technology roadmap into the future.
. Work with other stakeholders in the ecosystem to develop a business case for the whole ecosystem in an integrated way.
. Maintain competitive provision across an appropriate range of architectures and functionalities.


What is the role of other stakeholders in contributing to this vision?
. Hartree: Potential for collaboration on hardware testbeds.
. Suppliers: Involvement in ETP4HPC via TSB? Provision of test beds? Commercial suppliers of Cloud.
. HEIs: HPC‐SIG key partners in moving towards an integrated ecosystem.
. Regional and specialist: HPC Wales, regional centres: interactions with SMEs; DiRAC: co‐design and integration example; Met Office: potential partnership.
. Academic and industrial users: Engagement in requirements and specification process. Help in understanding industrial usage and needs.

What is the relationship with the other e‐infrastructure themes?
. Software Development: Co‐design agenda, software compatibility.
. Data Infrastructure: There needs to be a close link between flops and bytes – both in terms of network connections and co‐location to enable rapid data transfer, and technical compatibility.
. Capability: If you don’t have skilled people the hardware will not be used appropriately.
. Connections: Integration of access to hardware requires systems to be aligned and connected through networks.

References

47 Yates, J., N.C. Hong, H. Dhanoa and P. Lewis. “National E‐Infrastructure Survey Report.” Department for Business, Innovation and Skills. London, June 2012. http://www.bis.gov.uk/assets/biscore/science/docs/d/12‐1246‐developing‐e‐infrastructure‐in‐engineering‐and‐manufacturing‐industries

48 Department for Business Innovation and Skills “Report of the e‐Infrastructure Advisory Group.” London, June 2011. http://www.rcuk.ac.uk/documents/documents/e‐IAGreport.pdf

49 Tildesley, Dominic. “A Strategic Vision for UK e‐Infrastructure: A roadmap for the development and use of advanced computing, data and networks.” Department for Business Innovation and Skills. London, July 2011. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/32499/12‐517‐strategic‐vision‐for‐uk‐e‐infrastructure.pdf

50 Council of the European Union. “Conclusions on ‘High Performance Computing: Europe’s place in a Global Race’: 342nd Competitiveness (Internal Market, Industry, Research and Space) Council Meeting.” Council of the European Union. Brussels, 30 May 2013.


Integration and Coordination

What needs integrating?

We need to consider how best to integrate:

. Vertically across the eco‐system, so users have easy access to the type of e‐infrastructure they need;
. Horizontally across the different elements. For practical purposes, the elements of e‐infrastructure, plus the people/skills/knowledge aspects have been treated separately in this roadmap. However, as the diagram indicates, it is important to ensure a balance across all the elements when planning investments, so that there is a coherent whole;
. Across the different research communities and the different stakeholders.

Our aspiration is for the UK to have an integrated e‐infrastructure: one that is run and managed as a whole without silos or boundaries, where there are simple processes by which users can get access to the e‐infrastructure they need across the eco‐system, as appropriate for the type or stage of research they are doing. Obviously, EPSRC is not directly responsible for the whole ecosystem, but we will exert our influence where possible to help achieve this aim.

The EPS research community is a very diverse one. This provides excellent opportunities for multi‐disciplinary research, but also raises the challenge of ensuring that we have a good interaction with and understanding of the very wide range of research fields that may require e‐infrastructure. In addition, we need to take into account the requirements of different types of researchers (academic users from students to senior professors, industrial users from a range of sectors). We will think imaginatively about new ways to help us to reach e‐infrastructure users, in particular new users from different research fields, and young researchers.

EPSRC will work closely with other e‐infrastructure providers (e.g. universities, international, commercial cloud providers) to ensure that EPS users have the appropriate e‐infrastructure for their research, across the pyramid. EPSRC will also continue to develop collaborative relationships with international providers of e‐infrastructure to enable UK researchers to gain access to a wide range of e‐infrastructure and to work collaboratively with researchers in other countries.

Whatever strategy and tactics EPSRC adopts for e‐infrastructure need to be related to and linked with the wider context; acting unilaterally may or may not give the optimal outcome for our researchers. We need to coordinate with colleagues in the other Research Councils and other stakeholders, through the RCUK e‐Infrastructure Group, in order to reach the goal of an integrated e‐infrastructure across the UK. This group has the following terms of reference:

. To maintain an overall research and training strategy for e‐infrastructure, with coordinated implementation plans. This would be updated annually to ensure it remains relevant and is in place ready for each spending review.
. To develop the necessary business cases for e‐infrastructure investment.
. To share information on member‐specific activities in e‐infrastructure.

Through this group, we also have links with the e‐Infrastructure Leadership Council, giving input as required.

Integration Actions

User interactions
. We will continue to fund activities that support integration and collaboration within the research community; for instance, CCPs, consortia, networks, international interactions.
. We will continue to have close interactions with the existing CCPs and consortia to support them in their work of community building and user interactions.
. We will explore a range of means of gathering user views and requirements, with a particular focus on reaching out to new communities and young researchers, as well as maintaining our interactions with current expert users.
. User views on and requirements of the national service will be gathered through the regular user meetings, and also through the governance structure, where a Scientific Advisory Committee or equivalent will provide the user perspective on the service.
. The Research Infrastructure team has identified contacts from the other EPSRC Themes. This will enable us both to gather views and inputs from across the spectrum, and to give information about e‐infrastructure.
. The e‐Infrastructure Strategic Advisory Team will provide advice and input on all aspects of e‐infrastructure to the Research Infrastructure Theme at EPSRC, and through us to the rest of the organisation. We will increase the industrial membership of the SAT to ensure we get an industrial user perspective.

Providers
. EPSRC will continue to engage with a range of providers, for instance through the HPC‐SIG and through regional centres, encouraging the development of an integrated offering to both academic and industrial users.
. We will continue to keep a watching brief on the potential of cloud services for research, and we will fund further studies to supplement those already completed, if required.

International links
. We will ensure that we continue to engage constructively with our European colleagues in PRACE on the provision of a European e‐infrastructure.
. We will also establish links with XSEDE and INCITE, and the funding bodies (NSF and DoE) in the US with the aim of collaborating with them.


10 year investment plan to upgrade and maintain EPSRC e‐infrastructure resources

The table below gives an indication of the sort of investments that may be needed. It does not imply that these budgets will be available, as this depends on Government Spending Review allocations. It does not cover the investment in e‐infrastructure that universities will need to make, or investments in networking by JANET.

Table 1: 10 year investment plan to upgrade and maintain EPSRC e‐infrastructure

[1] This only includes the software development grants supported by the Research Infrastructure theme, and does not include the support for fellowships and studentships and responsive mode applications from other EPSRC themes.

Pathways to Impact

In order to make a strong science and business case for future e‐infrastructure investments, it is essential that we demonstrate the impact of past investment. There are many types of impact, and the table below captures the range of measures we intend to use.

Impact Actions

. We will work with the community and the service providers to develop and populate the impact framework below.
. We will work closely with companies and trade bodies to understand their requirements for computational research and how these can be addressed. In the first instance, we will talk to EPSRC’s strategic partners.
. We will work closely with the regional centres as they establish connections to SMEs, and with the national service and the Hartree Centre to learn from their experience working with companies, in order to develop models for supporting industrial use of e‐infrastructure.
. We will improve and increase industrial representation on our peer review and advisory bodies.
. We will explore how best to fund the development of effective case studies.

Impact Framework

Table 2: Impact Framework

Glossary

ARCHER  ARCHER will be the new national HPC service, managed by EPSRC.
BBSRC  Biotechnology and Biological Sciences Research Council
BIS  Department for Business, Innovation and Skills
CCP  Collaborative Computational Projects
CDTs  Centres for Doctoral Training
CIRST  Computer Incident Response and Security Team
CPD  Continuing Professional Development
CSE  Computational Science and Engineering
DANTE  Delivery of Advanced Network Technology to Europe: the managing partner, project co‐ordinator and operator of the GÉANT network.
dCSE  Distributed Computational Science and Engineering
DECI  Distributed European Computing Initiative
DiRAC  The DiRAC Facility provides HPC services for the UK Theoretical Particle Physics and Theoretical Astrophysics Community.
DoE  Department of Energy
DTP  Doctoral Training Partnership
e‐IAUCF  UK e‐Infrastructure Academic User Community Forum
ELC  E‐Infrastructure Leadership Council
EPCC  Edinburgh Parallel Computing Centre
EPSRC  Engineering and Physical Sciences Research Council
ETP4HPC  The European Technology Platform for High Performance Computing
GÉANT  GÉANT is the pan‐European data network for the research and education community. It links national research and education networks (NRENs) across Europe.
GridPP  GridPP is a collaboration of particle physicists and computer scientists from the UK and CERN. They manage and maintain a distributed computing grid across the UK with the primary aim of providing resources to particle physicists working on the Large Hadron Collider experiments at CERN.
HEC  High End Computing
HECToR  HECToR is the UK's high‐end computing resource, funded by the UK Research Councils. It is available for use by academia and industry in the UK and Europe.
HEI  Higher Education Institution
HPC  High Performance Computing
HPC‐SIG  High Performance Computing ‐ Special Interest Group
I‐CASE  Industrial Case
ILL  Institut Laue–Langevin. It is one of the world centres for research using neutrons.
INCITE  The Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program is operated by the Argonne and Oak Ridge Leadership Computing Facilities.
ISIS  ISIS is a pulsed neutron and muon source that is part of the Science and Technology Facilities Council.
JANET  UK government‐funded organisation, which provides computer network and related collaborative services to UK research and education.
JASMIN  JASMIN is a "super‐data‐cluster", and it is being deployed on behalf of NCAS at the STFC Rutherford Appleton Laboratory.
JISC  A registered charity that champions the use of digital technologies in UK education and research.
MONSooN  MONSooN is a shared supercomputing service jointly funded by the Met Office and NERC to facilitate collaborative research.
MOOCs  Massive Open Online Courses
NAG  Numerical Algorithms Group
NAIS  The Centre for Numerical Algorithms and Intelligent Software Science and Innovation Award: http://www.nais.org.uk/
NERC  Natural Environment Research Council
NRENs  National Research and Education Networks
NSCCS  National Service for Computational Chemistry Software
NSF  National Science Foundation
OEMs  Original Equipment Manufacturers
PRACE  Partnership for Advanced Computing in Europe
RA  Research Associate
RC  Research Councils
RCUK  Research Councils UK
REF  Research Excellence Framework
SMEs  Small to Medium Enterprises
SP  Service Provision
SSI  Software Sustainability Institute
STFC  Science and Technology Facilities Council
TSB  Technology Strategy Board
VC  Vice‐Chancellor
Xsede  The Extreme Science and Engineering Discovery Environment (XSEDE) is supported by the National Science Foundation.
