The Challenges and Prospects of the Intersection of Humanities and Data Science: a White Paper from the Alan Turing Institute the Alan Turing Institute
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Annual Report 2018–19
Annual report 2018–19 2018 –19 Annual Report 2018–19 Part 1 Our year 3 Part 2 Trustees’ and strategic report 74 Part 3 Financial statements 84 Our year 1.1 Chair’s foreword 1.2 Institute Director and Chief Executive’s foreword 1 1.3 Outputs, impact and equalities 1.4 Turing trends and statistics 1.5 Research mapping: Linking research, people and activities 3 Creating real world impact We have had yet another year of rapid growth Since 2016, researchers at the Institute have produced and remarkable progress. over 200 publications in leading journals and demonstrated early impacts with great potential. Some of the impacts The Alan Turing Institute is a multidisciplinary institute are captured in this annual report and we are excited to formed through partnerships with some of the UK’s see the translation of our data science and AI research leading universities. This innovative approach gives us into a real-world context. access to expertise and skills in data science and AI that is unparalleled outside of a handful of large tech platforms. We are witnessing a massive growth of data science Our substantial convening power enables us to work and AI. We are also seeing important challenges and across the economy: with large corporations, the public breakthroughs in many areas including safe and ethical sector including government departments, charitable AI, quantum computing, urban transport, defence, foundations and small businesses. These collaborations manufacturing, health and financial services. Across cut across disciplines and break through institutional the Institute we are leading the public conversation in boundaries. -
STEM Education Centre E-Bulletin: March 2014
STEM Education Centre E-bulletin: March 2014 Welcome to the March e-bulletin from the University of Birmingham’s STEM Education Centre. It is designed to provide you with information about STEM-related events, resources, news and updates that may be of interest to you and your colleagues. STEM Policy and Practice News Budget 2014: Alan Turing Institute to lead big data research A £42m Alan Turing Institute is to be founded to ensure that Britain leads the way in big data and algorithm research, George Osborne has announced. Drawing on the name of the British mathematician who led code-breaking work at Bletchley Park during the World War II, the institute is intended to help British companies by bringing together expertise and experience in tackling problems requiring huge computational power. Turing’s work led to the cracking of the German "Enigma" codes, which used highly compl ex encryption, is believed to have saved hundreds or even thousands of lives. He later formed a number of theories that underpin modern computing, and formalised the idea of algorithms – sequences of instructions – for a computer. This announcement marks further official rehabilitation of a scientist who many see as having been treated badly by the British government after his work during the war. Turing, who was gay, was convicted of indecency in March 1952, and lost his security clearance with GCHQ, the successor to Bletchley Park. He killed himself in June 1954. Only after a series of public campaigns was he given an official pardon by the UK government in December 2013. Public Attitudes to Science (PAS) 2014 Public Attitudes to Science (PAS) 2014 is the fifth in a series of studies looking at attitudes to science, scientists and science policy among the UK public. -
Automating Data Science: Prospects and Challenges
Final m/s version of paper accepted (April 2021) for publication in Communications of the ACM. Please cite the journal version when it is published; this will contain the final version of the figures. Automating Data Science: Prospects and Challenges Tijl De Bie Luc De Raedt IDLab – Dept. of Electronics and Information Systems Dept. of Computer Science AASS Ghent University KU Leuven Örebro University Belgium Belgium Sweden José Hernández-Orallo Holger H. Hoos vrAIn LIACS Universitat Politècnica de València Universiteit Leiden Spain The Netherlands Padhraic Smyth Christopher K. I. Williams Department of Computer Science School of Informatics Alan Turing Institute University of California, Irvine University of Edinburgh London USA United Kingdom United Kingdom May 13, 2021 Given the complexity of typical data science projects and the associated demand for human expertise, automation has the potential to transform the data science process. Key insights arXiv:2105.05699v1 [cs.DB] 12 May 2021 • Automation in data science aims to facilitate and transform the work of data scientists, not to replace them. • Important parts of data science are already being automated, especially in the modeling stages, where techniques such as automated machine learning (Au- toML) are gaining traction. • Other aspects are harder to automate, not only because of technological chal- lenges, but because open-ended and context-dependent tasks require human in- teraction. 1 Introduction Data science covers the full spectrum of deriving insight from data, from initial data gathering and interpretation, via processing and engineering of data, and exploration and modeling, to eventually producing novel insights and decision support systems. Data science can be viewed as overlapping or broader in scope than other data-analytic methodological disciplines, such as statistics, machine learning, databases, or visualization [10]. -
Alan Turing Institute Symposium on Reproducibility for Data-Intensive Research – FINAL REPORT Page | 1 Lucie C
Alan Turing Institute Symposium on Reproducibility for Data-Intensive Research – FINAL REPORT Page | 1 Lucie C. Burgess, David Crotty, David de Roure, Jeremy Gibbons, Carole Goble, Paolo Missier, Richard Mortier, 21 July 2016 Thomas E. Nichols, Richard O’Beirne Dickson Poon China Centre, St. Hugh’s College, University of Oxford, 6th and 7th April 2016 ALAN TURING INSTITUTE SYMPOSIUM ON REPRODUCIBILITY FOR DATA-INTENSIVE RESEARCH – FINAL REPORT AUTHORS Lucie C. Burgess1, David Crotty2, David de Roure1, Jeremy Gibbons1, Carole Goble3, Paolo Missier4, Richard Mortier5, Thomas E. Nichols6, Richard O’Beirne2 1. University of Oxford 2. Oxford University Press 3. University of Manchester 4. Newcastle University 5. University of Cambridge 6. University of Warwick TABLE OF CONTENTS 1. Symposium overview – p.2 1.1 Objectives 1.2 Citing this report 1.3 Organising committee 1.4 Format 1.5 Delegates 1.6 Funding 2. Executive summary – p.4 2.1 What is reproducibility and why is it important? 2.2 Summary of recommendations 3. Data provenance to support reproducibility – p.7 4. Computational models and simulations – p.10 5. Reproducibility for real-time big data – p.13 6. Publication of data-intensive research – p.16 7. Novel architectures and infrastructure to support reproducibility – p.19 8. References - p.22 Annex A – Symposium programme and biographies for the organising committee and speakers Annex B – Delegate list ACKNOWLEDGEMENTS The Organising Committee would like to thanks the speakers, session chairs and delegates to the symposium, and in particular all those who wrote notes during the sessions and provided feedback on the report. This report is available online at: https://osf.io/bcef5/ (report and supplementary materials) https://dx.doi.org/10.6084/m9.figshare.3487382 (report only) Alan Turing Institute Symposium on Reproducibility for Data-Intensive Research – FINAL REPORT Page | 2 Lucie C. -
CV Adilet Otemissov
CV Adilet Otemissov RESEARCH INTERESTS Global Optimization; Stochastic Methods for Optimisation; Random Matrices EDUCATION DPhil (PhD) in Mathematics 2016{2021 University of Oxford The Alan Turing Institute, the UK's national institute for data science and AI - Thesis title: Dimensionality reduction techniques for global optimization (supervised by Dr Coralia Cartis) MSc in Applied Mathematics with Industrial Modelling 2015{2016 University of Manchester - Thesis title: Matrix Completion and Euclidean Distance Matrices (supervised by Dr Martin Lotz) BSc in Mathematics 2011{2015 Nazarbayev University AWARDS & ACHIEVEMENTS - SEDA Professional Development Framework Supporting Learning award for successfully completing Developing Learning and Teaching course; more details can be found on https://www.seda.ac.uk/supporting-learning, 2020 - The Alan Turing Doctoral Studentship: highly competitive full PhD funding, providing also a vibrant interdisciplinary research environment at the Institute in London, as well as extensive training in the state-of-the-art data science subjects from world's leading researchers, 2016{2020 - NAG MSc student prize in Applied Mathematics with Industrial Modelling for achieving the highest overall mark in the numerical analysis course units in my cohort at the University of Manchester, 2016 PUBLICATIONS - C. Cartis, E. Massart, and A. Otemissov. Applications of conic integral geometry in global optimization. 2021. Research completed, in preparation - C. Cartis, E. Massart, and A. Otemissov. Constrained global optimization of functions with low effective dimensionality using multiple random embeddings. ArXiv e-prints, page arXiv:2009.10446, 2020. Manuscript (41 pages) submitted for publication in Mathematical Programming. Got back from the first round of reviewing - C. Cartis and A. Otemissov. A dimensionality reduction technique for unconstrained global optimization of functions with low effective dimensionality. -
Seshat: Global History Databank Publishes First Set of Historical Data the Seshat Project the Evolution Institute
Cliodynamics: The Journal of Quantitative History and Cultural Evolution Seshat: Global History Databank Publishes First Set of Historical Data The Seshat Project The Evolution Institute Abstract This short report describes the publication of the first batch of historical data produced by Seshat: Global History Databank. The data is available as free, open access material here; see also our website for more information on the Seshat project as a whole. Figure 1. Screen-grab of map showing regions included in first batch of data published to http://dacura.scss.tcd.ie/seshat/. Seshat: Global History Databank Seshat: Global History Databank (Seshat) has published its first batch of systematically coded, expert-vetted, and referenced historical data, which can be accessed at http://dacura.scss.tcd.ie/seshat/. Seshat is a large, online, open- access store of information about the human past, a groundbreaking resource that is bringing together the most current and comprehensive body of knowledge about human history available. Previously, our collective knowledge of history remained scattered throughout various texts and isolated in the brains of individual historians. Seshat: Global History Databank gathers as much of this knowledge as Corresponding author’s e-mail: [email protected] Citation: Seshat Project. 2017. Seshat: Global History Databank Publishes First Set of Historical Data. Cliodynamics 8: 75–79. Seshat Project: First Set of Data. Cliodynamics 8:1 (2017) possible into a single, large database that can be used to test scientific hypotheses about the evolution of human societies during the last 10,000 years. The Seshat Project works directly with academic experts on past societies who volunteer their knowledge on the political and social organization of human groups from the Neolithic to the modern period. -
Article Seshat Methodology
A Macroscope for Global History Seshat Global History Databank: a methodological overview Authors: Pieter François (1, 2), Joseph Manning (3), Harvey Whitehouse (2), Robert Brennan (4), Thomas Currie (5), Kevin Feeney (4), Peter Turchin (6). 1 University of Hertfordshire 2 University of Oxford 3 Yale University 4 Trinity College Dublin 5 University of Exeter, Penryn Campus 6 University of Connecticut Abstract This article introduces the ‘Seshat: Global History’ project, the methodology it is based upon and its potential as a tool for historians and other humanists. The article describes in detail how the Seshat methodology and platform can be used to tackle big questions that play out over long time scales whilst allowing users to drill down to the detail and place every single data point both in its historic and historiographical context. Seshat thus offers a platform underpinned by a rigorous methodology to actually do 'longue durée' history and the article argues for the need for humanists and social scientists to engage with data driven ‘longue durée' history. The article argues that Seshat offers a much needed infrastructure in which different skills sets and disciplines can come together to analyze the past using long timescales. In addition to highlighting the theoretical and methodological underpinnings, the potential of Seshat is demonstrated by showcasing three case studies. Each of these case studies is centred around a set of long standing questions and historiographical debates and it is argued that the introduction of a -
A New Era in the Study of Global History Is Born but It Needs to Be Nurtured
[JCH 5.1-2 (2018–19)] JCH (print) ISSN 2051-9672 https://doi.org/10.1558/jch.39422 JCH (online) ISSN 2051-9680 A New Era in the Study of Global History is Born but It Needs to be Nurtured Harvey Whitehouse1 University of Oxford, UK Email: [email protected] (corresponding author) Peter Turchin2 University of Connecticut Email: [email protected] (corresponding author) Pieter François3, Patrick E. Savage4, Thomas E. Currie5, Kevin C. Feeney6, Enrico Cioni7, Rosalind Purcell8, Robert M. Ross9, Jennifer Larson10, John Baines11, Barend ter Haar12, R. Alan Covey13 Abstract: Thisa rticle is a response to Slingerland e t al. who criticize the quality of the data from Seshat: Global History Databank utilized in our Nature paper entitled “Complex Societies Precede Moralizing Gods throughout World History”. Their cri- tique centres around the roles played by research assistants and experts in procuring and curating data, periodization structure, and so-called “data pasting” and “data fill- ing”. We show that these criticisms are based on misunderstandings or misrepresenta- tions of the methods used by Seshat researchers. Overall, Slingerland et al.’s critique (which is crosslinked online here) does not call into question any of our main findings, but it does highlight various shortcomings of Slingerland et al.’s database project. Our collective efforts to code and quantify features of global history hold out the promise of a new era in the study of global history but only if critique can be conducted con- structively in good faith and both the benefitsa nd the pitfalls of open science fully recognized. -
Enabling a World-Leading Regional Digital Economy Through Data Driven Innovation
©Jonathan Clark Fine Art, Representatives of the Artist's Estate Enabling a World-Leading Regional Digital Economy through Data Driven Innovation Edinburgh & South East Scotland City Region A Science and Innovation Audit Report sponsored by the Department for Business, Energy & Industrial Strategy MAIN REPORT Edinburgh and South East Scotland City Region Consortium Foreword In Autumn 2015 the UK Government announced regional Science and Innovation Audits (SIAs) to catalyse a new approach to regional economic development. SIAs enable local Consortia to focus on analysing regional strengths and identify mechanisms to realise their potential. In the Edinburgh and South East Scotland City Region, a Consortium was formed to focus on our rapidly growing strength in Data-Driven Innovation (DDI). This report presents the results which includes broad-ranging analysis of the City Region’s DDI capabilities, the challenges and the substantial opportunities for future economic growth. Data-led disruption will be at the heart of future growth in the Digital Economy, and will enable transformational change across the broader economy. However, the exact scope, scale and timing of these impacts remains unclear1. The questions we face are simple yet far-reaching – given this uncertainty, is now the right time to commit fully to the DDI opportunity? If so, what are the investments we should make to optimise regional economic growth? This SIA provides evidence that we must take concerted action now to best position the City Region to benefit from the disruptive effects of DDI. If investment is deferred, we run the risk of losing both competitiveness and output to other leading digital clusters that have the confidence to invest. -
PCAST Report
INDUSTRIES OF THE FUTURE INSTITUTES: A NEW MODEL FOR AMERICAN SCIENCE AND TECHNOLOGY LEADERSHIP A Report to the President of the United States of America The President’s Council of Advisors on Science and Technology January 2021 INDUSTRIES OF THE FUTURE INSTITUTES: A NEW MODEL FOR AMERICAN SCIENCE AND TECHNOLOGY LEADERSHIP About the President’s Council of Advisors on Science and Technology Created by Executive Order in 2019, PCAST advises the President on matters involving science, technology, education, and innovation policy. The Council also provides the President with scientific and technical information that is needed to inform public policy relating to the American economy, the American worker, national and homeland security, and other topics. Members include distinguished individuals from sectors outside of the Federal Government having diverse perspectives and expertise in science, technology, education, and innovation. More information is available at https://science.osti.gov/About/PCAST. About this Document This document follows up on a recommendation from PCAST’s report, released June 30, 2020, involving the formation of a new type of multi-sector research and development organization: Industries of the Future Institutes (IotFIs). This document provides a framework to inform the design of IotFIs and thus should be used as preliminary guidance by funders and as a starting point for discussion among those considering participation. The features described here are not intended to be a comprehensive list, nor is it necessary that each IotFI have every feature detailed here. Month 2020 – i – INDUSTRIES OF THE FUTURE INSTITUTES: A NEW MODEL FOR AMERICAN SCIENCE AND TECHNOLOGY LEADERSHIP The President’s Council of Advisors on Science and Technology Chair Kelvin K. -
Artificial Intelligence Research in Northern Ireland and the Potential for a Regional Centre of Excellence
Artificial Intelligence Research in Northern Ireland and the Potential for a Regional Centre of Excellence Review Prepared by Ken Guy1 and Rob Procter2 with inputs from Franz Kiraly3 and Adrian Weller4 on behalf of The Alan Turing Institute5 April 2019 1 Ken Guy, Director, Wise Guys Ltd. 2 Professor Rob Procter, Turing Fellow, University of Warwick 3 Dr. Franz Kiraly, Turing Fellow, University College London 4 Dr. Adrian Weller, Programme Director and Turing Fellow, University of Cambridge 5 Although prepared on behalf of The Alan Turing Institute, responsibility for the views expressed in this document lie solely with the two main authors, Ken Guy and Rob Procter. Contents Foreword by Dr. Robert Grundy, Chair of Matrix ................................................................... i Foreword by Tom Gray, Chair of the Review Panel ............................................................. ii Executive Summary ................................................................................................................ vi 1. Introduction ....................................................................................................................... 11 1.1 Objectives ......................................................................................................................... 12 1.2 Expectations ..................................................................................................................... 13 1.3 Methodology ................................................................................................................... -
A New Solution to Data Harvesting and Knowledge Extraction for Archaeology
Dacura: A New Solution to Data Harvesting and Knowledge Extraction for Archaeology Peter N. Peregrine Rob Brennan Thomas Currie Kevin Feeney Pieter François Peter Turchin, and Harvey Whitehouse SFI WORKING PAPER: 2017-07-023 SFI Working Papers contain accounts of scienti5ic work of the author(s) and do not necessarily represent the views of the Santa Fe Institute. We accept papers intended for publication in peer-reviewed journals or proceedings volumes, but not papers that have already appeared in print. Except for papers by our external faculty, papers must be based on work done at SFI, inspired by an invited visit to or collaboration at SFI, or funded by an SFI grant. ©NOTICE: This working paper is included by permission of the contributing author(s) as a means to ensure timely distribution of the scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the author(s). It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may be reposted only with the explicit permission of the copyright holder. www.santafe.edu SANTA FE INSTITUTE Dacura: A New Solution to Data Harvesting and Knowledge Extraction for Archaeology Peter N. Peregrine, Rob Brennan, Thomas Currie, Kevin Feeney, Pieter François, Peter Turchin, and Harvey Whitehouse Peter N. Peregrine, Lawrence University, 711 E. Boldt Way, Appleton WI 54911 and Santa Fe Institute, 1399 Hyde Park Road, Santa Fe NM, 87501 ([email protected])