Design Patterns and Software Techniques for Large-Scale, Open and Reproducible Data Reduction

by Gijs Jan Molenaar

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy

Supervised by Prof. Oleg Smirnov

2021

Contents

Abstract
Declaration of Authorship
Acknowledgements
Preface
Publications
Source Code

1 History of data reduction pipelines
  1.1 Introduction
  1.2 1930s
  1.3 1940s
  1.4 1950s
  1.5 1960s
  1.6 1970s
      Zero-generation calibration to first-generation calibration
      Astronomical Image Processing System
      CANDID
      CLEAN
  1.7 1980s
      Self-calibration and the second generation of calibration algorithms
      GIPSY
      DWARF
      IRAF
      Miriad
  1.8 Meanwhile in South Africa
  1.9 1990s
      AIPS++
      NEWSTAR
      The Measurement Set version 1
      The Measurement equation
      DIFMAP
  1.10 2000s
      The Measurement Set version 2
      MeqTrees, third-generation calibration algorithms
      Obit
      CASA
      Casacore
  1.11 2010s
      Cuisine
      Docker
      The ALMA pipeline
      Common Workflow Language
      DDFacet and killMS
      Kliko
      Stimela
      CARACal (formerly known as MeerKATHI)
      Default pre-processing pipeline
      The LOFAR two-metre survey pipeline
  1.12 Overview of events
  1.13 Discussion

2 Fundamentals
  2.1 Electromagnetic radiation
  2.2 Interferometry
      Aperture synthesis
      The (u, v, w) coordinate system
      The radio interferometer measurement equation
  2.3 Image reconstruction
  2.4 Calibration
      Reference calibration
      Self-calibration

3 KERN
  3.1 Introduction
  3.2 The target platform
  3.3 Other packaging methods
      Anaconda
      Python and pip
      Collaboration with Debian
  3.4 Usage
  3.5 Notable packages
      Casacore
      Casacore data
      MeqTrees
      CASA
      AIPS
      LOFAR
      Pulsar software
      Unversioned packages
  3.6 Containerisation
      Docker
      Singularity
  3.7 Project structure
      The release cycle
      Technical structure
  3.8 Recommended usage
  3.9 Usage numbers
  3.10 Conclusions

4 Kliko
  4.1 Introduction
      Software in science
      Software containerisation with Docker
  4.2 The Kliko specification
      The Kliko image
      Expected run-time behaviour
      Flavours of Kliko images
      The /kliko.yml schema
      The /parameters.json file
  4.3 Running Kliko containers
      Running a container manually
      Inside the Kliko container
      Kliko-run
  4.4 Chaining containers
  4.5 Example of usage of Kliko
      VerMeerKAT
      RODRIGUES
  4.6 Software availability
  4.7 Discussions and prospects
      Limitations
      Future work
  4.8 Conclusions

5 CWL and Buis
  5.1 Introduction
  5.2 The CommonWL standard
      CommandLineTool class file
      Job file
      Workflow class file
      Runners
  5.3 Buis – the web-based frontend for CommonWL runners
      Functional design
      Technical design
      Usage
  5.4 Use case example: a 1GC pipeline
  5.5 Discussion

6 Vacuum Cleaner
  6.1 Introduction
