OTFM for fast mapping & rapid data products Steven T. Myers (NRAO) for the VLASS team

1 B&E Caltech July 2016 The VLA Sky Survey (VLASS) On-the-fly Mosaicking (OTFM) for efficient fast mapping of the sky and pipelines for rapid generaon of Data Products

… wherein we elucidate the sundry benefits of using the recently upgraded Karl G. Jansky Very Large Array to survey large areas of the sky to discover new transient phenomena and illuminate heretofore hidden explosions throughout the Universe, and to explore cosmic magnetism

hps://science.nrao.edu/science/surveys/vlass

2 B&E Caltech July 2016 Why a new VLA Sky Survey? A New VLA deserves a New Sky Survey https://science.nrao.edu/science/surveys/vlass ¥ Expanded (bandwidth, time resolution) capabilities of Jansky VLA ¥ 20 years since previous pioneering VLA sky surveys Ð NVSS 45” 1.4GHz (84MHz) 450µJy/bm (2932hrs) 30Kdeg2 2Msrcs Ð FIRST 5” 1.4GHz (84MHz) 150µJy/bm (3200hrs) 10.6Kdeg2 95Ksrcs ¥ Preparing the way for a new generation of surveys Ð ASKAP and MeerKAT Ð SKA phase 1 Ð new O/IR surveys (i/zPTF, PanSTARRS,DES,LSST), X-ray (eROSITA) ¥ Science Proposal and survey definition by community Survey Science Group (SSG) following AAS 223 workshop in Jan 2014 Ð community review and approval in March 2015

3 B&E Caltech July 2016 VLASS Survey Science Group A Survey For the Whole Community Proposal Authors: Table 8: VLASS Proposal Contributors, Including VLASS White Paper Authors F. Abdalla, Jose Afonso, A. Amara, David Bacon, Julie Banfield, Tim Bastian, , Stefi A. Baum, Tony Beasley, Rainer Beck, Robert Becker, Michael Bell, Edo Berger, Rob Beswick, Sanjay Bhat- nagar, Mark Birkinshaw, V. Boehm, Geoff Bower, Niel W. Brandt, A. Brazier, , Michael Brotherton, Alex Brown, Michael L. Brown, Shea Brown, Ian Browne, Gianfranco Brunetti, Sarah Burke Spolaor, Ettore Carretti, Caitlin Casey, Sayan Chakraborti, Claire J. Chandler, Shami Chatterjee, Tracy Clarke, Julia Comerford, Jim Cordes, Bill Cotton, Fronefield Crawford, Daniele Dallacasa, Constantinos Science Proposal and Demetroullas, Susana E. Deustua, Mark Dickinson, Klaus Dolag, Sean Dougherty, Steve Drake, Alas- survey definion by tair Edge, Torsten Ensslin, Andy Fabian, Xiaohui Fan, Jamie Farnes, Luigina Feretti, Pedro Ferreira, Dale Frail, Bryan Gaensler, Simon Garrington, Joern Geisbuesch, Simona Giacintucci, Adam Ginsburg, Gabriele community Survey Giovannini, Eilat Glikman, Federica Govoni, Keith Grainge, Meghan Gray, Dave Green, Manuel Guedel, Science Group (SSG) Nicole E. Gugliucci, Chris Hales, Gregg Hallinan, Martin Hardcastle, Ian Harrison, Marijke Haverkorn, Martha Haynes, George Heald, Sue Ann Heatherly, Alan Heavens, Joe Helmboldt, C. Heymans, Ian Hey- following AAS 223 wood, Julie Hlavacek-Larrondo, Jackie Hodge, Michael Hogan, Assaf Horesh, C.-L. Hung, Zeljko Ivezic, workshop in Jan 2014 Neal Jackson, Matt Jarvis, B. Joachimi, Atish Kamble, David Kaplan, Namir Kassim, S. Kay, Amy Kim- ball, T.D. Kitching, Roland Kothes, Diederik Kruijssen, Shri Kulkarni, Mark Lacy, Cornelia Lang, Casey Law, Joe Lazio, J.P. Leahy, Jeff Linsky, Xin Liu, Britt Lundgren, R. Maartens, Antonio Mario Magalhaes, Minnie Mao, Sui Ann Mao, Maxim Markevitch, Walter Max-Moerbeck, Ian McGreer, Brian McNamara, Community review and Lance Miller, Elisabeth Mills, Kunal Mooley, Tony Mroczkowski, Eric J. Murphy, Matteo Murgia, Tom approval in March 2015 Muxlow, Steve Myers, Bob Nichol, Shane O’Sullivan, Niels Oppermann, Rachel Osten, Pat Palmer, P. Patel, Wendy Peters, Emil Polisensky, Ed Prather, J. Pritchard, Cormac Purcell, A. Raccanelli, Scott Ran- som, Urvashi Rao, Paul Ray, A. Refregier, Gordon Richards, Anita Richards, C. Riseley, Tim Robishaw, Anish Roshi, Larry Rudnick, Michael Rupen, Helen Russell, Elaine Sadler, M. Santos, Anna Scaife, B.M. Schafer, Richard Schilizzi, Dominic Schnitzeler, Yue Shen, Kartik Sheth, Greg Sivakoff, Lorant Sjouw- erman, Ian Smail, Oleg Smirnov, Vernesa Smolcic, Alicia Soderberg, Dmitry Sokolov, Tim Spuck, Judy Stanley, J.-L. Starck, Jeroen Stil, John Stoke, Michael Strauss, Meng Su, Xiaohui Sun, R. Szepietowski, A.N. Taylor, Russ Taylor, Valentina Vacca, Reinout van Weeren, Tiziana Venturi, Andrew Walsh, Wei- Hao Wang, Dave Westpfahl, Robert Wharton, Rick White, Stephen White, L. Whittaker, Peter Williams, Kathryn Williamson, Tony Willis, Tom Wilson, Maik Wolleben, Nicholas Wrigley, Ashley Zauderer, J. Zuntz

4 Note: MembersB&E of theCaltech SSG are listed July in 2016bold.

Each working group was led by two co-chairs, with the co-chairs comprising the SSG Governing Council. The SSG Governing Council itself had two co-chairs (Stefi Baum and Eric Murphy). Con- tributions to the WG discussions were enabled through the NRAO Science Forum,13 with material also posted on the NRAO Public Wiki, 14 along with other methods of group communication such as Google Groups, as defined by the co-chairs of the individual WGs. In this way, contributions to the discussion on these WGs was expanded well beyond the original authorship of the WPs. The process by which the VLASS survey definition proceeded from this point is worth docu- menting, as it may serve to guide the development of future surveys. Initially, the three scientific WGs (Galactic, Extragalactic, and Transients/Variability) were asked by the SSG Council co-chairs to specify their “ideal” survey designs, supported by key science goals. A “virtual face-to-face” meeting was then used to assess areas of commonality between the elements of the proposed sur- veys (frequency band, array configuration) and to identify areas that needed further discussion (number of epochs, depth of each epoch, monolithic vs. tiered). At this stage, the focus of the

13https://science.nrao.edu/forums/viewforum.php?f=59 14https://safe.nrao.edu/wiki/bin/view/VLA/VLASS

60 What is the VLA Sky Survey?  Captures a set of snapshots of the radio sky unique in me & “space”  Enables : focused radio, mul-λ, stascal, me domain & polarizaon studies VLASS rms depth (1σ)  S-Band (2 – 4 GHz), B/BnA configuraons single epoch, 2.5” res.  Wide Bandwidth – connuum images + spectral cubes 1500MHz – 120 µJy/beam  2048/128/10/2MHz resoluon 128MHz – 410 µJy/beam  Full Polarizaon – Improved RM Synthesis Imaging 10MHz – 1.5 mJy/beam  All-sky visible to VLA (δ>-40° ~34kdeg2) – including Galacc plane and bulge  Synopc – 3 epochs, 120µJy/beam per epoch, 32 month cadence  High Time Resoluon 0.45sec  OTFM scanning at ~3.3’/s (~24deg2/hr)  High Angular Resoluon (2.5”) Density Total -2  locate hosts and locaon within hosts Tier (deg ) Detecons All-Sky 290 9,700,000  ~5400 hr investment over ~7yr (~900h/cycle)  ~15% impact on PI me 10x FIRST yield, ~5x NVSS

Area Resoluon Rms Time Tier (deg2) (’’,robust) (µJy/bm) (hr) Epochs All-Sky 33,885 (δ > -40°) 2.5 69 5436 3 5 B&E Caltech July 2016 VLA Sky Survey Science VLASS Science Drivers ¥ Key science from VLASS proposal (SSG): Ð “Hidden explosions” and other radio transients. Ð Faraday tomography of the magnetic sky. Ð AGN and galaxy evolution in concert with new optical/IR surveys. Ð Peering through our dusty Galaxy. Ð “Missing physics”, rare objects, opening new discovery space ¥ What defines a sky survey? – Survey Speed (SS) & “depth” Ð covering an area of sky in given time (e.g. deg2/hour) # &2 Ωtot ΩB  2 λ SS = = =θ θrow ΩB = 0.5665θFWHM ∝% ( ttot tint $ D '

6 B&E Caltech July 2016 On-the-Fly (OTF) Mosaicking with JVLA Enabling new surveys with old telescopes ¥ Scan telescopes across sky while taking array data Ð Lose no data while scanning instead of 3-7s per step in standard mode Ð Efficient when dwell times on-sky are <25s (VLASS ~4.8s) ¥ High scanning rates & survey speeds (VLASS: 23.83 deg2/hr) Ð limited by data rates (25MB/s) for fast correlation dump times Ð >100 deg2/hr possible at L-band (1-2GHz) Ð VLASS: ¥ 3.31’/s scan rate ¥ 7.2’ row sep. ×× × ×

See hps://science.nrao.edu/facilies/vla/docs/manuals/obsguide/modes/mosaicking

7 B&E Caltech July 2016 OTFM: Considerations Primary Beam Smearing ¥ Antenna motions over a visibility integration time cause the effective primary beam to be smeared along the motion axis. ¥ For VLASS at 3.31’/s and 0.45s integrations, this is 1.49’ which is 14.4% of the 10.34’ primary beam FWHM at 3.95GHz, and 10.6% of the 14’ beam at 3GHz. This leads to a sensitivity loss of <1%.

8 B&E Caltech July 2016 OTFM: Considerations Mosaic Uniformity ¥ For a given survey speed, one chooses an OTFM row separation which then prescribes a scan rate. The row separation should be choosen to limit non- uniformity. For VLASS this is 7.2’, giving 2%-5% expected non-uniformity.

for θrow=7.2’

3.56GHz 3GHz 3.95GHz

9 B&E Caltech July 2016 Status & Plans What are we doing now? ¥ Establishment of the VLASS Survey Team underway Ð Project Director: Claire Chandler & Project Scientist: Mark Lacy Ð Technical Lead: Steve Myers & Architect: Rafael Hiriart Ð Project Manager: Demian Arancibia & Lead Analyst: Drew Medlin Ð 2 new science staff in Socorro (Amy Kimball, Frank Schinzel) Ð other staff at NRAO & community (SSG) ¥ Carrying out VLASS Pilot Project (June-Sep 2016) Ð 196 hours covering scientifically interesting regions Ð observations ongoing & available (TVPILOT, TSKY0001) Ð prototype pipelines and scripts for calibration and imaging Ð benchmarking for processing needs Ð testing & verification for upcoming design reviews (PDR: Sept. 2016) Ð system development to address Challenges

10 B&E Caltech July 2016 VLA Sky Survey Pilot (Summer 2016) Advanced scoung report ¥ Plan developed with Survey Science Group (VLASS Memo #2) Ð early science opportunities plus testing priorities Ð transient considerations from Caltech workshop April 2016 ¥ Program: total 196 hours (+other testing), 3920deg2 total images Ð 2480 unique deg2 covered ¥ include key deep fields ¥ 240 deg2 in galactic plane/bulge Ð 720 deg2 repeat 3x ¥ consistency & variability ¥ Cepheus 10x Ð 2160 deg2 in SDSS/FIRST footprint ¥ for transient finding

11 B&E Caltech July 2016 Pilot Survey Tiling Based on 4hr blocks, 80deg2

12 B&E Caltech July 2016 Science Example Transients with Pilot Survey ¥ Figure from Kunal Mooley, 2160 deg2 in FIRST footprint

Pilot

13 B&E Caltech July 2016 What will the VLASS produce? Data & Data Products Publically Available! ¥ Basic Data Products (BDP): NRAO produced & verified Product Timescale Notes Raw Data immediate no proprietary period Calibrated Data 48 h. – 1 week same, served from archive Quick-Look Images 48 h. – 2 weeks (?) connuum only, simple QA Single-Epoch Images & Cubes 6 mos. (12 mos. pol beer quality assurance, includes (2048/128MHz) cubes) polarizaon, coarse cubes Single-Epoch Catalog w/SEI more object parameters Mul-Epoch Combined Images 12 mos. (16 mos. pol produced aer each epoch aer & Cubes (2048/128/10MHz) cubes) first, increased depth MEC Catalog w/CI more detailed ¥ Enhanced Data Products (EDP): community provided Ð opportunity for NSF funding (MSIP), international partners (e.g. AU,CA,ZA)

14 B&E Caltech July 2016 Challenge : Storage So much data, so lile disk ¥ Must be live reliable disk: use NGAS + new Archive Access Tool (under dev.) ¥ Raw data: ~490TB (at max GO allowed rate 25MB/s) ¥ Images products: ~440TB (~8% of projected archive 2020) Ð Quick Look : 1”/pix, 440GPix/plane (1.8TB), 3epochs (5.4TB)

¥ continuum : I,σI = 10.8TB Ð Single Epoch : 0.6”/pix, 1.2Tpix/plane (4.8TB), 3 epochs (14.4TB)

¥ continuum I,σI,Iα,σIα = 58TB

¥ 128MHz cubes I,Q,U,σI,σQU x 14planes = 1PB x 10% = 100TB Ð Multi-Epoch Combined (same as SE, single best version, 4.8TB/image)

¥ continuum I,σI,Iα,σIα,Iβ,σIΒ = 29TB

¥ 128MHz cubes I,Q,U,σI,σQU x 14 planes = 336TB x 10% = 100TB

¥ 10MHz cubes I,Q,U,σI,σQU x 168 planes = 4PB x 2.5% = 100TB ¥ plus tapered images to 7.5” = +10% to the above Ð Object catalogs: negligible storage

15 B&E Caltech July 2016 Challenge : Pipeline Processing So much sky, so lile me ¥ Workflow : set of connected pipelines run from database Ð calibration using CIPL, plus custom imaging Ð need stages of QA to proceed to next steps (auto and interactive) Ð must be robust with small failure rate Ð version control, can regenerate products at later date Ð deployable on NRAO cluster & externally on AWS or other Ð still in prototype deployment & testing stage ¥ Current benchmarking: in CASA release 4.6 and 4.7 dev ¥ Targeted Performance: Ð QL within 48 hours of observation = doable w/improved CIPL Ð SE within 180 days = continuum need 48 threads ~ 4 nodes = OK? ¥ cubes: scaling ok, but ~80 nodes?

16 B&E Caltech July 2016 Example Workflow Schematic As of 20 July 2016 (Rafael Hiriart) ¥ under development by Rafael Hiriart

17 B&E Caltech July 2016 Challenge: Radio Frequency Interference So many people, so much RFI: It’s everywhere! ¥ But worse in the “Clarke” satellite belt (δ~-5°) Ð below 2.2GHz and above 3.6GHz heavy interference Ð investigating performance of auto-flagging (e.g. TFCROP/RFLAG)

+4h

Hour Angle

-5h 4GHz 2GHz Frequency

18 B&E Caltech July 2016 1 8 Challenge: Scope Control So many possibilies, so few resources ¥ There are many things NRAO could do for VLASS Ð faster pipelines, more/better computing nodes, get BDP sooner Ð bigger archive, more data products, keep infinite versions, EDP Ð faster scanning, higher data rates, more epochs, better time res. Ð more ambitious processing, catalogs, RM, Stokes V, etc. Ð community groups (when you ask) want more! ¥ But, we have to deliver VLASS on time, within budget, on scope Ð with reasonable level of Quality Assurance by the people we have ¥ Project management & docs is added overhead (under-budgeted) & risk Ð increasing (NSF asking for this), takes key people’s time (e.g. mine) Ð adds 50% or more to cost in my estimate, warranted if risk is this level ¥ Opportunity: community “Boutique & Experiment” style EDP Ð take on some of the riskier value-added targets

19 B&E Caltech July 2016 Enhanced Data Products & Services Community led effort

¥ Transient Object Catalogs, Marshals, & Alerts Opportunities for ¥ Multi-Wavelength Catalogs for VLASS sources seeking funding for ¥ Rotation Measure Images and Catalogs EDP from NSF (MSIP) and international ¥ Light Curves (IQU) sources (AU,CA,ZA) ¥ A hosted VLASS Archive with Image and Catalog Service  e.g., as currently available by IPAC/IRSA allowing for VLASS data to be integrated with Spitzer/Planck/WISE/Euclid/ etc… ¥ VLITE (NRL): Ð commensal 350MHz ¥ commensal optical surveys? ¥ many possibilities …

20 B&E Caltech July 2016 Challenge: People It’s full of stars ¥ We are now taking data for the pilot, which will produce images at the rate of ~720Mpix/hour (continuum) and potentially 10Gpix/hour (coarse spectral cubes), all of which must be analyzed, with at least one pass by human eyes. Currently (as of 20 July 2016) we have 60 hours covering a total of 1200deg2. This needs people. ¥ The NRAO VLASS science/technical team itself is small (<3 FTE), so we strongly encourage community involvement, particularly by students and post- docs! NRAO has REU program Reber student fellowships, Jansky post-docs. ¥ For example, REU summer student Amy Cavanaugh (WCU of PA) is working on VLASS pilot Stripe-82 to evaluate performance (against CNSS) and compile multiwavelength catalogs. We hope other students will do projects with VLASS pilot data. Contact us if you want more info. ¥ VLASS pipelines can be used to process other VLA mosaicked data, such as follow-up of LIGO GW events (Kulkarni,Ho,Mooley,&al.). This might work for you.

21 B&E Caltech July 2016 The VLASS and You! Select your level of engagement 1. “I want to know what’s happening and help guide the survey” Ð visit VLASS website and join the SSG googlegroup discussion 2. “I just want to use final products for my science” Ð download images & catalogs from archive (BDP & EDP) 3. “I want to modify the basic data and produce something others may use.” Ð join SSG working groups and teams to provide EDP 4. “I want to build my own pipelines and reduce the survey data in real time my way.” Ð download raw data, develop custom pipeline 5. “I want to be part of the VLASS team!” Ð student PhD support, visit NRAO (e.g. RSRO, visitor programs)

22 B&E Caltech July 2016 Progress: Images! Results from test & pilot observaons ¥ Pilot observations June 2016, 1°x1° subtile Stripe-82 at 2100+0000

VLASS QuickLook Image 1°x1° subimage (1°.56x1°.56) 1” pixel size, 2.5” resolution 416 phase-centers (2 integ) ~10GB vis. data MTMFS nterm=2 autoboxing

σI=106µJy/beam Imaging Time: 4h4m single-thread

23 B&E Caltech July 2016 2 3 Progress: Images! Results from test & pilot observaons ¥ Pilot observations June 2016, 1x1 Stripe-82 at 2100+0000

24 B&E Caltech July 2016 2 4 Progress: Images! Results from test & pilot observaons ¥ Pilot observations June 2016, 1x1 Stripe-82 at 2100+0000

25 B&E Caltech July 2016 2 5 Progress: Images! Results from test & pilot observaons ¥ Pilot observations June 2016, 1x1 Stripe-82 at 2100+0000

26 B&E Caltech July 2016 2 6 Progress: Images! Results from test & pilot observaons ¥ Pilot observations June 2016, 1°x1° subtile Stripe-82 at 2100+0000

VLASS QuickLook Image 1°x1° subimage (1°.56x1°.56) 1” pixel size, 2.5” resolution 416 fields (2 vis per center) MTMFS nterm=2 autoboxing Imaging Time: 4h4m single-thread

(left) original 2.5” resolution; (right)uv-tapered to 7.5” resolution to enhance low surface-brightness extended emission

27 B&E Caltech July 2016 2 7 Progress: Images! Results from test & pilot observaons ¥ Pilot observations June 2016, 1°x1° subtile Stripe-82 at 2100+0000

VLASS QuickLook Image 1°x1° subimage (1°.56x1°.56) 1” pixel size, 2.5” resolution 416 fields (2 vis per center) MTMFS nterm=2 autoboxing Imaging Time: 4h4m single-thread

(left) NVSS at 45” resolution; (right) VLASS uv-tapered to 7.5” resolution to enhance low surface-brightness extended emission

28 B&E Caltech July 2016 2 8 Progress: Images! Results from test & pilot observaons ¥ Pilot observations June 2016, 1°x1° subtile Stripe-82 at 2100+0000

(left) original 2.5” resolution; (right)uv-tapered to 7.5” resolution to enhance low surface-brightness extended emission

29 B&E Caltech July 2016 2 9 VLASS: Take away message A new era of wide-area high-resoluon radio synopc surveys is about to begin! ¥ A modern large-area synopc mul-use public legacy survey Ð Support broad community & enable wide range of science and discovery Ð Hidden explosions, Faraday Tomography, Galaxies & AGN Everywhere ¥ Building on the past, looking towards the future Ð Snapshots of our Universe unique in me & space! Ð Soon to be joined by ASKAP, MeerKAT, Aperf Ð A springboard into the LSST and SKA science era Ð A substanal real world test-bed for SKA science and processing ¥ Look us up online: hps://science.nrao.edu/science/surveys/vlass Ð Follow links to Survey Proposal and the Technical Implementaon Plan Ð Read the VLASS Memo series Ð Join the VLASS SSG googlegroup and parcipate in the discussion Ð Use VLASS data & data products!

30 B&ECaltech July 2016 B&E Caltech July 2016 VLASS Milestones Notional schedule (as of July 2016)

Date Activity 2016 May 27 – Sep 5 2016A B-config (VLASS pilot observations) 2016 Sep 13-15 VLASS Preliminary Design Review (PDR) 2016 Oct – 2017 Sep Demonstration of Basic and Enhanced Data Products from pilot 2017 Jan-Apr (TBD) VLASS Critical Design Review (CDR), final go/no-go 2017 Sep VLASS epoch 1 part 1 observing begins (B-config) 2018 June Delivery of Epoch 1 part 1 BDP (6 months: Stokes I only) 2018 Dec Delivery of Epoch 1 part 1 BDP (12 months: Pol. Cubes) 2020 Jan VLASS epoch 2 part 1 observing begins (B-config)

31 B&E Caltech July 2016 www.nrao.edu science.nrao.edu public.nrao.edu

The National Observatory is a facility of the National Science Foundation operated under cooperative agreement by Associated Universities, Inc.

32 B&E Caltech July 2016