! Square Kilometre Array as a Co-Design Vehicle

Tim Cornwell, ASKAP Computing Project Lead Ben Humphreys, ASKAP Computing Project Engineer Australian Square Kilometre Array Pathfinder

Monday, 18 October 2010 SKA Science •Key Science Projects for the SKA • Hydrogen survey – dark energy • Pulsar survey – strong field tests of gravity • Cosmic magnetism – origin of B fields • Cradle of Life • Epoch of Re-ionization • Exploration of the Unknown

IESP SKA October 2010

Monday, 18 October 2010 The Square Kilometre Array

• 2020 era radio telescope • International project • Very large collecting area (km2) • Telescope sited in or South • Very large field of view Africa • Wide frequency range (70MHz - 25 • Headquarters in Europe or US GHz) • Multiple pathfinders and precursors • Large physical extent (3000+ km) now being built around the world

IESP SKA October 2010

Monday, 18 October 2010 LOFAR

• Constructed by ASTRON in the Netherlands • Low Frequency ARray • Sparse Aperture Array • 10 - 80 MHz, 120 - 240 MHz • Correlator implemented in Blue Gene/L/P

• Opened June 12 2010

IESP SKA October 2010

Monday, 18 October 2010 USA SKA Technology Development Program

• http://skatdp.astro.cornell.edu/ • Computer Processing Group • Concentration on calibration and imaging data processing REFERENCES 16 • Mathematical analysis of processing methods

IESP SKA October 2010

Monday, 18 October 2010Figure 3: Same as Fig. 1 except spectral line imaging costs are considered instead of continuum imaging costs.

[5] Perley, R. & Clark, B. 2003, ”Scaling Relations for Interferometric Post-Processing”, EVLA Memo 63, http://www.nrao.edu.

[6] Perley, R. 2004, ”Wide-Field Imaging with the EVLA: WIDAR Correlator Modes and Output Data Rates”, EVLA Memo 64, http://www.nrao.edu.

[7] Bhatnagar, S. 2009, ”Report on the findings of the CASA Terabyte Initiative: Single- node tests”, EVLA Memo 132, http://www.nrao.edu.

[8] Kemball, A. 2008, ”Petascale Computing Challenges for the SKA”, TDP Calibration and Processing Group Memo 1

[9] Lonsdale, C., Doleman, S., and Oberoi, D. 2004, ”Efficient Imaging Strategies for Next-Generation Radio Arrays ”, Experimental Astronomy.17,345. Australian SKA Pathfinder = 1% SKA

• Fastest survey radio telescope in the world • 150MA$ • Sited at Boolardy, • Early test observations 2011 • 36 antennas compared to ~ 3600 for SKA • Full observing 2013 • First 6 antennas installed • Demonstrates wide field of view

IESP SKA October 2010 technology for SKA

Monday, 18 October 2010 Australian SKA Pathfinder = 1% SKA

• Fastest survey radio telescope in the world • 150MA$ • Sited at Boolardy, Western Australia • Early test observations 2011 • 36 antennas compared to ~ 3600 for SKA • Full observing 2013 • First 6 antennas installed • Demonstrates wide field of view

IESP SKA October 2010 technology for SKA

Monday, 18 October 2010 ASKAP data flow

Murchison Radioastronomical Observatory Pawsey High Performance Centre for SKA

Thirty six Virtual antennas Filterbanks Beamformers Observatory

1.9Tb/s

PAF filterbank Beamformed samples filterbank ASKAP samples Science Data 18Tflop/s 27Tflop/s Archive Central Facility Correlator processor 1.9Tb/s 0.6Tb/s 2.5GB/s ASKAP 0.6Tb/s Science products via 18Tflop/s 27Tflop/s MRO- 0.5 - 1 Pflop/s VO protocols link 340Tflop/s 10GB/s

1.9Tb/s 0.6Tb/s

Operations Astronomers 18Tflop/s 27Tflop/s data archive

• From observing to archive with no human decision making T. Cornwell, July 9 2010 • Calibrate automatically • Image automatically • Form science oriented catalogues automatically

IESP SKA October 2010

Monday, 18 October 2010 Pawsey High Performance Computing Centre for SKA Science, Perth, Western Australia

• A$80M, funded by Australian Federal 7+( government L9(& 3$:6(< &(175( • 100 TF machine being commissioned &DOOIRUH[SUHVVLRQVRILQWHUHVWIRUHDUO\DGRSWHUXVHRI • HP cluster in a box WKH3DZVH\6WDJH$V\VWHP L9(&LVFXUUHQWO\LQWKHSURFHVVRILQVWDOOLQJ3DZVH\6WDJH$D+HZOHWW3DFNDUG32' • Full scale ASKAP test during EDVHGV\VWHPWKDWZLOOSURYLGHDFFHVVWRFRUHVEDVHGDWL9(�XUGRFK commissioning 7KLVV\VWHPZLOOEHFRPPLVVLRQHGLQWKHQH[WIHZPRQWKVDQGL9(&LVVHHNLQJ H[SUHVVLRQVRILQWHUHVWIURPWKHL9(&XVHUFRPPXQLW\IRUHDUO\DGRSWHUXVHGXULQJ WKHZHHNFRPPLVVLRQLQJSKDVH • Used in 2011 for early telescope testing $FFHVVGXULQJWKLVSHULRGLVDLPHGDWSURYLGLQJWLPHRQWKHV\VWHPVRXVHUVFDQ H[SHULPHQWDQGDLGL9(&LQFRQÀJXULQJDQGYDOLGDWLQJWKHV\VWHP • Petascale system by 2013 7REHHOLJLEOHDVHDUO\DGRSWHUVXVHUV PXVWEHDEOHWRQRPLQDWHDSURMHFW • 25% for radio astronomy FDSDEOHRI ‡ JHQHUDWLQJDKLJKLPSDFW UHVHDUFKRXWFRPHZLWKLQWKH FRPPLVVLRQLQJSKDVHE\WDNLQJ • Would like to access to 1PF before DGYDQWDJHRIWKH3DZVH\6WDJH that for scaling! $FRPSXWHFDSDFLW\RU ‡ ZDQWLQJWRLQYHVWLJDWH SHUIRUPDQFHRIDQH[LVWLQJRU QHZDSSOLFDWLRQDVSUHSDUDWLRQ IRUIXWXUHZRUN

3UHIHUHQFHZLOOEHJLYHQWRSURMHFWV WKDWXWLOLVHWKRXVDQGVRIFRUHV

7HFKQLFDODVVLVWDQFHIURPWKHL9(&VWDIIZLOOEHDYDLODEOHGXULQJWKHFRPPLVVLRQLQJSKDVH

7RUHJLVWHU\RXULQWHUHVWSOHDVHVHQGDVKRUWGHVFULSWLRQ OHVVWKDQRQHSDJH RI\RXU SURMHFWDQGLWVUHTXLUHPHQWVWRHDUO\DGRSWHUV#LYHFRUJTXHVWLRQVFDQDOVREHVHQW WRWKLVDGGUHVV7KHGHDGOLQHIRUH[SUHVVLRQVRILQWHUHVWLVQG2FWREHUDQG ZHH[SHFWWRDQQRXQFHWKHVXFFHVVIXOSURMHFWVE\WKHWKRI1RYHPEHU

$IWHUWKHFRPPLVVLRQLQJSKDVHWKH3DZVH\6WDJH$V\VWHPZLOOEHDYDLODEOHIRU PRUHJHQHUDODFFHVVWKURXJKWKH3DZVH\$OORFDWLRQ&RPPLWWHH

CSIRO

Monday, 18 October 2010 Radio interferometry

• Measured data = Fourier/Fresnel transform of sky I(l,m) j2" (ul +vm+w 1!l2 ! m2 ) Vij, p = e dl dm # 1! l 2 ! m2 ! • Also must include directivity of receptors • Summary paper: • Rau, U., Bhatnagar, S., Voronkov, M.A., Cornwell, T.J. (2009). Advances in Calibration and Imaging Techniques in Radio Interferometry. Proceedings of the IEEE, Vol 97 (8), 1472–1481. • Intrinsically all-to-all or many-to-many • Depending on receptor type • No closed form solution • Iterative solutions exist • Low arithmetic intensity in core algorithm • SKA may allow single pass solution

IESP SKA October 2010

Monday, 18 October 2010 (Some) Approaches to wide field imaging

Approach How it works Memory CPU

Faceting Divide sky into small patches for which the 2D Low High transform is valid W stacking Apply w-dependent phase screen by High Low multiplication in image space W projection Apply w-dependent phase screen by Low High convolution in Fourier space Snapshot Partition by hour angle, zenith angle, FFT, Low High imaging interpolate to same image plane A projection Apply antenna beams by convolution in Low Low Fourier space AW projection Antenna beam and phase screen by Low High convolution in Fourier space A projection/W Phase screen in image/antenna beam in High Low stack Fourier space

IESP SKA October 2010

Monday, 18 October 2010 !"#$%&$'"(&)$*+,$&-"'./&0$!&,1"2#$",,"3$(&425#4$'"3$6"7&$!+'.812#5$!+414$16"1$&-!&&($"#$"99+,("%/&$ 9,"!12+#$+9$16&$1+1"/$.,+:&!1$!+41$&/+.&)$

;6&$4!2Ԍ!$.&,9+,'"#!&$+9$,"(2+$2#1&,9&,+'&1&,4$!"#$%&$"44&44&($"5"2#41$"$42'./&$6256&216&,$2'"52#5$+,$#+#<2'"52#5?$/2'21&($+#/3$%3$16&$2#6&,$ 4�$+9$16&$",,"3$@$AB827"//30$24$16&$#+24&$9/++,$2#$16&$92#"/$4!2&#!&$.,+(8!14$(&92#&($%3$16&$ WKHUPDO QRLVH OLPLW Ƴ160$ (&,27&($ 9,+'$ 16&$ 4341&'$ &B827"/$ 9/8-$ (ၵ$ >CA*D?$ +,$ EF;0$ +7&,$ ./"842%/&$2#1&5,"12+#4$2#$%"#(G2(16$"#($12'&0$+,$24$"$6256&,$%",,2&,$4&1$%3$!"#$%%&#'&($4341&'"12!$&,,+,4$@$$$

*+,$ 2'"52#5$ "../2!"12+#40$ 29$16&$.&"H$%,2561#&44$2#$"#$2'"5&$92&/($24$(&#+1&($%3 Sm 0$ 16&#$ 16&$ 1",5&1$16&,'"/$(3#"'2!$,"#5&$24=$

Sm 2''tf $ DRth ~ Sm K $ >I)I?$ V th SEFD

G6&,&$ 't 24$16&$2#1&5,"12+#$2#1&,7"/$+7&,$12'&0$ 'f 24$16&$%"#(G2(160$"#($Ƨa2  $24$"$4!"/2#5$ 9"!1+,$9+,$(25212J"12+#$+,$425#"/$.,+!&442#5$&99&!14)$K#$.,"!12!&0$1624$(3#"'2!$,"#5&$/2'21$G2//$#+1$%&$ "!62&7&($ 29$ 16&$ ("1"$ ",&$ !+,,8.1&($ %3$ !"#$%%&#'&($ 4341&'"12!$ &,,+,4$ ",242#5$ 9,+'$ 8#<'+(&/&($ .,+."5"12+#$+,$!"/2%,"12+#$&,,+,4$"/+#5$16&$425#"/$."16$>!+#7 +#"//3$(&92#&($"4$16&$."16$%&1G&&#$ 16&$2#!2($.+2#1$+9$16&$"41,+#+'2!"/$,"(2"12+#$"%+7&$A",16L4$"1'+4.6&,&$1+$16&$+Co-design in SKA 81.81$+9$16&$92#"/$ 2#41,8'"/$ %"!H<&#($ +#$ 16&$ ",,"3?$ +,$ %3$ 2'"5&$ 9+,'"12+#0$ 4+8,!&$ '+(&/$ ,&.,&4"12+#0$ +,$ (&!+#7+/812+#$&,,+,4)$E/16+856$16&3$",&$2'.+,1"#10$G&$(+$#+1$!+#42(&,$16&$/"11&,$16,&&$9"!1+,4$6&,&0$ 527&#$ 16&$• Kemball, 2#41,8'"/$ Cornwell, 9+!84$ +9$ 1624$ and '&'+$ Yashar +#$ !"/2%,"12+#$TDP CPG "#($ memo .,+!&442#5$ # 4 !+#41,"2#14$ 16"1$ "99&!1$ "#1&##"$"#($9&&($(&425#$.","'&1&,4)$

$ Antenna design elements e.g. $ SKA reference receptor, design, e.g. SKA technical receivers, receptor, array specifications, antenna etc.. $ configuration, SKA science e.g. sensitivity, receptor and requirements survey speed, receivers, signal $ time, spectral, Calibration and transport and and angular processing correlation, cal & resolution, etc. design elements $ processing, data e.g. calibration management and imaging !! algorithms, scalability, etc..

Design drivers

"#$%&'!CSIRO()*E$42'./292&($G"1&,9"//$(&.2!12+#$+9$16&$+7&,"//$4341&'$(&425#$.,+!&44$9+,$16&$CME)*;6&$ Monday, 18 October!+8./2#5$%&1G&&#$"#1&##"$(&425#$&/&'$"#($!"/2%,"12+#$"#($.,+!&442#5$(&425#$&/&' 2010 $24$46+G#$ &-./2!21/30$"4$24$16&2,$,+/&$"4$.+1 "/$(&425#$(,27&,4)$

C341&'"12!$ 2#41,8'"/$ &,,+,4$ (+$ #+1$ 2#1&5,"1&$ (+G#$ 2#$ "$ N 4&N$ 16&3$ !+'.,+'24&$ 16&$ "!62&7&($/&7&/$+9$,&42(8"/$#+24&$"#($!6"#5&$214$(241,2%812+#$1+$#+#

• ASKAP has taken multiple steps to reduce computing costs and risk:

• ASKAP controls entire end-to-end software stack • Focus on survey observations reduces complexity in software • The phased array feed decouples radiation from different parts of sky - changes all-to-all to many-to- many • The phased array feeds are kept fixed on the sky by a novel three-axis design for the antenna, thus obviating the need for software sky de-rotation and correction of scattering from the antenna focus supports. • The phased array feed elements are calibrated at the front-end using radiators on the antenna surface. • Astronomical calibration information is fed back to the telescope in real time, reducing some first order effects to second order.

IESP SKA October 2010

Monday, 18 October 2010 HPC challenges and lessons learned from ASKAP

• Streaming vs Batch processing • Telescope produces streams of data • No suitable battle-hardened, mature framework for handling high volume streaming data • Adopted traditional batch system controlled by DRMAA • Batch (or any buffered processing) definitely will not work for SKA • Processor-memory gap • Older processors limited by memory latency • Newer processors can saturate memory bus • Forced to shift from ~ 4GB/core to 1GB/core • ~ 2012 timescale • Also important to investigate low-power systems GPU, BG/? • But (rate of gridding)/Watts ~ constant • Vital to have multiple algorithms for same purpose

IESP SKA October 2010

Monday, 18 October 2010 Comparison of ASKAP and SKA

ASKAP SKA Observing model Principally surveys Surveys and targeted observations Data per hour 10 TB 10 - 100 PB Scale of 100TF – 1PF 10PF – 1EF processing Type of Iterative, almost Embarrassingly Single pass, almost EP with single parallelism Parallel (EP) with occasional reconciliation reconciliation Major costs Convolutional resampling, Convolutional resampling (or better) iteration Arithmetic Low Low Intensity Data flow nature Intrinsically streamed, buffering Intrinsically streamed, no buffering, no and strategy to fast file system, barriers at barriers, best try reconcilation points Development MPI/C++ ???? and deployment environment Calibration Multiple stages, iterative – Multiple stages, iterative in image space requires data/image transforms only

IESP SKA October 2010

Monday, 18 October 2010 Climbing Mount Exaflop

SKA phase 2

SKA phase 1

ASKAP

Pawsey Centre tests

NCI NF tests NCI Altix tests

Note that Flops numbers are not ASKAP dev cluster achieved - we actually get much lower efficiency because of memory bandwidth - so scaling is relative

IESP SKA October 2010

Monday, 18 October 2010 Does SKA meet the criteria for CDV

Petascale or near-petascale application today •~ 100TF by end 2010 with a demonstrated need for exascale •~ 100TF, 100PB/year by 2012 performance •~ 1PF by 2014 Significant scientific goals in an area that is •Astronomy is widely recognised as a societal expected to be a scientific or societal driver for good and technology driver exascale computing •Demonstrated by successful funding

Realistic and a definable set of steps to •Pathfinders (e.g. ASKAP, LOFAR) exascale that can be mapped out over 10 •SKA1 years or less •SKA2 Community experienced in algorithm, software •Community is experienced and/or hardware developments and is willing to •Exascale is necessary for science goals engage in the exascale co-design process. •Strongly motivated to participate in co-design Modular and open enough to stimulate the •Astronomy is very open, nearly all OSS development of additional modules addressing •Design, software shared across community related questions in the area

Fills a slot in the portfolio of extreme scale •Algorithms (near EP and non EP) application needed to test all these dimensions •Data graphs, flow, and management

IESP SKA October 2010

Monday, 18 October 2010 Conclusion

• Next steps • SKA now in initial costing phase, ending late 2011 • Enter design and development phase in 2012 • Construction of SKA1 starts 2016: 350M€ globally • Construction of SKA2 starts ~ 2020: 2G€ globally

• SKA yearly meeting next week in Oxford • I will report on IESP

• Why is SKA an important co-design vehicle? • Data-centric • Data management • Substantial opportunities for tailoring algorithms to hardware • Multiple opportunities for prototyping on real systems • One of the largest, most visible scientific instruments in this decade • Timescale matched to IESP

IESP SKA October 2010

Monday, 18 October 2010 ATNF/ASKAP Tim Cornwell ASKAP Computing Project Lead

Phone: +61 2 9372 4261 Email: tim.cornwell@.au Web: www.atnf.csiro.au

Thank you

Contact Us Phone: 1300 363 400 or +61 3 9545 2176 Email: [email protected] Web: www.csiro.au

Monday, 18 October 2010 Radio Astronomy: Square Kilometre Array Tim Cornwell – CSIRO Astronomy and Space Science [email protected]

Scientific and computational challenges Software issues – 2012-2020

• Formation of the first stars; Cosmology and • Management of very large data flows galaxy evolution; Tests of GR; Cosmic • Software/algorithm solutions to widening magnetism; Cradle of life flops vs memory-bandwidth gap • Imaging, source finding, correlation, beam • Development of new algorithms or methods forming of processing data • 2012 (ASKAP): 36 dishes, 6192 feeds • Scaling existing techniques to 100,000,000+ • 2016-2020 (SKA1): ~ 250 dishes cores is not viable •“One-touch processing”

Software issues – 2010-2012 Impact of last machine change (10’s of Gflops to 100 Tflops)

• Adapting software stack aimed at batch/ • Existing algorithms still useful, only simulation type jobs to real-time data adaptations necessary (1GB memory/core) processing • Required multi-disciplinary teams (HPC + • Low computational intensity of algorithms, science domain expertise) limited by memory-bandwidth • Problem is mostly embarrassingly parallel to • High input data rates (O)10,000 cores • Managing large datasets (parallel-IO, • Optimizing I/O patterns critical archiving, transfer)

IESP AprilSKA 2010October 2010

Monday, 18 October 2010 Demonstration of full range of SKA imaging

10 kpc 1 pc

Burns & Clark Tingay et al. 600 kpc

LBA + ASKAP + Warkworth observations of nuclear region of Centaurus A at 21cm.

Feain, Cornwell, Ekers IESP SKA October 2010

Monday, 18 October 2010