Abstract
The Square Kilometre Array Observatory (SKAO) will explore the radio sky to new depths in order to conduct transformational science. SKAO data products made available to astronomers will be correspondingly large and complex, requiring the application of advanced analysis techniques to extract key science findings. To this end, SKAO is conducting a series of Science Data Challenges, each designed to familiarize the scientific community with SKAO data and to drive the development of new analysis techniques. We present the results from Science Data Challenge 2 (SDC2), which invited participants to find and characterize 233 245 neutral hydrogen (H I) sources in a simulated data product representing a 2000 h SKA-Mid spectral line observation from redshifts 0.25–0.5. Through the generous support of eight international supercomputing facilities, participants were able to undertake the Challenge using dedicated computational resources. Alongside the main challenge, 'reproducibility awards' were made in recognition of those pipelines which demonstrated Open Science best practice. The Challenge saw over 100 participants develop a range of new and existing techniques, with results that highlight the strengths of multidisciplinary and collaborative effort. The winning strategy, which combined predictions from two independent machine learning techniques to yield a 20 per cent improvement in overall performance, underscores one of the main Challenge outcomes: that of method complementarity. It is likely that the combination of methods in a so-called ensemble approach will be key to exploiting very large astronomical data sets.
Original language | English |
---|---|
Pages (from-to) | 1967-1993 |
Number of pages | 27 |
Journal | Monthly Notices of the Royal Astronomical Society |
Volume | 523 |
Issue number | 2 |
DOIs | https://doi.org/10.1093/mnras/stad1375 |
Publication status | Published - Aug 2023 |
In: Monthly Notices of the Royal Astronomical Society, Vol. 523, No. 2, 08.2023, p. 1967-1993.
Research output: Contribution to journal › Article › peer-review
TY - JOUR
T1 - SKA Science Data Challenge 2
T2 - analysis and results
AU - Hartley, P.
AU - Bonaldi, A.
AU - Braun, R.
AU - Aditya, J. N.H.S.
AU - Aicardi, S.
AU - Alegre, L.
AU - Chakraborty, A.
AU - Chen, X.
AU - Choudhuri, S.
AU - Clarke, A. O.
AU - Coles, J.
AU - Collinson, J. S.
AU - Cornu, D.
AU - Darriba, L.
AU - Delli Veneri, M.
AU - Forbrich, J.
AU - Fraga, B.
AU - Galan, A.
AU - Garrido, J.
AU - Gubanov, F.
AU - Hakansson, H.
AU - Hardcastle, M. J.
AU - Heneka, C.
AU - Herranz, D.
AU - Hess, K. M.
AU - Jagannath, M.
AU - Jaiswal, S.
AU - Jurek, R. J.
AU - Korber, D.
AU - Kitaeff, S.
AU - Kleiner, D.
AU - Lao, B.
AU - Lu, X.
AU - Mazumder, A.
AU - Moldón, J.
AU - Mondal, R.
AU - Ni, S.
AU - Önnheim, M.
AU - Parra, M.
AU - Patra, N.
AU - Peel, A.
AU - Salomé, P.
AU - Sánchez-Expósito, S.
AU - Sargent, M.
AU - Semelin, B.
AU - Serra, P.
AU - Shaw, A. K.
AU - Shen, A. X.
AU - Sjöberg, A.
AU - Smith, L.
AU - Soroka, A.
AU - Stolyarov, V.
AU - Tolley, E.
AU - Toribio, M. C.
AU - van der Hulst, J. M.
AU - Vafaei Sadr, A.
AU - Verdes-Montenegro, L.
AU - Westmeier, T.
AU - Yu, K.
AU - Yu, L.
AU - Zhang, L.
AU - Zhang, X.
AU - Zhang, Y.
AU - Alberdi, A.
AU - Ashdown, M.
AU - Bom, C. R.
AU - Brüggen, M.
AU - Cannon, J.
AU - Chen, R.
AU - Combes, F.
AU - Conway, J.
AU - Courbin, F.
AU - Ding, J.
AU - Fourestey, G.
AU - Freundlich, J.
AU - Gao, L.
AU - Gheller, C.
AU - Guo, Q.
AU - Gustavsson, E.
AU - Jirstrand, M.
AU - Jones, M. G.
AU - Józsa, G.
AU - Kamphuis, P.
AU - Kneib, J. P.
AU - Lindqvist, M.
AU - Liu, B.
AU - Liu, Y.
AU - Mao, Y.
AU - Marchal, A.
AU - Meshcheryakov, A.
AU - Olberg, M.
AU - Oozeer, N.
AU - Pandey-Pommier, M.
AU - Pei, W.
AU - Peng, B.
AU - Sabater, J.
AU - Sorgho, A.
AU - Starck, J. L.
AU - Tasse, C.
AU - Wang, A.
AU - Wang, Y.
AU - Xi, H.
AU - Yang, X.
AU - Zhang, H.
AU - Zhang, J.
AU - Zhao, M.
AU - Zuo, S.
AU - Marquez, I.
N1 - Funding Information: We would like to thank members of the SKAO HI Science Working Group for useful feedback. We are grateful for helpful discussions with the Software Sustainability Institute. The simulations make use of data from WSRT HALOGAS-DR1. The Westerbork Synthesis Radio Telescope is operated by ASTRON (Netherlands Institute for Radio Astronomy) with support from the Netherlands Foundation for Scientific Research NWO. The work also made use of ‘THINGS’, the HI Nearby Galaxy Survey (Walter et al. ), data products from which were kindly provided to us by Erwin de Blok after multiscale beam deconvolution performed by Elias Brinks. We would like to thank INAF for the hosting of SDC2 data products. LA is grateful for the support from UK STFC via the CDT studentship grant ST/P006809/1. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 679627; project name FORNAX). JMvdH and KMH acknowledge support from the European Research Council under the European Union 7th Framework Programme (FP/2007–2013)/ERC Grant Agreement no. 291531 (HIStoryNU). The work of the NAOC-Tianlai team members has been supported by the National Key R&D Program grants 2018YFE0120800, 2017YFA0402603, 2018YFA0404504, 2018YFA9494691, the National Natural Science Foundation of China (NSFC) grants 11633004, 11975072, 11835009, 11890691, 12033008, the Chinese Academy of Sciences (CAS) QYZDJ-SSW-SLH017, JCTD-2019-05, and the China Manned Space Projects CMS-CSST-2021-A03, CMS-CSST-2021-B01. Team FORSKA-Sweden acknowledges support from Onsala Space Observatory for the provisioning of its facilities support. The Onsala Space Observatory national research infrastructure is funded through the Swedish Research Council (grant No. 2017–00648). Team FORSKA-Sweden also acknowledges support from the Fraunhofer Cluster of Excellence Cognitive Internet Technologies.
CH and MB acknowledge support by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2121 ‘Quantum Universe’ – 390833306. MP acknowledges the support of the CEFIPRA foundation under project 6504–3. We acknowledge financial support from SEV-2017-0709, CEX2021-001131-S, AEI/ 10.13039/501100011033. LD, JG, KMH, JM, MP, SSE, LVM, AS from RTI2018-096228-B-C31, PID2021-123930OB-C21 AEI/ 10.13039/501100011033 FEDER, UE. LVM, JG, SSE acknowledge the European Science Cluster of Astronomy and Particle Physics ESFRI Research Infrastructures project that has received funding from the European Union’s Horizon 2020 research and innovation program under Grant Agreement No. 824064. LVM, JG, and JM from RED2018-102587-T AEI/ 10.13039/501100011033. LVM, JG, SSE, JM acknowledge financial support from the grant IAA4SKA (Ref. R18-RT-3082) from the Economic Transformation, Industry, Knowledge and Universities Council of the Regional Government of Andalusia and the ERDF from the EU, TED2021-130231B-I00 AEI/ 10.13039/501100011033 EU NextGenerationEU/PRTR. LVM, JG, KMH acknowledge financial support from the coordination of the participation in SKA-SPAIN, funded by the Ministry of Science and Innovation (MCIN). LD from PTA2018-015980-I AEI/ 10.13039/501100011033. MP from the grant DOC01497 funded by the Economic Transformation, Industry, Knowledge and Universities Council of the Regional Government of Andalusia and by the Operational Program ESF Andalucía 2014–2020. MTS acknowledges support from a Scientific Exchanges visitor fellowship (IZSEZO_202357) from the Swiss National Science Foundation. AVS thanks Martin Kunz and Bruce Bassett for the valuable discussions. Team Spardha would like to acknowledge SKA India Consortium, IUCAA and Raman Research Institute for providing support with computing facilities.
Team Spardha would also like to acknowledge the National Supercomputing Mission (NSM) for providing computing resources of ‘PARAM Shakti’ at IIT Kharagpur, which is implemented by C-DAC and supported by the Ministry of Electronics and Information Technology (MeitY) and Department of Science and Technology (DST), Government of India. Funding Information: We would like to make a special acknowledgment of the very generous support from the SDC2 computing partner facilities (Section ), without which a realistic and accessible Challenge would not have been possible. We acknowledge support from the Australian SKA Regional Centre (AusSRC) and the Pawsey Supercomputing Centre. This work was granted access to the HPC/AI resources of IDRIS under the allocations AP010412412, AP010412365, and AP010412404 made by GENCI. The authors acknowledge use of IRIS (https://www.iris.ac.uk) resources delivered by the SCD Cloud at STFC’s Rutherford Appleton Laboratory (https://www.scd.stfc.ac.uk/Pages/STFC-Cloud-Operations.aspx). This work was supported by a grant from the Swiss National Supercomputing Centre (CSCS). This work used resources of China SKA Regional Centre prototype (An et al. ; ) funded by the National Key R&D Programme of China (2018YFA0404603) and Chinese Academy of Sciences (114231KYSB20170003). The Enabling Green E-science for the Square Kilometre Array Research Infrastructure (ENGAGE-SKA) team acknowledges financial support from grant POCI-01-0145- FEDER022217, funded by Programa Operacional Competitividade e Internacionalização (COMPETE 2020) and the Fundação para a Ciência e a Tecnologia (FCT), Portugal. This work was also funded by FCT and Ministério da Ciência, Tecnologia e Ensino Superior (MCTES) through national funds and when applicable co-funded by EU funds under the project UIDB/50008/2020-UIDP/50008/2020 and UID/EEA/50008/2019.
The authors acknowledge the Laboratory for Advanced Computing at University of Coimbra for providing HPC, computing, and consulting resources that have contributed to the research results reported within this paper or work. This work used the Spanish Prototype of an SRC (SPSRC) at IAA-CSIC, which is funded by SEV-2017-0709, CEX2021-001131-S, RTI2018-096228-B-C31 AEI/ 10.13039/501100011033, EQC2019-005707-P AEI/ 10.13039/501100011033 ERDF, EU, TED2021-130231B-I00 AEI/ 10.13039/501100011033 EU NextGenerationEU, PRTR SOMM17_5208_IAA funded by the Regional Government of Andalusia. We acknowledge the computing infrastructures of INAF, under the coordination of the ICT office of Scientific Directorate, for the availability of computing resources and support. Publisher Copyright: © 2023 The Author(s). Published by Oxford University Press on behalf of Royal Astronomical Society.
PY - 2023/8
Y1 - 2023/8
KW - galaxies: statistics
KW - methods: data analysis
KW - radio lines: galaxies
KW - software: simulations
KW - surveys
KW - techniques: imaging spectroscopy
UR - http://www.scopus.com/inward/record.url?scp=85162093450&partnerID=8YFLogxK
U2 - 10.1093/mnras/stad1375
DO - 10.1093/mnras/stad1375
M3 - Article
AN - SCOPUS:85162093450
SN - 0035-8711
VL - 523
SP - 1967
EP - 1993
JO - Monthly Notices of the Royal Astronomical Society
JF - Monthly Notices of the Royal Astronomical Society
IS - 2
ER -