TY - UNPB
T1 - Optimising the Processing and Storage of Visibilities using lossy compression
AU - Dodson, Richard
AU - Williamson, Alex
AU - Gong, Qian
AU - Elahi, Pascal
AU - Wicenec, Andreas
AU - Rioja, Maria J.
AU - Chen, Jieyang
AU - Podhorszki, Norbert
AU - Klasky, Scott
N1 - 10 figures
PY - 2024/10/21
Y1 - 2024/10/21
AB - Next-generation radio astronomy instruments are providing a massive increase in sensitivity and coverage, through more stations in the array and a wider frequency span. Two primary problems encountered when processing the resultant avalanche of data are the demands on storage and I/O. An example is the data deluge expected from the SKA Telescopes of more than 60 PB per day, all to be stored on the buffer filesystem. Compressing the data is an obvious solution. We used MGARD, an error-controlled compressor, and applied it to simulated and real visibility data, in noise-free and noise-dominated regimes. Because the data have an implicit error level set by the system temperature, an error bound in compression provides a natural metric for compression. By measuring the degradation of images reconstructed from the lossy compressed data, we explore, through a series of experiments, the trade-off between these error bounds and the corresponding compression ratios, as well as the impact on the science quality of the lossy compressed data products. We studied the global and local impacts on the output images. We found that relative error bounds of as much as $10\%$, which provide compression ratios of about 20, have a limited impact on continuum imaging, as the increased noise is less than the image RMS. For extremely sensitive observations and for very precious data, we would recommend a $0.1\%$ error bound, with compression ratios of about 4. This setting has noise impacts two orders of magnitude below the image RMS levels. At these levels, the limits are due to instabilities in the deconvolution methods. We compared the results to the alternative compression tool DYSCO. MGARD provides better compression for similar results, and has a host of potentially powerful additional features.
KW - astro-ph.IM
M3 - Preprint
BT - Optimising the Processing and Storage of Visibilities using lossy compression
PB - arXiv
CY - USA
ER -