The performance impact of data modelling for large scale astronomical data analysis on distributed and parallel platforms

Geoffrey Duniam

Research output: Thesis (Doctoral Thesis)



The new generation of large-scale telescopes will produce thousands of data products, from hundreds of gigabytes to multiple terabytes in size, that increase in complexity as they increase in size. Data products at this scale present several challenges that must be addressed before the data can yield science outputs. This thesis investigates new techniques to extract, transform, model and load data from large-scale raw data products into efficient data structures for storage and retrieval on an open-source parallel storage and processing platform. These approaches are investigated and demonstrated through three separate use cases working with astronomical data.
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • The University of Western Australia
Supervisors
  • Wu, Chen, Supervisor
  • Kitaeff, Slava, Supervisor
  • Wicenec, Andreas, Supervisor
Thesis sponsors
Award date: 4 Oct 2023
Publication status: Unpublished - 2023
