Comparison of three statistical classification techniques for maser identification

  • Ellen M. Manning
  • Barbara R. Holland
  • Simon P. Ellingsen
  • Shari L. Breen
  • Xi Chen
  • Melissa Humphries

Research output: Contribution to journal › Article › peer-review

Abstract

We applied three statistical classification techniques, namely linear discriminant analysis (LDA), logistic regression, and random forests, to three astronomical datasets associated with searches for interstellar masers. We compared the performance of these methods in identifying whether specific mid-infrared or millimetre continuum sources are likely to have associated interstellar masers. We also discuss the interpretability of the results of each classification technique. Non-parametric methods have the potential to make accurate predictions when there are complex relationships between critical parameters. We found that for the small datasets the parametric methods, logistic regression and LDA, performed best, whereas for the largest dataset the non-parametric method of random forests performed with accuracy comparable to that of the parametric techniques rather than offering any significant improvement. This suggests that, at least for the specific examples investigated here, the accuracy of the predictions obtained is not being limited by the use of parametric models. We also found that for LDA, transformation of the data to match a normal distribution led to a significant improvement in accuracy. The different classification techniques had significant overlap in their predictions; further astronomical observations will enable the accuracy of these predictions to be tested.
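
The point about transforming data toward normality before applying LDA can be illustrated with a minimal sketch. It is not the authors' code or data: it uses a single synthetic, log-normally distributed feature, a hand-written two-class LDA, and a log transform chosen purely for illustration. All function names (`fit_lda_1d`, `predict_lda_1d`, `make_sample`, `demo`) and the distribution parameters are hypothetical.

```python
import math
import random
from statistics import mean


def fit_lda_1d(xs, ys):
    """Fit two-class LDA on one feature: class means, pooled variance, priors."""
    x0 = [x for x, y in zip(xs, ys) if y == 0]
    x1 = [x for x, y in zip(xs, ys) if y == 1]
    mu0, mu1 = mean(x0), mean(x1)
    # Pooled within-class variance (LDA assumes both classes share it).
    ss = sum((x - mu0) ** 2 for x in x0) + sum((x - mu1) ** 2 for x in x1)
    var = ss / (len(xs) - 2)
    return mu0, mu1, var, len(x0) / len(xs), len(x1) / len(xs)


def predict_lda_1d(model, x):
    """Return the class (0 or 1) with the larger linear discriminant score."""
    mu0, mu1, var, p0, p1 = model
    d0 = x * mu0 / var - mu0 ** 2 / (2 * var) + math.log(p0)
    d1 = x * mu1 / var - mu1 ** 2 / (2 * var) + math.log(p1)
    return 1 if d1 > d0 else 0


def accuracy(model, xs, ys):
    """Fraction of samples whose predicted class matches the label."""
    hits = sum(predict_lda_1d(model, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)


def make_sample(rng, n):
    """Synthetic, skewed (log-normal) feature values for two classes (assumed)."""
    xs = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
    xs += [rng.lognormvariate(1.5, 1.0) for _ in range(n)]
    ys = [0] * n + [1] * n
    return xs, ys


def demo(seed=0):
    """Compare held-out LDA accuracy on raw versus log-transformed features."""
    rng = random.Random(seed)
    train_x, train_y = make_sample(rng, 200)
    test_x, test_y = make_sample(rng, 200)
    raw = accuracy(fit_lda_1d(train_x, train_y), test_x, test_y)
    log_train = [math.log(x) for x in train_x]
    log_test = [math.log(x) for x in test_x]
    logged = accuracy(fit_lda_1d(log_train, train_y), log_test, test_y)
    return raw, logged


if __name__ == "__main__":
    raw_acc, log_acc = demo()
    print(f"raw features: {raw_acc:.3f}, log-transformed: {log_acc:.3f}")
```

On skewed data of this kind, a log transform brings each class closer to the equal-variance Gaussian shape that LDA assumes. Whether accuracy actually improves on a given dataset has to be checked empirically, for example on held-out data as in `demo`.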

Original language: English
Article number: e015
Journal: Publications of the Astronomical Society of Australia
Volume: 33
Publication status: Published - 1 Jan 2016
Externally published: Yes
