Fast and scalable subspace clustering of high dimensional data

Amardeep Kaur

    Research output: Thesis › Doctoral Thesis


    This thesis focuses on finding clusters and outliers in the subspaces of high-dimensional data. A subspace is a subset of the data dimensions. The number of subspaces grows exponentially with the data dimensionality (d dimensions give 2^d - 1 non-empty subspaces), which makes exhaustive enumeration of subspaces impractical. We propose SUBSCALE, a scalable and efficient subspace clustering algorithm that finds non-redundant clusters without using expensive indexing structures or performing multiple data scans. Parallel and distributed implementations further improve the performance of the SUBSCALE algorithm. Finally, we extend the SUBSCALE algorithm to find subspace outliers and rank them by the strength of their outlying behaviour.
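
    The exponential growth in the number of subspaces can be illustrated with a minimal Python sketch. It is not part of the SUBSCALE algorithm; the function names `enumerate_subspaces` and `count_subspaces` and the dimension labels are hypothetical, chosen only for this illustration. The sketch treats a subspace as any non-empty subset of the dimension labels.

```python
from itertools import combinations


def enumerate_subspaces(dimensions):
    """Yield every non-empty subset of the given dimension labels.

    Illustrative helper only (hypothetical name, not from the thesis).
    """
    for size in range(1, len(dimensions) + 1):
        for subset in combinations(dimensions, size):
            yield subset


def count_subspaces(d):
    """Return the number of non-empty subsets of d dimensions, 2^d - 1."""
    return 2 ** d - 1


# Three dimensions are small enough to list every subspace explicitly.
subspaces = list(enumerate_subspaces(["x", "y", "z"]))
for subspace in subspaces:
    print(subspace)

# The closed-form count shows how quickly brute-force enumeration grows.
for d in (3, 10, 20, 50):
    print(d, count_subspaces(d))
```

    Listing the subspaces is only feasible for very small d, which is why a brute-force search over all subspaces does not scale to high-dimensional data.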
    Original language: English
    Qualification: Doctor of Philosophy
    Awarding Institution
    • The University of Western Australia
    Supervisors
    • Datta, Amitava
    • McDonald, Chris
    Award date: 5 Oct 2016
    Publication status: Unpublished - 2016

