Fast and scalable subspace clustering of high dimensional data

    Research output: Thesis › Doctoral Thesis


    Abstract

    This thesis focusses on finding clusters and outliers in the subspaces of high-dimensional data. A subspace is a subset of the data dimensions. The number of subspaces grows exponentially with the data dimensionality, which makes an exhaustive search of all subspaces impractical. We propose SUBSCALE, a scalable and efficient subspace clustering algorithm that finds non-redundant clusters without using expensive indexing structures or performing multiple data scans. We further improve the performance of the SUBSCALE algorithm with parallel and distributed implementations. Finally, we extend the SUBSCALE algorithm to find subspace outliers and rank them by the strength of their outlying behaviour.
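
    To make the exponential growth concrete: a dataset with d dimensions has 2^d - 1 non-empty subspaces. The short Python sketch below only enumerates and counts subspaces. It is an illustration of the search-space size, not part of the SUBSCALE algorithm, and the function names and dimension labels are hypothetical.

```python
from itertools import combinations


def enumerate_subspaces(dims):
    """Yield every non-empty subset of the given dimensions (hypothetical helper)."""
    for k in range(1, len(dims) + 1):
        yield from combinations(dims, k)


def count_subspaces(d):
    """Number of non-empty subspaces of a d-dimensional dataset: 2^d - 1."""
    return 2 ** d - 1


# Three dimensions already give seven candidate subspaces.
subspaces = list(enumerate_subspaces(["x", "y", "z"]))
print(len(subspaces))       # 7
print(count_subspaces(20))  # 1048575
```

    At 20 dimensions there are already more than a million subspaces to consider, which is why examining each one separately does not scale.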
    Original language: English
    Qualification: Doctor of Philosophy
    Awarding Institution
    • The University of Western Australia
    Supervisors/Advisors
    • Datta, Amitava, Supervisor
    • McDonald, Chris, Supervisor
    Award date: 5 Oct 2016
    Publication status: Unpublished - 2016
