Abstract
© 2014 IEEE. The aim of subspace clustering is to find groups of similar data points in all possible subspaces of a dataset. Since the number of subspaces is exponential in the number of dimensions, subspace clustering is usually computationally very expensive. The performance of existing algorithms deteriorates drastically as the number of dimensions increases. Most of them use a bottom-up search strategy, and there are two main reasons for their inefficiency: (1) multiple database scans and (2) either implicit or explicit generation of trivial subspace clusters during the process. We present SUBSCALE, a novel algorithm that directly finds the non-trivial subspace clusters with minimal cost and requires only k database scans for a k-dimensional data set. Our algorithm scales very well with the dimensionality and is highly parallelizable. The experimental evaluation has shown promising results.
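To make the exponential growth mentioned in the abstract concrete, the following is a minimal illustrative sketch and not part of SUBSCALE itself. It assumes that "subspaces" means axis-parallel subspaces, that is, non-empty subsets of the k dimensions, of which there are 2^k - 1. The function names `count_subspaces` and `enumerate_subspaces` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def count_subspaces(k: int) -> int:
    """Number of non-empty subsets of k dimensions (assumed axis-parallel subspaces)."""
    return 2 ** k - 1


def enumerate_subspaces(k: int):
    """Yield every non-empty subset of dimension indices 0..k-1, smallest subsets first."""
    dims = range(k)
    for size in range(1, k + 1):
        yield from combinations(dims, size)


if __name__ == "__main__":
    # The candidate count grows exponentially, while the abstract's
    # claimed number of database scans grows only linearly (k).
    for k in (3, 10, 20, 50):
        print(f"k={k}: {count_subspaces(k)} candidate subspaces, {k} scans claimed")
```

Under this assumption, exhaustively enumerating the candidates becomes infeasible well before k reaches 50, which is why the number of database scans matters as much as the search strategy.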
Original language | English |
---|---|
Title of host publication | Proceedings 2014 IEEE International Conference on Data Mining Workshop (ICDMW) |
Place of Publication | New Jersey, USA |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 621-628 |
Volume | N/A |
ISBN (Electronic) | 9781479942756 |
ISBN (Print) | 9781479942749 |
Publication status | Published - 2015 |
Event | 2014 IEEE International Conference on Data Mining Workshop, Shenzhen, China; duration: 14 Dec 2014 → 14 Dec 2014 |
Workshop
Workshop | 2014 IEEE International Conference on Data Mining Workshop |
---|---|
Abbreviated title | ICDMW |
Country/Territory | China |
City | Shenzhen |
Period | 14 Dec 2014 → 14 Dec 2014 |