TY - JOUR
T1 - Efficient Decomposition Selection for Multi-class Classification
AU - Chen, Yawen
AU - Wen, Zeyi
AU - He, Bingsheng
AU - Chen, Jian
PY - 2023/4/1
Y1 - 2023/4/1
N2 - Choosing a decomposition method for multi-class classification involves an important trade-off between efficiency and predictive accuracy. Trying all the decomposition methods to find the best one is too time-consuming for many applications, while choosing the wrong one may result in a large loss of predictive accuracy. In this paper, we propose an automatic decomposition method selection approach called "D-Chooser", which is lightweight and can choose the best decomposition method accurately. D-Chooser is equipped with our proposed difficulty index, which consists of sub-metrics including distribution divergence, overlapping regions, unevenness degree, and relative size of the solution space. The difficulty index has two intriguing properties: 1) it is fast to compute and 2) it measures multi-class problems comprehensively. Extensive experiments on real-world multi-class problems show that D-Chooser achieves an accuracy of 83.3% in choosing the best decomposition method. It can choose the best method in just a few seconds, whereas existing approaches often take a few hours to verify the effectiveness of a decomposition method. We also provide case studies on Kaggle competitions, and the results confirm that D-Chooser is able to choose a better decomposition method than the winning solutions.
AB - Choosing a decomposition method for multi-class classification involves an important trade-off between efficiency and predictive accuracy. Trying all the decomposition methods to find the best one is too time-consuming for many applications, while choosing the wrong one may result in a large loss of predictive accuracy. In this paper, we propose an automatic decomposition method selection approach called "D-Chooser", which is lightweight and can choose the best decomposition method accurately. D-Chooser is equipped with our proposed difficulty index, which consists of sub-metrics including distribution divergence, overlapping regions, unevenness degree, and relative size of the solution space. The difficulty index has two intriguing properties: 1) it is fast to compute and 2) it measures multi-class problems comprehensively. Extensive experiments on real-world multi-class problems show that D-Chooser achieves an accuracy of 83.3% in choosing the best decomposition method. It can choose the best method in just a few seconds, whereas existing approaches often take a few hours to verify the effectiveness of a decomposition method. We also provide case studies on Kaggle competitions, and the results confirm that D-Chooser is able to choose a better decomposition method than the winning solutions.
KW - Codes
KW - Decomposition method
KW - Indexes
KW - Kernel
KW - Machine learning
KW - Matrix decomposition
KW - Multi-class classification
KW - Probability distribution
KW - Support vector machines
KW - Training
UR - http://www.scopus.com/inward/record.url?scp=85121382110&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2021.3130239
DO - 10.1109/TKDE.2021.3130239
M3 - Article
AN - SCOPUS:85121382110
SN - 1041-4347
VL - 35
SP - 3751
EP - 3764
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 4
ER -