TY - JOUR
T1 - Anticancer Peptides Classification Using Kernel Sparse Representation Classifier
AU - Fazal, Ehtisham
AU - Ibrahim, Muhammad Sohail
AU - Park, Seongyong
AU - Naseem, Imran
AU - Wahab, Abdul
PY - 2023
Y1 - 2023/2/20
AB - Cancer is one of the most challenging diseases because of its complexity, variability, and diversity of causes. It has been a major research topic for decades, yet it remains poorly understood, which makes multifaceted therapeutic frameworks indispensable. Anticancer peptides (ACPs) are the most promising treatment option, but their large-scale identification and synthesis require reliable prediction methods, which remain an open problem. In this paper, we present an intuitive classification strategy that differs from traditional black-box methods and is based on the well-known statistical theory of sparse-representation classification (SRC). Specifically, we create over-complete dictionary matrices by embedding the composition of the K-spaced amino acid pairs (CKSAAP). Unlike traditional SRC frameworks, we use an efficient matching pursuit solver instead of the computationally expensive basis pursuit solver. Furthermore, kernel principal component analysis (KPCA) is employed to cope with non-linearity and to reduce the dimension of the feature space, whereas the synthetic minority oversampling technique (SMOTE) is used to balance the dictionary. The proposed method is evaluated on two benchmark datasets using well-known statistical measures and is found to outperform the existing methods. The results show the highest sensitivity with the most balanced accuracy, which might be beneficial in understanding structural and chemical aspects and developing new ACPs. A Google Colab implementation of the proposed method is available on GitHub (https://github.com/ehtisham-Fazal/ACP-Kernel-SRC).
KW - Amino acid composition (AAC)
KW - Amino acids
KW - Cancer
KW - Dictionaries
KW - Encoding
KW - Peptides
KW - Principal component analysis
KW - anticancer peptide (ACP)
KW - composition of the K-spaced amino acid pairs (CKSAAP)
KW - kernel sparse reconstruction classification (KSRC)
KW - matching pursuit (MP)
KW - over-complete dictionary (OCD)
KW - Sample-specific classification
UR - https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=uwapure5-25&SrcAuth=WosAPI&KeyUT=WOS:000945278000001&DestLinkType=FullRecord&DestApp=WOS
U2 - 10.1109/ACCESS.2023.3246927
DO - 10.1109/ACCESS.2023.3246927
M3 - Article
SN - 2169-3536
VL - 11
SP - 17626
EP - 17637
JO - IEEE Access
JF - IEEE Access
ER -