TY - JOUR
T1 - Enhancing object recognition
T2 - The role of object knowledge decomposition and component-labeled datasets
AU - Xiong, Nuoye
AU - Wang, Ning
AU - Li, Hongsheng
AU - Zhu, Guangming
AU - Zhang, Liang
AU - Shah, Syed Afaq Ali
AU - Bennamoun, Mohammed
PY - 2025/2/7
Y1 - 2025/2/7
N2 - Deep learning models' decision-making processes can be elusive, often raising concerns about their reliability. To address this, we have introduced the Object Knowledge Decomposition and Components Label Dataset (OKD-CL), designed to improve the interpretability and accuracy of object recognition models. This dataset includes 99 categories from PartImageNet, each detailed with clear physical structures that align with human visual concepts. In a hierarchical structure, every category is described by Abstract Component Knowledge (ACK) descriptions and each image instance comes with Explicit Visual Knowledge (EVK) masks, highlighting the visual components' appearance. By evaluating multiple deep neural networks guided with ACK and EVK (dual-knowledge-guidance approach), we saw better accuracy and a higher Foreground Reasoning Ratio (FRR), confirming our knowledge-guided method's effectiveness. When used on the Hard-ImageNet dataset, this approach reduced the model's reliance on incorrect feature assumptions without sacrificing classification accuracy. This hierarchical comprehension encouraged by OKD-CL is crucial in minimizing incorrect feature associations and strengthening model robustness. The entire code and dataset are available on: https://github.com/XiGuaBo/OKD-CL.
AB - Deep learning models' decision-making processes can be elusive, often raising concerns about their reliability. To address this, we have introduced the Object Knowledge Decomposition and Components Label Dataset (OKD-CL), designed to improve the interpretability and accuracy of object recognition models. This dataset includes 99 categories from PartImageNet, each detailed with clear physical structures that align with human visual concepts. In a hierarchical structure, every category is described by Abstract Component Knowledge (ACK) descriptions and each image instance comes with Explicit Visual Knowledge (EVK) masks, highlighting the visual components' appearance. By evaluating multiple deep neural networks guided with ACK and EVK (dual-knowledge-guidance approach), we saw better accuracy and a higher Foreground Reasoning Ratio (FRR), confirming our knowledge-guided method's effectiveness. When used on the Hard-ImageNet dataset, this approach reduced the model's reliance on incorrect feature assumptions without sacrificing classification accuracy. This hierarchical comprehension encouraged by OKD-CL is crucial in minimizing incorrect feature associations and strengthening model robustness. The entire code and dataset are available on: https://github.com/XiGuaBo/OKD-CL.
KW - Classification enhancement
KW - Human knowledge
KW - Image classification
KW - Neural networks explained
UR - https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=uwapure5-25&SrcAuth=WosAPI&KeyUT=WOS:001367559600001&DestLinkType=FullRecord&DestApp=WOS_CPL
U2 - 10.1016/j.neucom.2024.128969
DO - 10.1016/j.neucom.2024.128969
M3 - Article
SN - 0925-2312
VL - 617
JO - Neurocomputing
JF - Neurocomputing
M1 - 128969
ER -