TY - CONF
T1 - Spatial Consistency and Feature Diversity Regularization in Transfer Learning for Fine-Grained Visual Categorization
AU - Dai, Zhigang
AU - Chen, Junying
AU - Mian, Ajmal
N1 - Funding Information:
This work was supported in part by the National Natural Science Foundation of China under Grant 61802130, and in part by the Guangdong Natural Science Foundation under Grants 2019A1515012152 and 2021A1515012651. Ajmal Mian is the recipient of an Australian Research Council Future Fellowship (project number FT210100268) funded by the Australian Government.
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Fine-grained visual categorization must cope with limited training data, making it essential to localize discriminative regions and learn diverse features. We propose an effective regularization method that simultaneously imposes spatial consistency and feature diversity on CNN feature maps from a unified perspective. The former guides different feature map channels to concentrate collaboratively on the discriminative areas, while the latter ensures that the feature maps are diverse. The proposed method requires no additional supervision and leverages the covariance matrix of the multi-channel feature maps to regularize the loss at the last convolutional layer, where the semantic information is richest, so that its influence is backpropagated to update all convolutional layers. We perform experiments with four network architectures for transfer learning from two source domains to three target domains, and demonstrate that our regularization method improves accuracy in all settings. The proposed regularization method achieves state-of-the-art performance on the CUB-200-2011, Stanford-Cars, and Stanford-Dogs datasets with 89.8%, 94.6%, and 88.5% accuracy, respectively.
AB - Fine-grained visual categorization must cope with limited training data, making it essential to localize discriminative regions and learn diverse features. We propose an effective regularization method that simultaneously imposes spatial consistency and feature diversity on CNN feature maps from a unified perspective. The former guides different feature map channels to concentrate collaboratively on the discriminative areas, while the latter ensures that the feature maps are diverse. The proposed method requires no additional supervision and leverages the covariance matrix of the multi-channel feature maps to regularize the loss at the last convolutional layer, where the semantic information is richest, so that its influence is backpropagated to update all convolutional layers. We perform experiments with four network architectures for transfer learning from two source domains to three target domains, and demonstrate that our regularization method improves accuracy in all settings. The proposed regularization method achieves state-of-the-art performance on the CUB-200-2011, Stanford-Cars, and Stanford-Dogs datasets with 89.8%, 94.6%, and 88.5% accuracy, respectively.
KW - feature maps
KW - fine-grained visual categorization
KW - regularization
KW - transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85142751387&partnerID=8YFLogxK
U2 - 10.1109/SMC53654.2022.9945315
DO - 10.1109/SMC53654.2022.9945315
M3 - Conference paper
AN - SCOPUS:85142751387
VL - 2022
T3 - Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
SP - 522
EP - 529
BT - 2022 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers (IEEE)
T2 - 2022 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2022
Y2 - 9 October 2022 through 12 October 2022
ER -