Abstract
Autonomous X-ray baggage security screening has made significant strides recently, thanks to advances in deep learning, and has proven itself a viable solution to the flaws of manual screening. However, these data-hungry techniques depend on extensively annotated data that is laborious to produce, which impedes their progress in baggage screening. We therefore present a context-aware transformer for weakly supervised localization that relieves the annotation burden and provides visual interpretability, aiding screeners in threat recognition and researchers in identifying the pitfalls of existing systems. The proposed approach generalizes across different types of contraband and localizes them using only cost-effective binary labels, without explicit training on item detection. The context extraction block, integrated into the dual-token framework, generates threat-aware context maps, while the token scoring block focuses on minimizing partial activations. Experimental results surpass state-of-the-art (SOTA) methods in both classification and localization accuracy. Furthermore, we analyze failures to determine current vulnerabilities and provide new insights for future research.
Original language | English |
---|---|
Article number | 10401259 |
Pages (from-to) | 6563-6572 |
Number of pages | 10 |
Journal | IEEE Transactions on Industrial Informatics |
Volume | 20 |
Issue number | 4 |
Early online date | 17 Jan 2024 |
DOIs | |
Publication status | Published - 1 Apr 2024 |