TY - JOUR

T1 - Topology-learnable graph convolution for skeleton-based action recognition

AU - Zhu, Guangming

AU - Zhang, Liang

AU - Li, Hongsheng

AU - Shen, Peiyi

AU - Shah, Syed Afaq Ali

AU - Bennamoun, Mohammed

PY - 2020/7

Y1 - 2020/7

N2 - Graph convolutional networks (GCNs) generalize convolutional neural networks into irregular graph-like structures. Generally, graph topologies are set by hand and fixed over all layers. Handcrafted connections may not be optimal and cannot fully use the self-learning ability of deep learning. In this work, we explore a topology-learnable graph convolution for skeleton-based action recognition. Specifically, a spatial graph convolution can be decomposed into a feature learning component that evolves the features of each graph vertex, and a graph vertex fusion component in which the latent graph topologies can be learned adaptively. Different initialization strategies for the learnable fusion matrix are evaluated. Experimental results that are based on the spatial-temporal GCNs for skeleton-based action recognition, demonstrate that convolution can work on graphs like on images, even if only a specific fusion matrix initialization that uses adjacency matrices is applied. Moreover, the self-learning process can learn the latent topology of a graph beyond the handcrafted topology, thereby making graph convolution flexible and universal.

KW - Action recognition

KW - Graph convolution

KW - Graph topology

KW - Skeleton

UR - http://www.scopus.com/inward/record.url?scp=85084675068&partnerID=8YFLogxK

U2 - 10.1016/j.patrec.2020.05.005

DO - 10.1016/j.patrec.2020.05.005

M3 - Article

AN - SCOPUS:85084675068

VL - 135

SP - 286

EP - 292

JO - Pattern Recognition Letters

JF - Pattern Recognition Letters

SN - 0167-8655

ER -