Modeling Sub-Event Dynamics in First-Person Action Recognition

Hasan Firdaus Bin Mohd Zaki, Faisal Shafait, Ajmal Saeed Mian

Research output: Chapter in Book/Conference paper > Conference paper

14 Citations (Scopus)

Abstract

First-person videos have unique characteristics such as heavy egocentric motion, strong preceding events, salient transitional activities and post-event impacts. Action recognition methods designed for third-person videos may not optimally represent actions captured by first-person videos. We propose a method to represent the high-level dynamics of sub-events in first-person videos by dynamically pooling features of sub-intervals of time series using a temporal feature pooling function. The sub-event dynamics are then temporally aligned to make a new series. To keep track of how the sub-event dynamics evolve over time, we recursively employ the Fast Fourier Transform on a pyramidal temporal structure. The Fourier coefficients of the segment define the overall video representation. We perform experiments on two existing benchmark first-person video datasets, both of which were captured in a controlled environment. To address this limitation, we introduce a new dataset collected from YouTube, which has a larger number of classes and a greater diversity of capture conditions, thereby more closely depicting real-world challenges in first-person video analysis. We compare our method to state-of-the-art first-person and generic video recognition algorithms. Our method consistently outperforms the nearest competitors by 10.3%, 3.3% and 11.7% respectively on the three datasets.
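
The pipeline outlined in the abstract (pool features over sub-intervals, align the pooled vectors into a new series, then take Fourier coefficients over a temporal pyramid) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the pooling function, window size, stride, pyramid depth or number of retained coefficients, so max pooling and every parameter value below are assumptions, and the function names `pool_sub_events`, `fourier_pyramid` and `video_descriptor` are hypothetical.

```python
import numpy as np


def pool_sub_events(features, window, stride):
    """Max-pool per-frame features over sliding sub-intervals.

    features: array of shape (T, D), one D-dimensional feature per frame.
    Returns an array of shape (N, D), one pooled vector per sub-interval,
    kept in temporal order. Max pooling is an assumed choice.
    """
    num_frames = features.shape[0]
    starts = range(0, num_frames - window + 1, stride)
    return np.stack([features[s:s + window].max(axis=0) for s in starts])


def fourier_pyramid(series, levels, n_coeffs):
    """Concatenate low-frequency FFT magnitudes over a temporal pyramid.

    Level l splits the series into 2**l contiguous segments. For each
    segment, the first n_coeffs FFT magnitudes along time are kept per
    feature dimension (zero-padded if the segment is shorter).
    """
    dim = series.shape[1]
    parts = []
    for level in range(levels):
        for segment in np.array_split(series, 2 ** level, axis=0):
            spectrum = np.abs(np.fft.fft(segment, axis=0))
            coeffs = np.zeros((n_coeffs, dim))
            keep = min(n_coeffs, spectrum.shape[0])
            coeffs[:keep] = spectrum[:keep]
            parts.append(coeffs.ravel())
    return np.concatenate(parts)


def video_descriptor(frame_features, window=4, stride=2, levels=3, n_coeffs=4):
    """Build a fixed-length descriptor from per-frame features.

    All default parameter values are illustrative assumptions.
    """
    sub_events = pool_sub_events(frame_features, window, stride)
    return fourier_pyramid(sub_events, levels, n_coeffs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.standard_normal((32, 8))  # 32 frames, 8-dim features
    print(video_descriptor(frames).shape)
```

With these assumed defaults, the descriptor length depends only on the pyramid depth, the number of retained coefficients and the feature dimension, not on the number of frames, provided the pooled series has at least as many entries as the finest pyramid level has segments. That lets videos of different lengths be compared by a standard classifier.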
Original language: English
Title of host publication: Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
Place of Publication: United States
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 1619-1628
ISBN (Print): 9781538604571
Publication status: Published - 2017
Event: 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017 - Honolulu, United States
Duration: 21 Jul 2017 to 26 Jul 2017

Conference

Conference: 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
Country: United States
City: Honolulu
Period: 21/07/17 to 26/07/17

  • Cite this

    Mohd Zaki, H. F. B., Shafait, F., & Mian, A. S. (2017). Modeling Sub-Event Dynamics in First-Person Action Recognition. In Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017 (pp. 1619-1628). IEEE, Institute of Electrical and Electronics Engineers. http://openaccess.thecvf.com/content_cvpr_2017/papers/Zaki_Modeling_Sub-Event_Dynamics_CVPR_2017_paper.pdf