Self-Supervised Video Object Segmentation by Motion-Aware Mask Propagation

Research output: Chapter in Book / Conference paper › Conference paper › peer-review

3 Citations (Scopus)

Abstract

We propose a self-supervised spatio-temporal matching method, coined Motion-Aware Mask Propagation (MAMP), for video object segmentation. MAMP leverages a frame reconstruction task for training, without the need for annotations. During inference, MAMP builds a dynamic memory bank and propagates masks according to our proposed motion-aware spatio-temporal matching module, which handles fast motion and long-term matching scenarios. Evaluations on the DAVIS-2017 and YouTube-VOS datasets show that MAMP achieves state-of-the-art performance with stronger generalization ability than existing self-supervised methods, i.e., 4.2% higher mean J&F on DAVIS-2017 and 4.85% higher mean J&F on the unseen categories of YouTube-VOS than the nearest competitor. Moreover, MAMP performs on par with many supervised video object segmentation methods. Our code is available at: https://github.com/bo-miao/MAMP.
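The abstract's core inference step, propagating masks from a memory bank by spatio-temporal feature matching, can be illustrated with a generic sketch. This is not the paper's exact motion-aware module; the function name, the cosine-similarity affinity, the top-k sparsification, and the temperature value are all illustrative assumptions.

```python
import numpy as np

def propagate_mask(query_feat, mem_feats, mem_masks, k=3, temperature=0.05):
    """Generic memory-based label propagation (illustrative, not MAMP itself).

    query_feat: (N, C) per-pixel features of the query frame
    mem_feats:  (M, C) per-pixel features collected from memory frames
    mem_masks:  (M, K) soft object masks aligned with mem_feats
    Returns:    (N, K) propagated soft masks for the query frame
    """
    # Cosine-similarity affinity between query pixels and memory pixels
    q = query_feat / np.linalg.norm(query_feat, axis=1, keepdims=True)
    m = mem_feats / np.linalg.norm(mem_feats, axis=1, keepdims=True)
    affinity = q @ m.T  # (N, M)

    # Keep only the top-k matches per query pixel (sparse matching)
    idx = np.argsort(-affinity, axis=1)[:, :k]          # (N, k)
    top = np.take_along_axis(affinity, idx, axis=1)     # (N, k)

    # Softmax over the retained matches
    w = np.exp(top / temperature)
    w /= w.sum(axis=1, keepdims=True)

    # Weighted sum of the matched memory masks gives the query mask
    return np.einsum('nk,nkc->nc', w, mem_masks[idx])
```

In this sketch the memory bank is simply a stack of past-frame features and masks; the paper's motion-aware matching additionally restricts and shifts the matching region to cope with fast motion.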

Original language: English
Title of host publication: ICME 2022 - IEEE International Conference on Multimedia and Expo 2022, Proceedings
Publisher: IEEE, Institute of Electrical and Electronics Engineers
ISBN (Electronic): 9781665485630
DOIs
Publication status: Published - 2022
Event: 2022 IEEE International Conference on Multimedia and Expo, ICME 2022 - Taipei, Taiwan, Province of China
Duration: 18 Jul 2022 - 22 Jul 2022

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
Volume: 2022-July
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2022 IEEE International Conference on Multimedia and Expo, ICME 2022
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 18/07/22 - 22/07/22

