Simultaneous Detection and Tracking with Motion Modelling for Multiple Object Tracking

Shijie Sun, Naveed Akhtar, Xiangyu Song, Huansheng Song, Ajmal Mian, Mubarak Shah

Research output: Chapter in Book/Conference paper › Conference paper › peer-review

21 Citations (Scopus)


Deep learning-based Multiple Object Tracking (MOT) currently relies on off-the-shelf detectors for tracking-by-detection. This results in deep models that are detector-biased and evaluations that are detector-influenced. To resolve this issue, we introduce the Deep Motion Modeling Network (DMM-Net), which estimates the motion parameters of multiple objects to perform joint detection and association in an end-to-end manner. DMM-Net models object features over multiple frames and simultaneously infers object classes, visibility, and motion parameters. These outputs are readily used to update the tracklets for efficient MOT. DMM-Net achieves a PR-MOTA score of 12.80 at 120+ fps on the popular UA-DETRAC challenge, which is better performance and orders of magnitude faster. We also contribute a large-scale synthetic public dataset, Omni-MOT, for vehicle tracking that provides precise ground-truth annotations to eliminate the detector influence in MOT evaluation. This dataset of 14M+ frames is extendable with our public script (Code at Dataset, Dataset Recorder, Omni-MOT Source). We demonstrate the suitability of Omni-MOT for deep learning with DMM-Net, and also make the source code of our network public.

Original language: English
Title of host publication: Computer Vision – ECCV 2020 - 16th European Conference, 2020, Proceedings
Editors: Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm
Publisher: Springer Science + Business Media
Number of pages: 18
ISBN (Print): 9783030585853
Publication status: Published - 2020
Event: 16th European Conference on Computer Vision, ECCV 2020 - Glasgow, United Kingdom
Duration: 23 Aug 2020 to 28 Aug 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12369 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 16th European Conference on Computer Vision, ECCV 2020
Country/Territory: United Kingdom

