Small Object Augmentation of Urban Scenes for Real-Time Semantic Segmentation

Zhengeng Yang, Hongshan Yu, Mingtao Feng, Wei Sun, Xuefei Lin, Mingui Sun, Zhi Hong Mao, Ajmal Mian

Research output: Contribution to journal, Article, peer-reviewed

46 Citations (Scopus)


Semantic segmentation is a key step in scene understanding for autonomous driving. Although deep learning has significantly improved segmentation accuracy, current high-quality models such as PSPNet and DeepLabV3 are inefficient because of their complex architectures and reliance on multi-scale inputs, which makes them difficult to use in real-time or practical applications. Existing real-time methods, on the other hand, cannot yet produce satisfactory results on small objects such as traffic lights, which are essential to safe autonomous driving. In this paper, we improve the performance of real-time semantic segmentation from two perspectives: methodology and data. Specifically, we propose a real-time segmentation model called Narrow Deep Network (NDNet) and build a synthetic dataset by inserting additional small objects into the training images. The proposed method achieves 65.7% mean intersection over union (mIoU) on the Cityscapes test set with only 8.4G floating-point operations (FLOPs) on 1024 × 2048 inputs. Furthermore, by re-training the existing PSPNet and DeepLabV3 models on our synthetic dataset, we obtained an average 2% mIoU improvement on small objects.
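
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how mean intersection over union is commonly computed from a predicted label map and a ground-truth label map. It is not the authors' evaluation code: the function name `mean_iou`, the use of NumPy, and the ignore label 255 are assumptions made for illustration.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Average per-class IoU over the classes that occur in pred or target.

    Hypothetical helper: assumes integer label maps of equal shape, with
    predictions in [0, num_classes) and `ignore_index` marking unlabeled
    ground-truth pixels (an assumed convention).
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)

    # Drop pixels whose ground truth is marked as ignored.
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection

    # Skip classes that appear in neither map to avoid dividing by zero.
    present = union > 0
    return float((intersection[present] / union[present]).mean())
```

Because each class contributes equally to the average regardless of how many pixels it covers, classes made up of small objects such as traffic lights weigh as much as large ones such as road, which is one reason small-object accuracy matters for this metric.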

Original language: English
Article number: 9040271
Pages (from-to): 5175-5190
Number of pages: 16
Journal: IEEE Transactions on Image Processing
Publication status: Published - 2020

