Feature Pyramid Encoding Network for Real-time Semantic Segmentation

Mengyu Liu, Hujun Yin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Although current deep learning methods have achieved impressive results for semantic segmentation, they incur high computational costs and require a very large number of parameters. For real-time applications, inference speed and memory usage are two important factors. To address this challenge, we propose a lightweight feature pyramid encoding network (FPENet) that achieves a good trade-off between accuracy and speed. Specifically, we use a feature pyramid encoding block to encode multi-scale contextual features with depthwise dilated convolutions in all stages of the encoder. A mutual embedding upsample module is introduced in the decoder to efficiently aggregate high-level semantic features and low-level spatial details. The proposed network outperforms existing real-time methods with fewer parameters and higher inference speed on the Cityscapes and CamVid benchmark datasets. In particular, FPENet achieves 68.0% mean IoU on the Cityscapes test set with only 0.4M parameters, running at 102 FPS on one NVIDIA TITAN V card.
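To illustrate why depthwise dilated convolutions suit a lightweight multi-scale encoder, the following minimal sketch compares weight counts for a standard and a depthwise 3x3 convolution, and computes the span covered by a 3-tap kernel at several dilation rates. The channel count (64) and the dilation rates (1, 2, 4, 8) are assumptions chosen for illustration only; they are not taken from the abstract, and the helper names are hypothetical.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_conv_params(channels, k):
    """Weights in a depthwise k x k convolution: one k x k filter per channel."""
    return channels * k * k


def effective_kernel_size(k, dilation):
    """Spatial span covered by a k-tap kernel at the given dilation rate."""
    return dilation * (k - 1) + 1


# Illustrative values only (assumptions, not figures from the paper).
channels = 64
dilations = [1, 2, 4, 8]

standard = standard_conv_params(channels, channels, 3)
depthwise = depthwise_conv_params(channels, 3)
spans = [effective_kernel_size(3, d) for d in dilations]

print(f"standard 3x3 weights:  {standard}")
print(f"depthwise 3x3 weights: {depthwise}")
print(f"spans at dilations {dilations}: {spans}")
```

Dilation enlarges the span a kernel covers without adding weights, so a set of depthwise branches with different rates can capture context at several scales for roughly the cost of one small filter per channel.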
Original language: English
Title of host publication: British Machine Vision Conference
Publication status: Accepted/In press - 1 Jul 2019

