Deep Learning for Semantic Feature Extraction in Aerial Imagery

  • Ananya Gupta

Student thesis: PhD


Remote sensing provides image and LiDAR data that are useful for tasks such as disaster mapping and surveying. Deep learning (DL) has been shown to extract knowledge effectively from input data sources by learning intermediate feature representations. However, popular DL methods require large-scale datasets for training, which are costly and time-consuming to obtain. This thesis investigates semantic knowledge extraction from remote sensing data using DL methods in regimes with limited labelled data.

Firstly, semantic segmentation methods are compared and analysed on the task of aerial image segmentation. It is shown that pretraining on ImageNet improves segmentation results despite the domain shift between ImageNet images and aerial images.

A framework for mapping road networks in disaster-struck areas is proposed. It uses pre- and post-disaster imagery together with labels from OpenStreetMap (OSM), forgoing the need for costly manually labelled data. Graph-based methods are used to update the pre-existing road maps from OSM. Experiments on a disaster dataset from Palu, Indonesia, show the efficacy of the proposed method.

A method for semantic feature extraction from aerial imagery is proposed and shown to work well for multitemporal high-resolution image registration. These features are able to cope with temporal variations caused by seasonal changes.

Methods for tree identification in LiDAR data are proposed to overcome the need for manually labelled data. The first method works on high-density point clouds and uses certain LiDAR data attributes for tree identification, achieving almost 90% accuracy. The second uses a voxel-based 3D convolutional neural network on low-density LiDAR datasets and is able to identify most large trees. The third is a scaled version of PointNet++ and achieves an F-score of 82.1 on the ISPRS benchmark, comparable to state-of-the-art methods but with increased efficiency.
Finally, saliency methods used for explainability in image analysis are extended to work on 3D point clouds and voxel-based networks. It is shown that these networks deem edge and corner features important for classification; these features are also demonstrated to be inherently sparse and easily pruned.
Date of Award: 1 Aug 2021
Original language: English
Awarding Institution
  • The University of Manchester
Supervisors: Hujun Yin & Simon Watson


  • Aerial Imagery
  • Point Cloud
  • Satellite Imagery
  • Semantic Segmentation
  • Deep Learning
  • LiDAR
