Exploring Visual SLAM Acceleration with High-Level Synthesis

Student thesis: PhD

Abstract

Visual Simultaneous Localisation And Mapping (SLAM) is the problem of creating a map of an unknown environment using only visual sensors (e.g. cameras) while simultaneously determining the pose of the agent within that environment. A visual SLAM system is typically a pipeline with two modules, localisation and mapping, which are executed in parallel. The pipeline receives one image at a time and synchronises between these modules. The localisation module usually includes two computational kernels: image feature detection and pose estimation.

Applications of visual SLAM are deployed on edge platforms, such as robots, drones and Augmented Reality (AR) / Virtual Reality (VR) headsets. Such platforms are typically battery-powered, so energy efficiency is an important factor in designing computing systems for visual SLAM.

System-on-Chips (SoCs) with low-power processors and tightly coupled accelerators, such as Application-Specific Integrated Circuits (ASICs) and Graphics Processing Units (GPUs), are considered for visual SLAM systems on edge platforms. Although these SoCs enable visual SLAM systems to reach real-time performance, ASICs are typically inflexible to changes in the software applications. GPUs are more flexible than ASICs due to their programming models, but their performance usually comes with high power consumption. Field Programmable Gate Arrays (FPGAs) fill this gap by offering the opportunity to develop application-specific accelerators without the complexity of ASICs. Due to their reconfigurability, bespoke hardware accelerators can be created and updated during deployment for various applications. Further, FPGAs can offer better energy efficiency than processors and GPUs.

Traditionally, FPGA developers need to be proficient in Hardware Description Languages (HDLs), which tend to be low-level. HDLs make it difficult for SLAM developers (normally software developers) to utilise FPGAs. However, with the development of High-Level Synthesis (HLS), it is now possible to generate hardware designs for FPGAs by writing high-level C-based or Python-based code. HLS makes FPGAs more accessible to software developers.

This thesis explores the HLS acceleration of visual SLAM, addressing SoCs with integrated FPGAs. A comparative study is first conducted to determine: What are the differences between state-of-the-art GPU- and FPGA-accelerated Features from Accelerated Segment Test (FAST) detectors, considering a visual SLAM pipeline? The study considers the following metrics: run-time performance, energy efficiency and accuracy. This first study shows that a visual SLAM system integrated with a state-of-the-art FAST GPU accelerator achieves the best run-time performance on an Nvidia Jetson Orin platform, while a visual SLAM system integrated with a state-of-the-art FAST FPGA accelerator achieves the best energy efficiency and comparable accuracy on an AMD VCK190 platform. The improvement in energy efficiency is up to 2× compared to the GPU counterpart.

After the comparative study, this thesis explores the HLS acceleration of a well-known Visual Odometry (VO) system, Semi-direct monocular Visual Odometry (SVO). The thesis answers the question: How can semi-direct pose estimation kernels with sparse computations and random memory access be accelerated using HLS to obtain better run-time performance and energy efficiency while retaining similar accuracy? Sparse computations and random memory access patterns lead to large data transfer overhead between the processors and the FPGA accelerators. Modern HLS tools are not able to optimise this type of overhead automatically and efficiently. Therefore, this thesis proposes and evaluates three methods to optimise the data transfer overhead: algorithmic approximation with domain-specific knowledge, lossless image compression and on-the-fly computation.
With the proposed methods, the FPGA-accelerated SVO can achieve up to 2.4× and 8.3× improvements in run-
Date of Award: 1 Aug 2024
Original language: English
Awarding Institution
  • The University of Manchester
Supervisor: Graham Riley (Supervisor) & Mikel Luján (Supervisor)

Keywords

  • SLAM
  • HLS
  • Software Performance Optimisation
  • Domain-specific Acceleration
