Exploring Sparse Visual Odometry Acceleration with High-Level Synthesis

Research output: Contribution to journal › Article › peer-review


Visual Odometry (VO) systems are widely used to determine the position and orientation of a robot or camera in an unknown environment. They are deployed on resource-constrained platforms, such as drones and virtual reality or augmented reality headsets. VO systems harnessing modern Systems-on-Chip (SoCs) with integrated Field Programmable Gate Arrays (FPGAs) have the potential to improve overall performance. This paper explores the FPGA acceleration of sparse semi-direct VO kernels using High-Level Synthesis (HLS). The selected sparse Semi-direct VO (SVO) system was designed from its inception to execute efficiently on low-power processors. We show that both the computational overhead and the data transfer overhead between the processing cores and the accelerators on the reconfigurable fabric need to be optimized to obtain better end-to-end performance. The additional data movement incurred when using an FPGA accelerator stems from the sparse computation and random memory access patterns of the kernels. This paper shows that state-of-the-art HLS tools are not yet able to perform the required optimizations automatically. These tools are typically successful for application kernels with dense computational patterns and regular memory access. In this paper we propose three potentially general methods to reduce the data transfer between the processing cores and the customised hardware kernels on the FPGA: (a) approximation based on domain-specific knowledge, (b) lossless image compression, and (c) on-the-fly computation. We present a case study applying these methods to SVO, a state-of-the-art sparse VO system with a semi-direct front-end. We demonstrate that our proposed methods can reduce data transfer overhead to achieve better end-to-end performance, and that they can be applied not only with standard Xilinx tools but also with other state-of-the-art HLS tools, such as HeteroFlow.
Compared to the baseline performance of the original SVO software on Arm processors, our proposed methods enable the Xilinx SDSoC and HeteroFlow designs to achieve speedups of 2.4x and 2.14x, respectively, without noticeable accuracy loss. The Xilinx SDSoC and HeteroFlow designs also achieve 1.85x and 1.89x improvements in energy efficiency, respectively, on a Xilinx Zynq UltraScale+ SoC with Arm A53 cores and integrated FPGA. Compared to the SVO software baseline running on an Intel Xeon system, our proposed methods enable the Xilinx SDSoC and HeteroFlow designs to achieve 8.2x and 8.3x improvements in energy efficiency, respectively.
Original language: English
Pages (from-to): 70741–70763
Number of pages: 23
Journal: IEEE Access
Early online date: 26 Apr 2023
Publication status: Published - 17 Jul 2023


  • SLAM
  • FPGA
  • High Level Synthesis
  • Performance optimization
  • hardware-software codesign

