Recent years have seen increased interest from the HPC community in Field Programmable Gate Arrays (FPGAs) as an alternative or additional accelerator. This is largely due to the slowdown in transistor scaling and the difficulty of gaining further performance and energy efficiency from current processing solutions. General (scientific) software programmers have shied away from FPGA technology because of its perceived lack of programmability. However, various academic and commercial vendors have now developed High-Level Synthesis (HLS) tools, such as Xilinx SDSoC OpenCL, SDSoC C++, Vivado HLS, Intel Altera SDK and solutions from Maxeler, which enable the generation of FPGA hardware configurations from higher-level descriptions. Even though HLS tools aim to narrow the hardware knowledge gap between software programmers and FPGAs, their programming methodologies remain challenging for software programmers aiming to achieve high performance. These HLS tools present many choices for mapping the concurrency in HPC applications to FPGAs. The choices are complex because they include different options at the programming language level and the HLS tool level for designing the host code, the kernel code, and the FPGA hardware itself. Furthermore, a wide choice of optimization methods controls the final design of the FPGA hardware. The many options and parameter settings available can severely affect both a design's performance and the programmer's productivity, and they lead to a large design space exploration problem. The HPC software programmer has to spend much time finding the options and parameter settings that provide the best possible design. Choosing a suitable HLS tool from those available adds a further layer of complexity.
This thesis explores and compares the options and techniques for mapping the concurrency levels of two weather and climate applications to a single Xilinx UltraScale+ FPGA board using two high-level HLS tools, Xilinx SDSoC OpenCL and SDSoC C++, and a low-level HLS tool, Xilinx Vivado HLS. Two exploratory and two comparison studies, involving many experiments, have been conducted to provide insight into the best mapping techniques for performance, resource usage, and programmability in single and multiple kernel solutions. In the exploratory studies of the design space, data was collected from various implementations and configurations of the two weather and climate applications. This data was then analyzed, using multiple metrics, in the two comparison studies, providing insights for traditional HPC programmers considering using FPGAs in their applications and contributing evidence to support the possible future development of an efficient methodology for their use.
|Date of Award|31 Dec 2022|
|Awarding Institution|The University of Manchester|
|Supervisor|Graham Riley (Supervisor) & Dirk Koch (Supervisor)|