FINESSD: Near-Storage Feature Selection with Mutual Information for Resource-Limited FPGAs

Research output: Contribution to conference › Paper › peer-review


Abstract

Feature selection is the data analysis process that selects a smaller, curated subset of the original dataset by filtering out data (features) that are irrelevant or redundant. The most important features can be ranked and selected based on statistical measures, such as mutual information. Feature selection not only reduces the size of the dataset and the execution time for training Machine Learning (ML) models, but can also improve the accuracy of inference. This paper analyses mutual-information-based feature selection for resource-constrained FPGAs and proposes FINESSD, a novel approach that can be deployed for near-storage acceleration. It highlights that the Mutual Information Maximization (MIM) algorithm does not require multiple passes over the data and, when approximated appropriately, offers a good trade-off between accuracy and FPGA resources. The new FPGA accelerator for MIM generated by FINESSD can fully utilize the NVMe bandwidth of a modern SSD and perform feature selection without requiring full dataset transfers to the main processor. The evaluation, using a Samsung SmartSSD over small, large and out-of-core datasets, shows that FINESSD yields up to 35x speedup over the mainstream multiprocessing Python ML libraries and up to 19x speedup over an optimized C library, while being more than 70x more energy efficient for large, out-of-core datasets.
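
As a rough illustration of why MIM needs only one pass over the data, the sketch below accumulates per-feature joint counts of (feature value, label) while streaming the rows once, then scores each feature by its empirical mutual information with the label and keeps the top k. It is a minimal software sketch, not the FINESSD accelerator or its approximation scheme: it assumes features are already discretized, and the names `mim_select`, `rows`, `labels` and `k` are hypothetical.

```python
import math
from collections import Counter


def mim_select(rows, labels, k):
    """Return (indices of the k highest-scoring features, all scores in bits).

    Assumes every feature value is already discrete (e.g. binned).
    """
    num_features = len(rows[0])
    n = len(rows)

    # Single pass over the data: count (value, label) pairs per feature.
    joint = [Counter() for _ in range(num_features)]
    label_counts = Counter()
    for row, y in zip(rows, labels):
        label_counts[y] += 1
        for j, v in enumerate(row):
            joint[j][(v, y)] += 1

    scores = []
    for j in range(num_features):
        # Marginal counts of the feature, derived from the joint counts.
        value_counts = Counter()
        for (v, _), c in joint[j].items():
            value_counts[v] += c
        # I(X;Y) = sum p(x,y) * log2( p(x,y) / (p(x) * p(y)) )
        mi = sum(
            (c / n) * math.log2(c * n / (value_counts[v] * label_counts[y]))
            for (v, y), c in joint[j].items()
        )
        scores.append(mi)

    # MIM ranks features independently by their score; no redundancy term.
    ranked = sorted(range(num_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy data: feature 0 equals the label, feature 1 is constant,
# feature 2 is independent of the label.
rows = [(0, 0, 0), (1, 0, 0), (0, 0, 1), (1, 0, 1)]
labels = [0, 1, 0, 1]
selected, scores = mim_select(rows, labels, k=1)
print(selected, scores)
```

Because each feature is scored independently of the others, the counting step is the only part that touches the dataset, which is the property the abstract points to when it says MIM avoids multiple passes.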
Original language: English
Publication status: Accepted/In press - 2024
Event: 32nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, FCCM 2024 - Orlando, United States
Duration: 5 May 2024 to 8 May 2024

Conference

Conference: 32nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, FCCM 2024
Abbreviated title: FCCM 2024
Country/Territory: United States
City: Orlando
Period: 5/05/24 to 8/05/24
