Abstract
Sorting has tremendous usage in the applications that handle massive amount of data. Existing techniques accelerate sorting using multiprocessors or GPGPUs where a data set is partitioned into disjunctive subsets to allow multiple sorting threads working in parallel. Hardware sorters implemented in FPGAs have the potential of providing high-speed and low-energy solutions but the partition algorithms used in software systems are so data dependent that they cannot be easily adopted. The speed of most current sequential sorters still hangs around 1 number/cycle. Recently a new hardware merge sorter broke this speed limit by merging a large number of sorted sequences at a speed proportional to the number of sequences. This paper significantly improves its area and speed scalability by allowing stalls and variable sorting rate. A 32-port parallel merge-tree that merges 32 sequences is implemented in a Virtex-7 FPGA. It merges sequences at an average rate of 31.05 number/cycle and reduces the total sorting time by 160 times compared with traditional sequential sorters.
Original language | English |
---|---|
Title of host publication | 24th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2016 |
Publisher | IEEE |
Pages | 95-102 |
Number of pages | 8 |
ISBN (Electronic) | 9781509023561 |
DOIs | |
Publication status | Published - 16 Aug 2016 |
Event | 24th IEEE International Symposium on Field-Programmable Custom Computing Machines - Washington, United States Duration: 1 May 2016 → 3 May 2016 |
Conference
Conference | 24th IEEE International Symposium on Field-Programmable Custom Computing Machines |
---|---|
Abbreviated title | jump trading |
Country/Territory | United States |
City | Washington |
Period | 1/05/16 → 3/05/16 |
Keywords
- FPGA
- Merge sort
- parallel sorting
- sorting network