Abstract
In the framework of fully permutable loops, tiling has been studied extensively as a source-to-source program transformation. We build upon recent results by Högsted. Carter, and Ferrante [12], who aimed at determining the cumulated idle time spent by all processors while executing the partitioned (tiled) computation domain. We propose new, much shorter proofs of all their results and extend these in several important directions. More precisely, we provide an accurate solution for all values of the rise parameter that relates the shape of the iteration space to that of the tiles, and for all possible distributions of the tiles to processors. In contrast, the authors in [12] dealt only with a limited number of cases and provided upper bounds rather than exact formulas.
Original language | English |
---|---|
Pages (from-to) | 167-190 |
Number of pages | 23 |
Journal | Journal of Information Science and Engineering |
Volume | 14 |
Issue number | 1 |
Publication status | Published - Mar 1998 |
Keywords
- Block-cyclic distribution
- Distributed arrays
- HFP (High Performance Fortran)
- MPI (Message Passing Interface)
- Redistribution
- Scheduling