Handling Physical-Layer Deadlock Caused by Permanent Faults in Quasi-Delay-Insensitive Networks-on-Chip

Guangda Zhang, Wei Song, James Garside, Javier Navaridas, Zhiying Wang

Research output: Contribution to journal › Article › peer-review

Abstract

Networks-on-Chip (NoCs) are promising fabrics for scalable and efficient on-chip communication in large-scale many-core systems. In place of the well-studied synchronous NoCs, event-driven asynchronous NoCs have emerged as a promising replacement thanks to their strong timing robustness, especially when implemented in Quasi-Delay-Insensitive (QDI) circuits. However, their fault tolerance has rarely been studied. QDI NoCs show complicated failure scenarios and behave differently from synchronous ones. As semiconductor technology scales and the aging process is expected to accelerate, permanent faults become more likely to occur at runtime. These faults can break the handshake, leading to physical-layer deadlocks that can spread and paralyze the whole QDI NoC. This physical-layer deadlock cannot be resolved using conventional fault-tolerant or deadlock management techniques. This paper systematically studies the impact of permanent faults on QDI NoCs and presents novel deadlock detection and recovery techniques to handle the fault-caused physical-layer deadlock. The proposed detection technique has been implemented to protect the NoC data paths, where 90% of the logic is covered. When the detection and recovery techniques are employed to protect inter-router links (60% of the logic), a permanently faulty link is precisely located and the network function can be recovered with graceful performance degradation.

Original language: English
Pages (from-to): 3152-3165
Number of pages: 14
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Early online date: 15 Aug 2017
Publication status: Published - Nov 2017

Keywords

  • Network-on-chip
  • Asynchronous
  • QDI
  • deadlock
  • permanent fault
  • Spatial division multiplexing
