Fine-grained energy profiling for deep convolutional neural networks on the Jetson TX1

Crefeda Rodrigues, Graham Riley, Mikel Luján

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review


Energy use is a key concern when migrating current deep learning applications onto low-power heterogeneous devices such as mobile devices. This is because deep neural networks are typically designed and trained on high-end GPUs or servers and require additional processing steps before they can be deployed on low-power devices. Such steps include the use of compression techniques to scale down the network size or the provision of efficient device-specific software implementations. Migration is further complicated by the lack of tools and the inability to measure power and performance accurately and consistently across devices. We present a novel evaluation framework for measuring energy and performance for deep neural networks using ARM's Streamline Performance Analyser integrated with standard deep learning frameworks and libraries such as Caffe and cuDNN v5. We apply the framework to study the execution behaviour of SqueezeNet on the Maxwell GPU of the NVIDIA Jetson TX1 on an image classification task (inference), and demonstrate the ability to measure the energy of specific layers of the neural network.
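As a rough illustration of what per-layer energy measurement involves, the sketch below integrates sampled power over each layer's execution window using the trapezoidal rule (energy in joules = power in watts integrated over time in seconds). It is not the authors' framework and does not use any Streamline or Caffe API; the function names, layer names, timestamps, and power readings are all hypothetical values chosen for illustration.

```python
from bisect import bisect_left


def interpolate(times_s, power_w, t):
    """Linearly interpolate the power reading at time t (times_s must be sorted)."""
    i = bisect_left(times_s, t)
    if i < len(times_s) and times_s[i] == t:
        return power_w[i]
    if i == 0 or i == len(times_s):
        raise ValueError("time lies outside the sampled range")
    t0, t1 = times_s[i - 1], times_s[i]
    p0, p1 = power_w[i - 1], power_w[i]
    return p0 + (p1 - p0) * (t - t0) / (t1 - t0)


def layer_energy(times_s, power_w, start_s, end_s):
    """Trapezoidal integral of power over [start_s, end_s], in joules."""
    points = [(start_s, interpolate(times_s, power_w, start_s))]
    points += [(t, p) for t, p in zip(times_s, power_w) if start_s < t < end_s]
    points.append((end_s, interpolate(times_s, power_w, end_s)))
    return sum(0.5 * (p0 + p1) * (t1 - t0)
               for (t0, p0), (t1, p1) in zip(points, points[1:]))


# Hypothetical power trace: one sample per millisecond, in watts.
times_s = [0.0, 0.001, 0.002, 0.003, 0.004]
power_w = [4.0, 6.0, 6.0, 8.0, 4.0]

# Hypothetical layer execution windows (start, end) in seconds.
layers = {"conv1": (0.0, 0.002), "fire2": (0.002, 0.004)}

for name, (start_s, end_s) in layers.items():
    print(name, layer_energy(times_s, power_w, start_s, end_s))
```

In practice the accuracy of such an estimate depends on how well the power samples and the layer timestamps are aligned, and on the sampling rate relative to the duration of each layer.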
Original language: English
Title of host publication: IEEE International Symposium on Workload Characterization (IISWC), 2017
Number of pages: 2
Publication status: Published - 2017

