TY - GEN
T1 - Compiler fuzzing through deep learning
AU - Cummins, Chris
AU - Petoumenos, Pavlos
AU - Murray, Alastair
AU - Leather, Hugh
PY - 2018/7/12
Y1 - 2018/7/12
N2 - Random program generation-fuzzing-is an effective technique for discovering bugs in compilers but successful fuzzers require extensive development effort for every language supported by the compiler, and often leave parts of the language space untested. We introduce DeepSmith, a novel machine learning approach to accelerating compiler validation through the inference of generative models for compiler inputs. Our approach infers a learned model of the structure of real world code based on a large corpus of open source code. Then, it uses the model to automatically generate tens of thousands of realistic programs. Finally, we apply established differential testing methodologies on them to expose bugs in compilers. We apply our approach to the OpenCL programming language, automatically exposing bugs with little effort on our side. In 1,000 hours of automated testing of commercial and open source compilers, we discover bugs in all of them, submitting 67 bug reports. Our test cases are on average two orders of magnitude smaller than the state-of-the-art, require 3.03× less time to generate and evaluate, and expose bugs which the state-of-the-art cannot. Our random program generator, comprising only 500 lines of code, took 12 hours to train for OpenCL versus the state-of-the-art taking 9 man months to port from a generator for C and 50,000 lines of code. With 18 lines of code we extended our program generator to a second language, uncovering crashes in Solidity compilers in 12 hours of automated testing.
AB - Random program generation-fuzzing-is an effective technique for discovering bugs in compilers but successful fuzzers require extensive development effort for every language supported by the compiler, and often leave parts of the language space untested. We introduce DeepSmith, a novel machine learning approach to accelerating compiler validation through the inference of generative models for compiler inputs. Our approach infers a learned model of the structure of real world code based on a large corpus of open source code. Then, it uses the model to automatically generate tens of thousands of realistic programs. Finally, we apply established differential testing methodologies on them to expose bugs in compilers. We apply our approach to the OpenCL programming language, automatically exposing bugs with little effort on our side. In 1,000 hours of automated testing of commercial and open source compilers, we discover bugs in all of them, submitting 67 bug reports. Our test cases are on average two orders of magnitude smaller than the state-of-the-art, require 3.03× less time to generate and evaluate, and expose bugs which the state-of-the-art cannot. Our random program generator, comprising only 500 lines of code, took 12 hours to train for OpenCL versus the state-of-the-art taking 9 man months to port from a generator for C and 50,000 lines of code. With 18 lines of code we extended our program generator to a second language, uncovering crashes in Solidity compilers in 12 hours of automated testing.
KW - Compiler Fuzzing
KW - Deep Learning
KW - Differential Testing
UR - http://www.scopus.com/inward/record.url?scp=85051566880&partnerID=8YFLogxK
UR - https://www.pure.ed.ac.uk/ws/files/63745798/2018_issta.pdf
U2 - 10.1145/3213846.3213848
DO - 10.1145/3213846.3213848
M3 - Conference contribution
AN - SCOPUS:85051566880
T3 - ISSTA 2018 - Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis
SP - 95
EP - 105
BT - ISSTA 2018 - Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis
A2 - Bodden, Eric
A2 - Tip, Frank
PB - Association for Computing Machinery
T2 - 27th ACM SIGSOFT International Symposium on Software Testing and Analysis
Y2 - 16 July 2018 through 21 July 2018
ER -