Finding 1000+ Bugs in GCC and LLVM in 3 Years
Recorded 06 October 2016 in Lausanne, Vaud, Switzerland
Event: IC Colloquia - EPFL IC School Colloquia
In this talk, I will describe equivalence modulo inputs (EMI), a general methodology for validating optimizing compilers. The key insight is to exploit the close interplay between (1) dynamically executing a program on some test input and (2) statically compiling the program to work on all possible inputs. Indeed, the test input induces a natural collection of EMI variants of the original program, that is, programs that behave identically to it on that input. These variants can be used to test any compiler and specifically target difficult-to-find miscompilations.
We have developed a series of increasingly sophisticated techniques to generate EMI variants by (1) profiling a program's test executions and (2) stochastically deleting, inserting, or mutating code. Our extensive testing to date has led to 1000+ confirmed bug reports for GCC and LLVM alone, of which 600+ have already been fixed. The EMI concept is widely applicable: beyond testing compilers, it can be adapted to validate program transformations and analyses, and software in general.
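
The profile-then-prune idea can be illustrated with a minimal sketch. It is not the speaker's tooling: it assumes a toy Python function in place of a C program, uses Python's line tracing as the profiler, and covers only the deletion case by replacing unexecuted statements with `pass`. The names `SOURCE`, `classify`, `profile`, `Pruner`, and `make_variant` are hypothetical. In real compiler testing, the original and each variant would be compiled by the compiler under test and their outputs on the profiled input compared; here the Python interpreter stands in for that step.

```python
import ast
import random
import sys

# Toy "program under test" (hypothetical example, not from the talk).
SOURCE = '''
def classify(n):
    if n < 0:
        label = "negative"
        n = -n
    elif n == 0:
        label = "zero"
    else:
        label = "positive"
    return label, n
'''

FILENAME = "<emi>"


def load(source, func_name):
    """Compile `source` and return the function named `func_name`."""
    namespace = {}
    exec(compile(source, FILENAME, "exec"), namespace)
    return namespace[func_name]


def profile(source, func_name, arg):
    """Run func_name(arg); return (set of executed line numbers, result)."""
    func = load(source, func_name)
    hit = set()

    def tracer(frame, event, _arg):
        # Record every line event that belongs to the profiled source.
        if event == "line" and frame.f_code.co_filename == FILENAME:
            hit.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(arg)
    finally:
        sys.settrace(previous)
    return hit, result


class Pruner(ast.NodeTransformer):
    """Replace statements on unexecuted lines with `pass`, with probability p."""

    def __init__(self, hit, rng, p):
        self.hit, self.rng, self.p = hit, rng, p

    def visit(self, node):
        unexecuted = (
            isinstance(node, ast.stmt)
            and not isinstance(node, ast.FunctionDef)
            and node.lineno not in self.hit
        )
        if unexecuted and self.rng.random() < self.p:
            return ast.copy_location(ast.Pass(), node)
        return self.generic_visit(node)


def make_variant(source, hit, rng, p=0.5):
    """Return the source of one randomly pruned EMI variant."""
    tree = Pruner(hit, rng, p).visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


if __name__ == "__main__":
    test_input = 5
    hit, expected = profile(SOURCE, "classify", test_input)
    rng = random.Random(0)
    for _ in range(3):
        variant = make_variant(SOURCE, hit, rng)
        # Each variant must agree with the original on the profiled input.
        assert load(variant, "classify")(test_input) == expected
```

Note that the variants are equivalent only modulo the profiled input: a variant pruned with input 5 may fail on a negative input because the deleted branch is then reached. Line coverage is also only an approximation of statement coverage, so a production tool would need a more careful profiler.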