Secure Systems Lab

LISC: Learning Instruction Semantics from Code Generators

See our ASPLOS 2016 paper for an overview of this approach. An early version of the approach was sketched here.


Translating low-level machine instructions into a higher-level intermediate language (IL) is one of the central steps in many binary analysis and instrumentation systems. Existing systems build such translators manually, which is a labor-intensive task. As a result, it takes a great deal of effort to support new architectures. Even for widely deployed architectures, full instruction sets may not be modeled; for example, mature systems such as Valgrind still lack support for AVX, FMA4 and SSE4.1 on x86 processors. To overcome these difficulties, we have developed a novel approach that leverages knowledge about instruction set semantics that is already embedded in modern compilers such as GCC and LLVM. In particular, we have developed a learning-based approach for automating the translation of assembly instructions to a compiler's architecture-neutral IL. Our experimental evaluation demonstrates the ability of our approach to easily support many architectures (x86, ARM and AVR), including their advanced instruction sets.
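To give a rough feel for what learning a translation from examples can mean, here is a minimal toy sketch in Python. It is not the algorithm from the paper or the released implementation. The instruction syntax, the IL notation, and the names `OPERAND`, `learn` and `translate` are all assumptions made up for illustration. The sketch takes (assembly, IL) example pairs, replaces concrete operands with numbered placeholders, and reuses the resulting templates on instructions with operands it has not seen before.

```python
import re

# Hypothetical operand syntax: registers such as r0..r15 and decimal immediates.
OPERAND = re.compile(r"\b(?:r\d+|\d+)\b")


def _operands(asm):
    """Return the distinct operands of an instruction in order of first use."""
    seen = []
    for op in OPERAND.findall(asm):
        if op not in seen:
            seen.append(op)
    return seen


def _abstract(text, operands):
    """Replace each known operand in text with a placeholder like $0, $1."""
    slots = {op: f"${i}" for i, op in enumerate(operands)}
    return OPERAND.sub(lambda m: slots.get(m.group(0), m.group(0)), text)


def learn(pairs):
    """Build a template table from (assembly, IL) example pairs."""
    table = {}
    for asm, il in pairs:
        ops = _operands(asm)
        table[_abstract(asm, ops)] = _abstract(il, ops)
    return table


def translate(table, asm):
    """Translate asm with a learned template, or return None if none matches."""
    ops = _operands(asm)
    il_template = table.get(_abstract(asm, ops))
    if il_template is None:
        return None
    # Fill the placeholders back in with this instruction's concrete operands.
    return re.sub(r"\$(\d+)", lambda m: ops[int(m.group(1))], il_template)


# Made-up training pairs; a real system would obtain these from a compiler.
examples = [
    ("add r1, r2", "r1 = r1 + r2"),
    ("mov r3, 7", "r3 = 7"),
]
table = learn(examples)
print(translate(table, "add r4, r5"))
print(translate(table, "sub r1, r2"))
```

The second call returns `None` because no example with the `sub` mnemonic was seen. A toy like this can only generalize over operands and says nothing about instructions absent from the training pairs.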


A tarball of the learning algorithm implementation and a README file are available.


This work was supported in part by NSF grants CNS-1319137, CNS-0831298, an AFOSR grant FA9550-09-1-0539, and an ONR grant N000140710928.


Copyright © 1999-2013 Secure Systems Laboratory, Stony Brook University. All rights reserved.