Memory Safe C Compiler |
MSCC (Memory Safe C Compiler) is a tool to ensure both temporal and spatial memory safety in C programs through a source-to-source transformation.
MSCC was developed with the following goals:
MSCC detects memory access errors at the level of memory blocks, which correspond to the least units of memory allocation, such as a global or local variable, or memory returned by a single invocation of malloc. It flags memory errors only at a point where a pointer is dereferenced. For each pointer p, metadata that describe the temporal and spatial attributes of the memory block pointed by p are maintained at runtime and used for detecting memory errors before dereferences of p. Metadata is stored separately from a pointer for better backwards-compatibility with untransformed libraries. MSCC handles most common forms of pointer arithmetic as well as type casts that can be classified into upcasts (cast from subtype to supertype) or downcasts (casting from a supertype to a subtype), which account for most casts in typical C programs.
The current MSCC software still has some limitations. The most important ones in practice are (a) use of user-defined memory management functions that have different semantics compared to malloc/free, (b) failure to use type-casting to cast a pointer returned by malloc into the type using which the memory block is subsequently accessed, and (c) programs that convert pointers to integers and then back to pointers. These limitations mean that MSCC still may require some user-effort and a few source-code modifications to compile. More details about the MSCC limitations can be found in our paper and the package README file.
MSCC is implemented in Objective Caml (http://caml.inria.fr) and uses CIL (http://manju.cs.berkeley.edu/cil/) as the front end to manipulate C constructs.
MSCC is alpha software. It is provided only for the research and evaluation purpose.
It has been tested on several benchmark programs from SPECINT and Olden, and several GNU utilities such as bc, gzip, patch, and tar. These programs range from about 400 to 30,000 lines of code.
Please contact the first author of our paper to report bugs or seek clarifications.
MSCC is distributed under the terms of GNU GPL. The current version is 0.2.2.
This work is supported by NSF grant CCR-0098154 and an ONR grant N000140110967.