Codex is a library of reusable components that are needed to implement a new static analyzer. The main component of any static analysis based on abstract interpretation are:
Single_value_abstraction which abstract unbounded integers \mathbb Z, fixed sized bitvectors (of any arbitrary size), and booleans (Lattices.Quadrivalent);Domains, that perform the basic analysis operations. In the case of non-relational domains, domains are built using Single_value_abstraction.Fixpoint engine, that iterates analysis operations until everything was computed;Fixpoint engine and translating instructions and expressions to Domains operations. This depends on the language to be analyzed, so this is not provided by the Codex library; but see Analyzers for analyzers that use the library.The Codex repository is decomposed in several sub-parts (that are in different directories), whose dependency graph is as follows:
Each analyzer must set up its stack of abstract domain, then translate the analysis of instructions and expressions in calls to abstract domains. See for instance:
Tracelog), and C2Codex which interprets the C AST using the Operator implemented inside the abstract Domains.Domains, and handles the translation form BINSEC's DBA intermediate language to Codex Operator.Note that using Tracelog you can observe the recursive calls to the different domains, i.e. trace how the analysis of a source expression is handled by Codex.
Operator : Definition of the concrete semantics of base objects like booleans, integers, bitvectors; and utilities around them (like function symbols).Lattices : Abstraction of a set; lattices representing abstraction of a single value the main way to exchange information in Codex.Single_value_abstraction : Combination of Lattices abstracting a single value by Operator.Terms : is the symbolic terms that is the target of our SSA-translation by abstract interpretation.Types : AST and parser used by the type-based analysis.Domains : The SSA/value-based domains, as well as the memory domains.Fixpoint : A library that helps perform fixpoint computation.Smtbackend : Operates a translation of Terms to an SMT formula.Codex_config : Contains knobs allowing to change the behavior of some abstract domains. We generally avoid doing this, but it is useful for benchmarking.Tracelog The logger. Instantiate it on top of your module, then use Log.debug(fun p -> p ...) where p behaves like Format.printf.PatriciaTree : Now a stand-alone submodule, but works specially well for abstract interpretation purposes.Codex_log is the old, deprecated, logger.We present a tutorial on a simple imperative while language, and demonstrates how to statically analyze programs written in it using Codex, a modular abstract interpretation library. This also serves as a nice introduction to various codex components (Lattices, Single_value_abstraction, Domains...). It is mostly meant for developers who wish to use and extend Codex.
if and while). It also provides a concrete interpreter for the while language;Lattices and Single_value_abstraction with simpler implementations;This tutorial provides a user guide for the static analysis based on the dependent nominal type system presented in the OOPLSA 2024 paper and implemented using Codex. It covers all the steps necessary to use our tool to check if a C or machine code program is exempt of spatial memory safety errors, such as null pointer dereferences, buffer overflows, or type confusion errors. It covers in particular: