Benchpress

build status

Tool to run one or more logic programs, on a set of files, and collect the results.

License: BSD.

Basic usage

$ benchpress run -c foo.sexp dir_a/ dir_b/ -p z3
…

this tells benchpress to run the prover z3 on directories dir_a and dir_b. foo.sexp contains additional configuration parameters as described below.

System dependencies

Logitest relies on a bunch of utilities, besides OCaml libraries:

TODO use cgroups or similar for CPU affinity

Options

CLI options

Most of the commands accept -c <config file> to specify which config files to use.

ENV var options

Some internal parameters of benchpress can be set using environment variables:

Web interface

Config File

Benchpress ships with a builtin config that contains, roughly:

Builtin config

; read smtlib status
(prover
  (name smtlib-read-status)
  (cmd "grep :status $file")
  (unknown ":status unknown")
  (sat ":status sat")
  (unsat ":status unsat"))

(prover
  (name minisat)
  (unsat "UNSATISFIABLE")
  (sat "^SATISFIABLE")
  (cmd "minisat -cpu-lim=$timeout $file"))

(prover
  (name z3)
  (cmd "z3 $file")
  (version "cmd:z3 --version")
  (unsat "unsat")
  (sat "^sat"))

The configuration is based on stanzas that define available provers, available sets of benchmarks (based on directories that contain them), and tasks. For now the only kind of supported task is to run provers on problems, but it should get richer as we go (e.g. run proof checkers, do some basic CI, run a task daily, etc.).

In this default file we also define a pseudo-prover, "smtlib-read-status", which is used to parse SMTLIB benchmarks and find an annotation (set-info :status <…>). This is useful when running provers later because it makes it easy to find bugs (if a prover reports a wrong answer).

We also define provers minisat and z3 as common reference points, providing info on how to run them (with cmd …) and how to parse their results using regexes.

Example of config file

A more complete example, taken from mc2:

; from https://github.com/c-cube/mc2
(prover
  (name mc2)
  (cmd "ulimit -t $timeout; mc2 --time $timeout $file")
  (unsat "^Unsat")
  (sat "^Sat")
  (unknown "Unknown")
  (timeout "Timeout"))

(dir
  (path "$HOME/workspace/smtlib")
  (pattern ".*.smt2")
  (expect (run smtlib-read-status)))

(task
  (name glob-all-smtlib)
  (synopsis "run all SMT solvers on smtlib")
  (action
   (run_provers
    (dirs "$HOME/workspace/smtlib")
    (provers mc2 z3)
    ;(memory 100000000)  ; TODO: parse "10G"
    (timeout 10))))

(task
  (name glob-all-smtlib-QF_UF)
  (synopsis "run all SMT solvers on QF_UF")
  (action
    (run_provers
      (dirs "$HOME/workspace/smtlib/QF_UF")
      (provers mc2 z3)
      (timeout 10))))

Then one can run, say,

$ benchpress run -c the_file.sexp --task glob-all-smtlib-QF_UF -t 30

to run mc2 and z3 on the QF_UF problems in the SMTLIB directory. The task stanza defines a pre-packaged task that can be launched easily from the command line or the embedded web server (a bit like a makefile target).

List of stanzas

The variable $cur_dir evaluates to the config file's directory. This allows the config file to refer to provers that are installed locally (e.g. in the same repository).

Actions