test/regress/README.md

   1 # Regression
   2
   3 ## Regression Levels and Running Regression Tests
   4
   5 CVC4's regression tests are divided into 5 levels (level 0 to 4). Higher
   6 regression levels are reserved for longer running regressions. To run
   7 regressions up to a certain level use `make regressX` where `X` designates the
   8 level. `make regress` by default runs levels 0 and 1. On Travis, we only run
   9 regression level 0 to keep the time short. During our nightly builds, we run
  10 regression level 2.
  11
  12 To only run some benchmarks, you can use the `TEST_REGEX` environment variable.
  13 For example:
  14
  15 ```
  16 TEST_REGEX=quantifiers make regress0
  17 ```
  18
  19 This runs regression tests from level 0 that have "quantifiers" in their name.
  20
  21 By default, each invocation of CVC4 is done with a 10 minute timeout. To use a
  22 different timeout, set the `TEST_TIMEOUT` environment variable:
  23
  24 ```
  25 TEST_TIMEOUT=0.5s make regress0
  26 ```
  27
  28 This runs regression tests from level 0 with a 0,5 second timeout.
  29
  30 ## Adding New Regressions
  31
  32 To add a new regression file, add the file to git, for example:
  33
  34 ```
  35 git add regress/regress0/testMyFunctionality.cvc
  36 ```
  37
  38 Also add it to [Makefile.tests](Makefile.tests) in this directory.
  39
  40 A number of regressions exist under test/regress that are not listed in
  41 [Makefile.tests](Makefile.tests). These are regressions that may someday be
  42 included in the standard suite of tests, but are not yet included (perhaps they
  43 test functionality not yet supported).
  44
  45 The following types of regression files are supported:
  46
  47 - `*.smt`: An [SMT1.x](http://smtlib.cs.uiowa.edu/papers/format-v1.2-r06.08.30.pdf) benchmark
  48 - `*.smt2`: An [SMT 2.x](http://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2017-07-18.pdf) benchmark
  49 - `*.cvc`: A benchmark that uses [CVC4's native input language](https://github.com/CVC4/CVC4/wiki/CVC4-Native-Input-Language)
  50 - `*.sy`: A [SyGuS](http://sygus.seas.upenn.edu/files/SyGuS-IF.pdf) benchmark
  51 - `*.p`: A [TPTP](http://www.cs.miami.edu/~tptp/TPTP/SyntaxBNF.html) benchmark
  52
  53 ## Expected Output, Error, and Exit Codes
  54
  55 In the regression file, you can specify expected stdout, stderr, and exit codes
  56 with the following directives:
  57
  58 ```
  59 % EXPECT: stdout
  60 % EXPECT-ERROR: stderr
  61 % EXIT: 0
  62 ```
  63
  64 This example expects an exit status of 0 from CVC4, the single line "stderr" on
  65 stderr, and the single line "stdout" on stdout. You can repeat `EXPECT` and
  66 `EXPECT-ERROR` lines as many times as you like, and at different points of the
  67 file.  This is useful for multiple queries:
  68
  69 ```
  70 % EXPECT: INVALID
  71 QUERY FALSE;
  72 % EXPECT: VALID
  73 QUERY TRUE;
  74 % EXPECT-ERROR: CVC4 Error:
  75 % EXPECT-ERROR: Parse Error: regress.cvc:7.13: Unexpected token: 'error'.
  76 syntax error;
  77 % EXIT: 1
  78 ```
  79
  80 Note that the directives are in comments, so if the benchmark file is an smt2
  81 file for example, the first character would be `;` instead of `%`.
  82
  83 Benchmark files can also specify the command line options to be used when
  84 executing CVC4, for example:
  85
  86 ```
  87 % COMMAND-LINE: --incremental
  88 ```
  89
  90 Sometimes, the expected output or error output may need some processing. This
  91 is done with the `SCRUBBER` and `ERROR-SCRUBBER` directives. The command
  92 specified by the `SCRUBBER`/`ERROR-SCRUBBER` directive is applied to the output
  93 before the the output is matched against the `EXPECT`/`EXPECT-ERROR` directives
  94 respectively. For example:
  95
  96 ```
  97 ; SCRUBBER: sed -e 's/The fact in question: .*$/The fact in question: TERM/' -e 's/in a linear logic: .*$/in a linear logic: TERM/'
  98 ; EXPECT: (error "A non-linear fact was asserted to arithmetic in a linear logic.
  99 ; EXPECT: The fact in question: TERM
 100 ; EXPECT: ")
 101 ```
 102
 103 The `SCRUBBER` directive in this example replaces the actual term by a fixed
 104 string `TERM` to make the regression test robust to the actual term printed
 105 (e.g. there could be multiple non-linear facts and it is ok if any of them is
 106 printed).
 107
 108 Sometimes it is useful to keep the directives separate. You can separate the
 109 benchmark from the output expectations by putting the benchmark in `<benchmark
 110 file>.smt` and the directives in `<benchmark file>.smt.expect`, which is looked
 111 for by the regression script. Note that `*.expect` files should be added to the
 112 `EXTRA_DIST` variable in [Makefile.am](Makefile.am).