Add check that result matches benchmark status (#3028)
This commit adds a check to make sure that the result of a `(check-sat)`
call matches the expected result set via `(set-info :status ...)`. In
doing so, it also fixes an issue where CVC4 would crash if asked for the
unsat core after setting the status to `unsat` but before calling
`(check-sat)` (see regression for concrete example). This happened
because CVC4 was storing the expected result and the computed result
both in the same variable (the expected result wasn't really being used
though). This commit keeps track of the expected result and the computed
result in separate variables to fix that issue.