summaryrefslogtreecommitdiff
path: root/test/regress/run_regression
AgeCommit message (Collapse)Author
2018-04-05 Python regression script (#1662)Andres Noetzli
Until now, we have been using a bash script for the running the regressions. That script had several issues, e.g. it was creating many temprorary files and it was calling itself recursively, making it difficult to maintain. This commit replaces the script with a Python script. The Python script uses the TAP protocol, which allows us to display the results of multiple tests per file (e.g. with --check-models and without) separately and with more details. It also outputs whenever it starts running a test, which allows finding "stuck" regression tests on Travis. As recommended in the automake documentation [0], the commit copies the TAP driver to enable TAP support for the test driver. Note: the commit also changes some of the regressions slightly because we were using "%" for the test directives in SMT files which does not correspond to the comment character ";". Using the comment character simplifies the handling and makes it easier for users to reproduce a failure (they can simply run CVC4 on the regression file). [0] https://www.gnu.org/software/automake/manual/html_node/Use-TAP-with-the-Automake-test-harness.html
2018-03-21Fix various regression tests (#1657)Andres Noetzli
While reorganizing the regression tests, I found three tests with a wrong status, one that CVC4 reported unknown for and some regression tests that had command line options set in the Makefile.am instead of the tests themselves. This commit fixes the status of the three regression tests (verified the status with Z3), adds command line options to make the previously "unknown" test work, and moves the command line options from the Makefile.am into the individual regression tests. The latter also required some minor changes to the regression script because it would not try to determine the expected output from the (set-info :status ...) command if there was one directive (such as EXPECT/COMMAND-LINE/etc.). This was an issue with the tests that now have a COMMAND-LINE directive.
2018-03-09Skip (get-unsat-assumptions) tests not supported (#1656)Andres Noetzli
(get-unsat-assumptions) requires the proof infrastructure,so we can't run the regression tests if CVC4 has not been configured for it. This commit changes the regression script to skip tests containing (get-unsat-assumptions) when CVC4 has not been compiled with proof support.
2018-02-02Option to check solutions produced by SyGuS solver (#1553)Haniel Barbosa
2017-09-19Fix issue #1074, improve non-fatal error handling (#1075)Andres Noetzli
Commit 54d24c786d6a843cc72dfb5e377603349ea5e420 was changing CVC4 to handle certain non-fatal errors (such as calling get-unsat-core without a proceding unsat check-sat command) without terminating the solver. In the case of get-unsat-cores, the changes lead to the solver crashing because it was trying to print an unsat core initialized with the default constructor, so the member variable d_smt was NULL, which lead to a dereference of a null pointer. One of the issues of the way non-fatal errors were handled was that the error reporting was done in the invoke() method of the command instead of the printResult() method, which lead to the error described above and other issues such as a call to get-value printing a value after reporting an error. This commit aims to improve the design by adding a RecoverableModalException for errors that the solver can recover from and CommandRecoverableFailure to communicate that a command failed in a way that does not prohibit the solver from continuing to execute. The exception RecoverableModalException is thrown by the SMTEngine and the commands catch it and turn it into a CommandRecoverableFailure. The commit changes all error conditions from the commit above and adds a regression test for them.
2017-08-14Build and test suite fixes for Windows (#186)Mark Laws
- Build fixes for Windows - Make proof checking tempfile handling portable - Test suite fixes for Windows
2017-08-07Optionally split regression tests into test groups (#207)Andres Noetzli
To prevent timing out on Travis, this commit adds the option to split the regression tests into groups and run each group in a separate job. The group of a test is determined by computing a checksum of its name.
2017-03-16Parsing support for SMT LIB 2.6. Minor fixes for printing datatypes. Fix for ↵ajreynol
mkGroundTerm for parametric datatypes. Minor change to run_regression to allow regressions to override input language. Minor refactoring to Cvc.g.
2017-01-10Adding regression test scrubbing.Tim King
2016-01-26Merged bit-vector and uf proof branch.Liana Hadarean
2015-10-11fix regression tests, support fallback mode for proofsKshitij Bansal
2015-04-23Added option for --check-unsat-cores and various core bug fixes (merge of ↵Liana Hadarean
Morgan's proof branch).
2015-01-14sygus input language and benchmarkMorgan Deters
2014-11-09Merge branch '1.4.x'Morgan Deters
2014-11-09Increase stack size when running regressions (fixes some regression crashes ↵Morgan Deters
on Mac).
2014-07-21run_regression using valgrind by setting VALGRIND=1Kshitij Bansal
2014-04-28travis, please!Kshitij Bansal
2014-03-07Fix run_regression on Mac.Morgan Deters
2014-03-05Array smtlib compliance testsKshitij Bansal
2013-12-23Proof-checking code; fixups of segfaults and missing functionality in proof ↵Morgan Deters
generation; fix bug 285. * segfaults/assert-fails in proof-generation fixed, including bug 285 * added --check-proofs to automatically check proofs, like --check-models (but only for UF/SAT at present) * proof generation now works in portfolio (but *not* --check-proofs, since LFSC code uses globals) * proofs are *not* yet supported in incremental mode * added --dump-proofs to dump out proofs, like --dump-models * run_regression script now runs with --check-proofs where appropriate * options scripts now support :link-smt for SMT options, like :link for command-line
2013-12-13Fix stack size on in-tree regressions.Morgan Deters
2013-12-13Some minor cleanup.Morgan Deters
2013-11-11Change exit status to be more consistent with other command-line tools: 0 ↵Morgan Deters
success, nonzero error
2013-11-05fixed proof regression script and added a new uf test caselianah
2013-09-13Move some regress benchmarks around that took too long, other test cleanup.Morgan Deters
2013-07-24Regressions now checking models on unknown too. But quantifiers don't have ↵Morgan Deters
to be simplified by check-model in that case.
2013-07-11Support for TPTP's TFF0 (with arithmetic)Morgan Deters
This commit reverses an "SZS ontology compliance hack" that was done for CASC-24 this year, and adds a TPTP pretty-printer which is capable of outputting results in the TPTP way (rather than the SMT way). This commit includes minor changes to the Expr package to add obvious missing functionality, and to fix the way expressions with builtin operators are made. These changes are truly a _fix_, the implementation had not been properly aligned with the design vision for some corner cases.
2013-04-30add decision_attributes.h for make distKshitij Bansal
2013-01-28some fixes for win32, including ability to "make check" win32 builds via wineMorgan Deters
2012-11-27First chunk of boolean-terms support.Morgan Deters
Passes simple tests and doesn't break existing functionality. Still need some work merged in for models. This version enables BV except for pure arithmetic (since we might otherwise need Boolean term support, which uses BV). Tonight's nightly regression run should tell us if/how that hurts performance. (this commit was certified error- and warning-free by the test-and-commit script.)
2012-11-10Change run-regression script to *additionally* run a second test with ↵Morgan Deters
--check-models *if* the benchmark is sat/invalid (or is incremental, with at least one sat/invalid result). This increases significantly the time to do a "make regress", as more tests are running. We need to run both *with* and *without* --check-models, since that option disables certain preprocessing. To turn off these extra tests during a make regress, you can set CVC4_REGRESSION_ARGS=--no-check-models. To turn off the extra test for a *particular* regression benchmark, you can put this in the benchmark: ; COMMAND-LINE: --no-check-models That might be necessary in e.g. nonlinear arithmetic benchmarks. (this commit was certified error- and warning-free by the test-and-commit script.)
2012-09-27* Rename SMT parts (printer, parser) to SMT1Morgan Deters
* Change --lang smt to mean SMT-LIBv2 * --lang smt1 now means SMT-LIBv1 * SMT-LIBv2 parser now gives helpful error if input looks like v1 * SMT-LIBv1 parser now gives helpful error if input looks like v2 * CVC presentation language parser now gives helpful error if input looks like either SMT-LIB v1 or v2 * Other associated changes (this commit was certified error- and warning-free by the test-and-commit script.)
2012-06-22TPTP: add parser for cnf and fofFrançois Bobot
- include directive works - no keyword : 'fof', 'cnf', ... can be used for symbols name - real -> unsorted -> real (for the one that appear, so no bijection bitween real and unsorted) - same thing for string But: - string not distinct by projection to real, not sure if the current state of string theory make them distinct - filtering in include is not done - the result is not printed in the TPTP way (currently SMT2 way)
2012-04-06* Smt2 printer for datatypesFrançois Bobot
* Fix DefineFunctionCommand when a constant is defined
2012-04-05Support to test the "dumper" mechanism in regressions (feeding dump output ↵Morgan Deters
back in) by doing "make regress RUN_REGRESSION_ARGS=--dump"
2012-03-01Partial merge from kind-backend branch, including Minisat and CNF work toMorgan Deters
support incrementality. Some clean-up work will likely follow, but the CNF/Minisat stuff should be left pretty much untouched. Expected performance change negligible; slightly better on memory: http://church.cims.nyu.edu/regress-results/compare_jobs.php?job_id=3705&reference_id=3697&mode=&category=&p=5 Note that there are crashes, but that these are exhibited in the nightly regression run too!
2011-10-29Support for SMT-LIBv2 (get-proof), CVC-style DUMP_PROOF command, ↵Morgan Deters
SmtEngine::getProof(), a few other things..
2011-10-28proof regressionsMorgan Deters
2011-05-13* fix for Mac OS (includes some ThreadLocal stuff copied in from portfolioMorgan Deters
branch) * add Theory::isSharedTermFact() -- it currently always returns false, pending theory combination work * Add "unknown" cardinalities to Cardinality class * Fix run_regression script to handle CRLF line terminators on Macs (where sed is non-GNU) * Convert CRLF line terminators in datatypes regressions to LF
2011-04-23* reviewed BooleanSimplification, added documentation & unit testMorgan Deters
* work around a lexer ambiguity in CVC grammar * add support for tracing antlr parser/lexer * add parsing support for more language features * initial parameterized types parsing work to support Andy's work
2011-04-23make run_regression script robust to DOS newlines :(Morgan Deters
2011-03-30improve recent low-coverage complaintsMorgan Deters
2011-03-15small fixes for run_regression script to workaround bug in old mktemp, was ↵Morgan Deters
causing a hang in bug220.smt2
2011-03-14change to the run_regression script to better manage temporary files; ↵Morgan Deters
hopefully this will somewhat alleviate the problem with all the junk files in /tmp
2010-11-16SmtEngine now fails with a ModalException if --incremental is not enabledMorgan Deters
but a push/pop or multiple query is attempted (previously it could give incorrect answers) Also, fix some multi-query and push-pop tests that had wrong answers, and support a new "% COMMAND-LINE: " gesture in regression tests so that a test can pass additional, specific command line flags it wants to run with (here, --incremental). Also fix mkbuilddir script for when it's called from contrib/switch-config.
2010-10-10additional model gen and SMT-LIBv2 compliance work: (get-assignment) now ↵Morgan Deters
supported; work on Result type (biggest noticeable change is that CVC4 now outputs lowercase "sat" and "unsat"), Options class moved to src/smt, to allow for future work on runtime configuration via (set-option) command
2010-10-04fix gdb issues (at least for static builds); resolves bug 194Morgan Deters
2010-08-20turn off extra-checking (which does extra theory-rewriter checking); it was ↵Morgan Deters
enabled by mistake in the last commit
2010-08-19UF theory bug fixes, code cleanup, and extra debugging output.Morgan Deters
Enabled new UF theory by default. Added some UF regressions. Some work on the whole equality-over-bool-removed-in-favor-of-IFF thing. (Congruence closure module and other things have to handle IFF as a special case of equality, etc..) Added pre-rewriting to TheoryBool which rewrites: * (IFF true x) => x * (IFF false x) => (NOT x) * (IFF x true) => x * (IFF x false) => (NOT x) * (IFF x x) => true * (IFF x (NOT x)) => false * (IFF (NOT x) x) => false * (ITE true x y) => x * (ITE false x y) => y * (ITE cond x x) => x Added post-rewriting that does all of the above, plus normalize IFF and ITE: * (IFF x y) => (IFF y x), if y < x * (ITE (NOT cond) x y) => (ITE cond y x) (Note: ITEs survive the removal-of-ITEs pass only if they are Boolean-valued.) A little more debugging output from CNF stream, context pushes/pops, ITE removal. Some more documentation. Fixed some typos.
2010-07-04don't do extra-checking for all regressions; that's probably a bad defaultMorgan Deters
generated by cgit on debian on lair
contact matthew@masot.net with questions or feedback