summaryrefslogtreecommitdiff
path: root/src/theory/strings
AgeCommit message (Collapse)Author
2020-09-22Update copyright header script to support CMake and Python files (#5067)Mathias Preiner
This PR updates the update-copyright.pl script to also update/add copyright headers to CMake specific files. It further fixes a small typo in the header.
2020-09-18[Strings] Fix extended equality rewriter (#5092)Andres Noetzli
Fixes #5090. Our extended equality rewriter was performing the following unsound rewrite: (= (str.++ Str13 Str5 Str16 Str13) (str.++ Str5 Str16 Str13 "cjyfqapanogtdznjbtqlfrmmfauwjdpvnhfpfhzsxaarlfvlvs")) ---> (and (= (str.++ Str13 Str5) Str5) (= (str.++ Str16 Str13) (str.++ Str16 Str13 "cjyfqapanogtdznjbtqlfrmmfauwjdpvnhfpfhzsxaarlfvlvs"))) The rule being applied was SPLIT_EQ_STRIP_R. The rewrite was applied due to the following circumstances: The rewriter found that (str.++ Str13 Str5) is strictly shorter than (str.++ Str5 Str16 Str13 "cjyfqapanogtdznjbtqlfrmmfauwjdpvnhfpfhzsxaarlfvlvs") The rewriter stripped the symbolic length of the former from the latter stripSymbolicLength() was only able to strip the first component, so there was a remaining length of (str.len Str13) The rule SPLIT_EQ_STRIP_R, however, was implicitly assuming that the symbolic length can either be stripped completly or not at all and was not considering the case where only a part of the length can be stripped. The commit adds a flag to stripSymbolicLength() that makes the function only return true if the whole length can be stripped from the input. The commit also refactors the code in stripSymbolicLength() slightly. Note: It is not necessary to try to do something smart in the case where only a partial prefix can be stripped because the rewriter tries to apply the rule to all the different prefix combinations anyway.
2020-09-16Only rewrite replace_re(_all) if regexp is const (#5075)Andres Noetzli
Fixes #5074. This commit fixes an issue in the rewriter of str.replace_re and str.replace_re_all. For both operators, we do two kinds of rewrites: (1) If the first argument is a constant, then we check whether the regular expression appears in that constant and (2) we check whether the regular expression matches the empty string. Both of those checks were assuming a constant regular expression but there was no guard in place for it. This commit adds the missing guard.
2020-09-09(proof-new) Generalize single step helper in eager proof generator (#5046)Andrew Reynolds
This allows single step proofs to have premises, closed by a SCOPE. This will be useful for array lemmas.
2020-09-02Make ExtTheory independent of Theory (#5010)Andrew Reynolds
This makes it so that ExtTheory uses a generic callback instead of relying on Theory. The primary purpose of this commit is to eliminate the connection of TheoryBV and ExtTheory. This commit moves all things related to ExtTheory in BV into CoreSolver. It also refactors the use of ExtTheory in strings and arithmetic.
2020-09-01Removes old proof code (#4964)Haniel Barbosa
This deletes much of the old proof code. Basically everything but the minimal necessary infra-structure for producing unsat cores. That includes dependency tracking in preprocessing, the prop engine proof and the unsat core computation code in the old proof manager. These should also go once we fully integrate into master the new proof infrastructure. It also cleans interfaces that were using old-proof-code-specific constructs (such as LemmaProofRecipe). When possible or when it made sense standalone local proof production code was kept, but deactivated (such is in the equality engine and in the arithmetic solver).
2020-08-31(new theory) Update TheoryStrings to new standard (#4963)Andrew Reynolds
This updates theory of strings to the new standard. This makes strings use the standard template for check and collectModelInfo. It also updates its inference manager to the standard and makes use of assertFactInternal for processing internal facts. This now enables preNotifyFact and notifyFact to be defined in TheoryStrings instead of inside inference manager, which is clearer and eliminates some dependencies within inference manager. Note that the inference manager of strings for now inherits from TheoryInferenceManager. Further standardization will make it inherit from the new InferenceManagerBuffered class. This design will be merged into proof-new, which also has significant changes to strings inference manager.
2020-08-31Simplify interface for computing relevant terms. (#4966)Andrew Reynolds
This is a followup to #4945 which simplifies the contract for computeRelevantTerms.
2020-08-28Replace Theory::Set with TheoryIdSet (#4959)Andrew Reynolds
This makes it so that equality_engine.h does not include theory.h. This is a bad dependency since Theory contains EqualityEngine. This dependency between equality engine and theory was due to the use of a helper (Theory::Set) for representing sets of theories that is inlined into Theory. This PR moves this definition and utilities to theory_id.h. It fixes the resulting include dependencies which are broken by changing the include theory.h -> theory_id.h in equality_engine.h. This avoids a circular dependency in the includes between Theory -> InferenceManager -> ProofEqualityEngine -> EqualityEngine -> Theory.
2020-08-24Extend the standard Theory template based on equality engine (#4929)Andrew Reynolds
Apart from { quantifiers, bool, builtin }, each Theory now has an official equality engine. This PR elaborates on the standard recommended template that Theory should follow, which applies to all theories, regardless of whether they have an equality engine. This includes: A standard check method. The Theory is now expected to implement 4 callbacks (preCheck, postCheck, preNotifyFact, notifyFact). A standard collectModelInfo method. The Theory is now expected to implement 2 callbacks (computeRelevantTerms, collectModelValues). Additionally, 2 more methods have an obvious default: (1) getEqualityStatus, which returns information based on an equality engine if it exists, (2) addSharedTerm, which adds the term as a trigger term to the equality engine of the theory if it exists. Notice that for the sake of more consistent naming, theories now implement notifySharedTerm (previously TheoryEngine called theory-independent addSharedTermInternal which called addSharedTerm, this is updated now to addSharedTerm/notifySharedTerm). Other methods will not be standardized yet e.g. preRegisterTerm, since they vary per theory. FYI @barrettcw Each theory on the branch https://github.com/ajreynol/CVC4/tree/centralEe conforms to this template (e.g. they do not override check or collectModelInfo). The next step will be to pull the new implementation of each Theory from that branch.
2020-08-21Remove spurious theory methods calls (#4931)Andrew Reynolds
This PR removes spurious theory method calls that are not implemented. It also renames a common "propagate(TNode lit)" pattern to "propagateLit(TNode lit)" to avoid confusion with "propagate(Effort e)".
2020-08-21Connect the relevance manager to TheoryEngine and use it in non-linear ↵Andrew Reynolds
arithmetic (#4930) This PR activates the use of the relevance manager in TheoryEngine and makes use of it (via Valuation) in the non-linear extension in arith. It removes a deprecated hack (addTautology) for doing this. This addresses CVC4/cvc4-projects#113. Note that the best method for relevance is interleaving, where roughly you gain on SMT-LIB: QF_NIA: +484-53 unsat +792-440 sat QF_NRA: +32-19 unsat +57-23 sat However, this PR does not (yet) enable this method by default. Note that more work is necessary to determine which lemmas require NEEDS_JUSTIFY, this PR identifies 2 cases of lemmas that need justification (skolemization and strings reductions). Regardless, the use of the relevance manager is limited to non-linear arithmetic for now, which is only able to answer "sat" when only arithmetic is present in assertions.
2020-08-19Simplify trigger notifications in equality engine (#4921)Andrew Reynolds
This is further work towards a centralized approach for equality engines. This PR merges the eqNotifyTriggerEquality callback with the eqNotifyTriggerPredicate callback, and adds assertions that capture the current behavior. It furthermore makes addTriggerEquality private in equality engine and invoked as a special case of addTriggerPredicate. Note this PR does not impact the internal implementation of these methods in equality engine, which indeed is different. There are two reasons to merge these callbacks: (1) all theories implement exactly the same method for the two callbacks, whenever they implement both. It would be trivial to do something different (by case splitting on the kind of predicate that is being notified), and moreover it is not recommended they do anything other than immediately propagate the predicate (regardless of whether it is an equality). (2) It leads to some confusion with eqNotifyTriggerTermEquality, which is invoked when two trigger terms are merged.
2020-08-19Make sets and strings solver states inherit from TheoryState (#4918)Andrew Reynolds
This is towards the new standard for theory solvers. This PR makes the custom states of sets and strings inherit from the standard base class TheoryState. It also makes a minor change to InferenceManager/SolverState to make sets more in line with the plan for a standard base class InferenceManager. Followup PRs will establish the official TheoryState classes for all other theories (which in most cases will be an instance of the base class).
2020-08-19Require `--strings-exp` when using `str.substr` (#4916)Andres Noetzli
Fixes #4915. Previously, `str.substr` did not require `--strings-exp`. However, when `--strings-exp` is not active, we do not send terms to the extended solver for registration, which meant that `str.substr` was never reduced. This commit adds `str.substr` to the operators that require `--strings-exp`.
2020-08-19Changes assertion (about maximum set cardinality) to an exception. (#4907)Gereon Kremer
Changes the assertion that checks for the maximum cardinality of set models to an exception, following #4374. Also cleans up the code around it: previously, the Rational was checked against LONG_MAX, converted to std::uint32_t and then stored into an unsigned. Now we use std::uint32_t all the way. Fixes #4374.
2020-08-17Dynamic allocation of equality engine in Theory (#4890)Andrew Reynolds
This commit updates Theory so that equality engines are allocated dynamically. The plan is to make this configurable based on the theory combination method. The fundamental changes include: - Add `d_equalityEngine` (raw) pointer to Theory, which is the "official" equality engine of the theory. - Standardize methods for initializing Theory. This is now made more explicit in the documentation in theory.h, and includes a method `finishInitStandalone` for users of Theory that don't have an associated TheoryEngine. - Refactor TheoryEngine::finishInit, including how Theory is initialized to incorporate the new policy. - Things related to master equality engine are now specific to EqEngineManagerDistributed and hence can be removed from TheoryEngine. This will be further refactored in forthcoming PRs. Note that the majority of changes are due to changing `d_equalityEngine.` to `d_equalityEngine->` throughout.
2020-08-17(proof-new) Prepare the theory of strings for proof reconstruction (#4885)Andrew Reynolds
This updates the internal data structure for strings inferences "InferInfo" such that it is amenable to proof reconstruction. Currently, the explanation for a conclusion is in two parts: (1) d_ant, the antecedents that can be explained by the current equality engine, (2) d_antn, the antecedents that cannot. For proof reconstruction, the order of some antecedents matters. This is difficult with the above data structure since elements in these two vectors are not given an ordering relative to each other. After this PR, we store: (1) d_ant, all antecedants, which are ordered in a way that is amenable to proofs (to be introduced on following PRs), (2) d_noExplain, the subset of d_ant that cannot be explained by the current equality engine. This PR modifies calls to InferenceManager in preparation for extending it with a proof reconstructor InferProofCons which will convert strings::InferInfo into instructions for building ProofNode.
2020-08-16Add non-emptiness to conclusion of positive RE star unfolding. (#4899)Andrew Reynolds
A recent commit (77e9881) simplified the form of the conclusion in regular expression star unfolding for the sake of uniformity in our internal proof calculus. However, this led to a noticeable drop in performance on a few specific RE benchmarks (the Norn set). This preserves the old behavior by extending the core rule for RE unfolding. It also makes one minor change to the strings proof checker carried over from the proof-new branch.
2020-08-15Minor cleanup related to notifications (#4898)Andrew Reynolds
This includes eliminating TheoryBV's call to eqNotifyNewEqClass and fixing an issue with string's eqNotifyNewEqClass method, which was registering constant integers. It also removes some unnecessary methods in Theory.
2020-08-15(proof-new) Add the strings proof checker (#4858)Andrew Reynolds
It also adds enumeration for two new rules that have been recently added.
2020-08-14Simplify equality engine notifications (#4896)Andrew Reynolds
Previously, there was methods for being informed just before and just after equivalence classes are merged (eqNotifyPreMerge and eqNotifyPostMerge). The purpose of this was to allow e.g. the theory to inspect the equivalence classes in question before the equality engine modified them. However this is no longer used, and moreover is discouraged since Theory should generally buffer their usage of EqualityEngine while it is making internal calls. TheoryStrings was the only theory to use eqNotifyPreMerge (somewhat arbitrarily), all others used eqNotifyPostMerge. This makes post-merge the default, renames it to eqNotifyMerge and removes pre notifications. This will simplify the work of the new theory combination methods as well as leading to fewer spurious calls to callbacks in equality engine.
2020-08-12(proof-new) Proof support in the strings term registry. (#4876)Andrew Reynolds
Adds basic support for proofs in the strings term registry. This code is not yet active until further parts of proof-new are merged.
2020-08-09Make valuation class more robust to null underlying TheoryEngine. (#4864)Andrew Reynolds
In some use cases (unit tests, old proofs infrastructure), we use a Theory with no associated TheoryEngine. This PR makes the Valuation class more robust to this case. This includes making the "unevaluated kinds" a no-op in this case (this is necessary for Theory::finishInit with no TheoryEngine) and adding some assertions to cases that the Theory should never call without TheoryEngine. This is required for a new policy for dynamically configuring equality engine infrastructure in Theory.
2020-08-06(proof-new) Refactor regular expression operation (#4596)Andrew Reynolds
This refactors the regular expression operation class so that some of its key methods are static, so that they can used by both the regular expression solver and the strings proof checker. Notice that many cases of regular expression unfolding are deleted by this PR, since they are unnecessary. This is due to the fact that all regular expression memberships are rewritten except those whose RHS are re.++ or re.*.
2020-08-05[Strings] Add eager context-dependent evaluation (#4847)Andres Noetzli
This commit adds eager evaluation of string terms based on the current context. To do so, it declares the string kinds to be "interpreted" in the equality engine. This allows us to avoid making a series of decisions such as: ** (= "describe" (str.substr actionName 0 8)) :DE-DECISION: *** (= actionName "deletecertificate") :DE-DECISION: **** (= resource_partition "aws") :DE-DECISION: ***** (= resource_region "af-south-1") :DE-DECISION: ****** (= resource_account "") :DE-DECISION: ******* (= (str.len (str.substr actionName 0 3)) 0) :DE-DECISION: ******** (= (str.len (str.substr actionName 0 4)) 0) :DE-DECISION: ********* (= (str.len (str.substr actionName 0 6)) 0) :DE-DECISION: ********** (= (str.len (str.substr actionName 0 5)) 0) :DE-DECISION: *********** (= (str.len (str.substr actionName 0 9)) 0) :DE-DECISION: ************ (= (str.len (str.substr actionName 0 7)) 0) :DE-DECISION: ************* (= (str.len (str.substr actionName 0 10)) 0) :DE-DECISION: ************** (= (str.len (str.substr actionName 0 2)) 0) :DE-DECISION: *************** (= (str.len (str.substr actionName 0 13)) 0) :DE-DECISION: **************** (= (str.len (str.substr actionName 0 12)) 0) :DE-DECISION: ***************** (= (str.len resource_service) 0) :DE-DECISION: ****************** (= (str.len resource_account) 0) :DE-DECISION: In such a case, we can detect that there is a conflict after the first two decisions because (str.substr "deletecertificate" 0 8) is deletece which is different from describe. The equality engine uses the rewriter to evaluate interpreted kinds with constant arguments. This technique leads to a significant speedup on some of the newer Amazon benchmarks.
2020-08-01Ensure strict length constraint for decompose rule (#4821)Andres Noetzli
Fixes #4820. The performance issue was caused by 0988217562006d3f59e01dc261f39121df6d75f5. That commit introduced an option (`--strings-len-conc`) that optionally moves the length constraint for skolems to the conclusion of an inference instead of making it part of the term registration. However, for `DEQ_DISL_STRINGS_SPLIT`, we were only asserting that `LENGTH_GEQ_ONE` in the case where this option was not enabled, instead of asserting that the length of the skolem is equal to the component on the other side of the disequality. This lead to an infinite loop of inferences because we effectively were just splitting a component into two skolems and the only restriction was that the first one was non-empty. We guessed the second skolem to be empty, so the first skolem was equal to the component, the skolem was marked congruent, and we ended up with the same normal form as before. The commit fixes the issue by adding an argument to `getDecomposeConclusion()` that specifies whether to add the length constraint or not. This option is used to always add the length constraint in the case of `DEQ_DISL_STRINGS_SPLIT`.
2020-08-01Fix component contains for splicing due to substring. (#4705)Andrew Reynolds
Fixes #4701. That benchmark now times out.
2020-07-28Fix regular expression delta for complement (#4765)Andrew Reynolds
Fixes #4759 . Also refactors this method.
2020-07-28Supporting seq.nth (#4723)yoni206
This PR adds support for seq.nth operator by eliminating it during expandDefinitions, based on sub-sequences. Tests that use this operator are also included.
2020-07-17(proof-new) Updates to strings core solver (#4642)Andrew Reynolds
This updates the core strings solver in preparation for proofs. The main changes include: The addition of the strings PfRule enum values. The definition of a "getConclusion" static method, used by the core solver, and to be used in the future by the strings proof checker. This includes several optional forms of lemmas, which are added as options in this PR. Major simplifications to our inference schemas for disequality handling (a STRING_DECOMPOSE inference rule). Note this is the only significant intended behavioral change in this PR. Minor updates to the form of inferences send to inference manager, for instance to orient equalities in the expected way, and to reorder assumptions. These changes are done for uniformity and make the strings proof reconstruction from inference steps easier.
2020-07-16Make ExtTheory a utility and not a member of Theory (#4753)Andrew Reynolds
Previously, we assumed that ExtTheory, the module for doing context-dependent simplification, was one-to-one with Theory. This design is not necessary. This makes this class a utility, which can be used as needed. This makes e.g. the initialization of TheoryStrings much easier, since the ExtTheory object can be created first.
2020-07-14(proof-new) Skeleton proof support in the Rewriter (#4730)Andrew Reynolds
This adds support for skeleton proofs in the rewriter (REWRITE -> THEORY_REWRITE). It adds "extended equality rewrite" as a new method of the rewriter/theory rewriters. The unit test of this feature should be added on a followup PR. Co-authored-by: Andres Noetzli <andres.noetzli@gmail.com>
2020-07-12Add support for string/sequence update (#4725)Andrew Reynolds
This adds basic support for string/sequence updating, which has a semantics that is length preserving.
2020-07-12Remove ExprSequence (#4724)Andres Noetzli
Now that we can break the old API, we don't need an ExprSequence anymore. This commit removes ExprSequence and replaces all of its uses with Sequence. Note that I had to temporarily make sequence.h public because we currently include it in a "public" header because we still generate the code for Expr::mkConst<Sequence>().
2020-07-10(proof-new) Update Theory interface for proof-new (#4648)Andrew Reynolds
This includes 4 changes: Theory constructor takes a ProofNodeManager, Theory::explain returns a TrustNode (of kind PROP_EXP), Theory::expandDefinitions returns a TrustNode (of kind REWRITE), Theory::ppRewrite returns a TrustNode (of kind REWRITE). These are all currently planned updates to the interface of Theory. This PR also connects some of the existing proof rule checkers into the proof checker, if one is provided to the constructor. It updates TheoryEngine and other places to process TrustNode in trivial ways (converting them into Node). These calls will later be updated as needed for proof support. This PR is also contingent on the performance tests for proof-new on SMT-LIB.
2020-07-06Front end support for sequences (#4690)Andrew Reynolds
With this PR, we now support a preliminary draft of a theory of sequences. This PR adds front end support for sequences in the smt2 parser, in the new API, adds regressions and unit tests for them. As discussed offline, many of the string kinds are given a sequence version in the external API, but not in the internal one. This means that a special case for such kinds, getKindHelper was added to Term. We do not yet support proper smt2 printing of sequences, which will require access to this code for Kind conversions (to distinguish e.g. str.++ from seq.++).
2020-07-01 Inferences and model construction taking into account seq.unit (#4607)Andrew Reynolds
Towards theory of sequences. This makes the strings solver handle seq.unit, which requires two new inferences and updates to its model construction. It also fixes a bug in the best content heuristic, which previously failed to update the best score.
2020-06-30Fix normal form for re.comp (#4676)Andrew Reynolds
Fixes #4674.
2020-06-28Fix non-termination issues in simpleRegExpConsume (#4667)Andrew Reynolds
There were two issues related to RE in bodies of re.* that accepted the empty string that led to non-termination in the rewriter for regular expressions. This also improves trace messages for simpleRegExpConsume. Fixes #4662.
2020-06-19(proof-new) Make static methods in re-elim (#4623)Andrew Reynolds
In preparation for coarse-grained rule for re-elim to be used by the solver and proof checker.
2020-06-19(proof-new) Updates to strings term registry (#4599)Andrew Reynolds
This makes it so that methods for constructing term registration lemmas are made into static methods, so that they can be used by both the term registry and proof checker.
2020-06-19Convert more uses of strings to words (#4584)Andrew Reynolds
Towards theory of sequences. This PR also adds support for sequences in default sygus grammars. Also removes an interface for mkEmptyWord which doesn't have an equivalent for sequences.
2020-06-19Clean the header file of TheoryStrings (#4272)Andrew Reynolds
Now conforms to coding guidelines. For the sake of ensuring that the aspects related to the strategy were maintained in one place, I split this to its own file, strategy.h/cpp.
2020-06-16Update copyright headers.Aina Niemetz
2020-06-15 Do RE derivation inference only for concrete constant RE (#4609)Andrew Reynolds
The RE derive inference was not designed to handle re.comp. This makes the application of this inference more conservative.
2020-06-12(proof-new) Minor updates to strings base solver (#4606)Andrew Reynolds
In preparation for proofs, this PR involves flattening AND and removing spurious true literals in conjunctions. This also updates our policy for applying cardinality, where in some rare cases we were applying cardinality for pairs of string terms for length 0 (e.g. "there is at most 1 eqc of length 0"), which is spurious due to our term registry which always splits on emptiness.
2020-06-12Cardinality-related inferences per type in theory of strings (#4585)Andrew Reynolds
Towards theory of sequences. This updates various inference steps in the theory of strings that are based on collecting all equivalence classes to be per string-like type.
2020-06-11Add rewrite for str.replace_re. (#4601)Andrew Reynolds
This was discovered due to a proof checking abnormality, where the checker surprisingly succeeded in proving that the reduced form for a str.replace_re was equivalent for 2 different sets of skolems after rewriting.
2020-06-10Add support for str.replace_re/str.replace_re_all (#4594)Andres Noetzli
This commit adds support for the last remaining string operators from the new SMT-LIB standard for the theory of strings. The commit adds the kinds, type checking, reductions, and evaluation rewrites for `str.replace_re` and `str.replace_re_all`.
generated by cgit on debian on lair
contact matthew@masot.net with questions or feedback