DRAT-Optimization (#2971)

This commit enables DRAT-optimization, which consists of two sub-processes:

1. removing unnecessary instructions from DRAT proofs, and
2. not proving clauses which are not needed by DRAT proofs.

These changes dramatically shorten some bit-vector proofs, specifically proofs using lemmas in the ER, DRAT, and LRAT formats, since proofs in any of these formats are derived from a (now optimized!) DRAT proof produced by CryptoMiniSat.

What follows is a description of the main parts of this PR:

## DRAT Optimization

The DRAT-optimization is done by `drat-trim`, which is bundled with `drat2er`. The (new) function `ClausalBitVectorProof::optimizeDratProof` is our interface to the optimization machinery, and most of the new logic in this PR is in that function.

## CNF Representation

The ability to not prove unused clauses requires a slight architectural change as well. In particular, we need to be able to describe **which** subset of the original clause set actually needs to be proved. To facilitate this, when the clause set for CryptoMiniSat is first formed, it is represented as (a) a map from clause indices to clauses and (b) a list of indices. Then, when the CNF is optimized, we temporarily store a new list of the clauses in the optimized formula. This change in representation requires a number of small tweaks throughout the code.

## Small Fixes to Signatures

When we decided to check and accept two different kinds of DRAT, some of our DRAT-checking broke. In particular, when supporting one kind of DRAT, it is okay to `fail` (crash) when a proof fails to check. If you're supporting two kinds of DRAT, crashing in response to the first checker rejecting the proof denies the second checker an opportunity to check the proof. This PR tweaks the signatures slightly (and soundly!) to do something else instead of `fail`ing.
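
The CNF representation described above can be illustrated with a minimal, standalone sketch. It is not the CVC4 code: `ClauseId`, `Clause`, `ClauseStore`, and `toDimacs` are hypothetical stand-ins (the real code uses `prop::SatClause` and `printDimacs`, whose exact signatures are not reproduced here), and literals are assumed to be DIMACS-style signed integers. The point is only that a single map owns the clauses while index lists select which of them form a given formula.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for CVC4's ClauseId and prop::SatClause.
using ClauseId = std::uint64_t;
using Clause = std::vector<int>;  // DIMACS-style signed literals (assumption)

// One map owns every clause; index lists pick out (and order) a formula.
struct ClauseStore
{
  std::unordered_map<ClauseId, Clause> clauses;
  std::vector<ClauseId> originalIndices;  // every clause given to the solver
  std::vector<ClauseId> coreIndices;      // the subset that must be proved

  void registerClause(ClauseId id, const Clause& clause)
  {
    clauses.emplace(id, clause);
    originalIndices.push_back(id);
  }
};

// Serialize the formula selected by `indices` in DIMACS CNF format.
std::string toDimacs(const std::unordered_map<ClauseId, Clause>& clauses,
                     const std::vector<ClauseId>& indices)
{
  int maxVar = 0;
  for (ClauseId id : indices)
  {
    for (int lit : clauses.at(id))
    {
      maxVar = std::max(maxVar, std::abs(lit));
    }
  }
  std::ostringstream os;
  os << "p cnf " << maxVar << " " << indices.size() << "\n";
  for (ClauseId id : indices)
  {
    for (int lit : clauses.at(id))
    {
      os << lit << " ";
    }
    os << "0\n";
  }
  return os.str();
}
```

With this split, switching from the full formula to the optimized one only means passing a different index list; the clause map itself is never copied or rewritten.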
author: Alex Ozdemir <aozdemir@hmc.edu> 2019-06-05 12:16:46 -0700
committer: Andres Noetzli <noetzli@stanford.edu> 2019-06-05 12:16:46 -0700
commit: 9af5e9653582a18b1871dfc3774ab50dd24463ce (patch)
tree: 9bbe5cd5708dbd3475626cabd4d2c9711f0ac133 /src/proof/clausal_bitvector_proof.cpp
parent: c587235d29d2e3e1cd52a9f76dde8f58c89ae37e (diff)
1 files changed, 150 insertions, 33 deletions
diff --git a/src/proof/clausal_bitvector_proof.cpp b/src/proof/clausal_bitvector_proof.cpp
index eed295b1a..07589fd07 100644
--- a/src/proof/clausal_bitvector_proof.cpp
+++ b/src/proof/clausal_bitvector_proof.cpp
@@ -17,24 +17,35 @@
 #include "cvc4_private.h"
 
 #include <algorithm>
+#include <iostream>
 #include <iterator>
-#include <set>
+#include <unordered_set>
 
 #include "options/bv_options.h"
 #include "proof/clausal_bitvector_proof.h"
+#include "proof/dimacs.h"
 #include "proof/drat/drat_proof.h"
 #include "proof/er/er_proof.h"
 #include "proof/lfsc_proof_printer.h"
 #include "proof/lrat/lrat_proof.h"
 #include "theory/bv/theory_bv.h"
 
+#if CVC4_USE_DRAT2ER
+#include "drat2er_options.h"
+#include "drat_trim_interface.h"
+#endif
+
 namespace CVC4 {
 
 namespace proof {
 
 ClausalBitVectorProof::ClausalBitVectorProof(theory::bv::TheoryBV* bv,
                                              TheoryProofEngine* proofEngine)
-    : BitVectorProof(bv, proofEngine), d_usedClauses(), d_binaryDratProof()
+    : BitVectorProof(bv, proofEngine),
+      d_clauses(),
+      d_originalClauseIndices(),
+      d_binaryDratProof(),
+      d_coreClauseIndices()
 {
 }
 
@@ -69,33 +80,151 @@ void ClausalBitVectorProof::initCnfProof(prop::CnfStream* cnfStream,
 void ClausalBitVectorProof::registerUsedClause(ClauseId id,
                                                prop::SatClause& clause)
 {
-  d_usedClauses.emplace_back(id, clause);
+  d_clauses.emplace(id, clause);
+  d_originalClauseIndices.push_back(id);
 };
 
 void ClausalBitVectorProof::calculateAtomsInBitblastingProof()
 {
+  optimizeDratProof();
+
   // Debug dump of DRAT Proof
   if (Debug.isOn("bv::clausal"))
   {
     std::string serializedDratProof = d_binaryDratProof.str();
+    Debug("bv::clausal") << "option: " << options::bvOptimizeSatProof()
+                         << std::endl;
     Debug("bv::clausal") << "binary DRAT proof byte count: "
                          << serializedDratProof.size() << std::endl;
-    Debug("bv::clausal") << "Parsing DRAT proof ... " << std::endl;
-    drat::DratProof dratProof =
-        drat::DratProof::fromBinary(serializedDratProof);
-
-    Debug("bv::clausal") << "Printing DRAT proof ... " << std::endl;
-    dratProof.outputAsText(Debug("bv::clausal"));
+    Debug("bv::clausal") << "clause count: " << d_coreClauseIndices.size()
+                         << std::endl;
   }
 
   // Empty any old record of which atoms were used
   d_atomsInBitblastingProof.clear();
+  Assert(d_atomsInBitblastingProof.size() == 0);
 
   // For each used clause, ask the CNF proof which atoms are used in it
-  for (const auto& usedIndexAndClause : d_usedClauses)
+  for (const ClauseId usedIdx : d_coreClauseIndices)
   {
-    d_cnfProof->collectAtoms(&usedIndexAndClause.second,
-                             d_atomsInBitblastingProof);
+    d_cnfProof->collectAtoms(&d_clauses.at(usedIdx), d_atomsInBitblastingProof);
+  }
+}
+
+void ClausalBitVectorProof::optimizeDratProof()
+{
+  if (options::bvOptimizeSatProof()
+          == theory::bv::BvOptimizeSatProof::BITVECTOR_OPTIMIZE_SAT_PROOF_PROOF
+      || options::bvOptimizeSatProof()
+             == theory::bv::BvOptimizeSatProof::
+                    BITVECTOR_OPTIMIZE_SAT_PROOF_FORMULA)
+  {
+    Debug("bv::clausal") << "Optimizing DRAT" << std::endl;
+    char formulaFilename[] = "/tmp/cvc4-dimacs-XXXXXX";
+    char dratFilename[] = "/tmp/cvc4-drat-XXXXXX";
+    char optDratFilename[] = "/tmp/cvc4-optimized-drat-XXXXXX";
+    char optFormulaFilename[] = "/tmp/cvc4-optimized-formula-XXXXXX";
+    int r;
+    r = mkstemp(formulaFilename);
+    AlwaysAssert(r > 0);
+    close(r);
+    r = mkstemp(dratFilename);
+    AlwaysAssert(r > 0);
+    close(r);
+    r = mkstemp(optDratFilename);
+    AlwaysAssert(r > 0);
+    close(r);
+    r = mkstemp(optFormulaFilename);
+    AlwaysAssert(r > 0);
+    close(r);
+
+    std::ofstream formStream(formulaFilename);
+    printDimacs(formStream, d_clauses, d_originalClauseIndices);
+    formStream.close();
+
+    std::ofstream dratStream(dratFilename);
+    dratStream << d_binaryDratProof.str();
+    dratStream.close();
+
+#if CVC4_USE_DRAT2ER
+    int dratTrimExitCode =
+        drat2er::drat_trim::OptimizeWithDratTrim(formulaFilename,
+                                                 dratFilename,
+                                                 optFormulaFilename,
+                                                 optDratFilename,
+                                                 drat2er::options::QUIET);
+    AlwaysAssert(
+        dratTrimExitCode == 0, "drat-trim exited with %d", dratTrimExitCode);
+#else
+    Unimplemented(
+        "Proof production when using CryptoMiniSat requires drat2er.\n"
+        "Run contrib/get-drat2er, reconfigure with --drat2er, and rebuild");
+#endif
+
+    d_binaryDratProof.str("");
+    Assert(d_binaryDratProof.str().size() == 0);
+
+    std::ifstream lratStream(optDratFilename);
+    std::copy(std::istreambuf_iterator<char>(lratStream),
+              std::istreambuf_iterator<char>(),
+              std::ostreambuf_iterator<char>(d_binaryDratProof));
+
+    if (options::bvOptimizeSatProof()
+        == theory::bv::BvOptimizeSatProof::BITVECTOR_OPTIMIZE_SAT_PROOF_FORMULA)
+    {
+      std::ifstream optFormulaStream{optFormulaFilename};
+      std::vector<prop::SatClause> core = parseDimacs(optFormulaStream);
+      optFormulaStream.close();
+
+      // Now we need to compute the clause indices for the UNSAT core. This is a
+      // bit difficult because drat-trim may have reordered clauses, and/or
+      // removed duplicate literals. We use literal sets as the canonical clause
+      // form.
+      std::unordered_map<
+          std::unordered_set<prop::SatLiteral, prop::SatLiteralHashFunction>,
+          ClauseId,
+          prop::SatClauseSetHashFunction>
+          cannonicalClausesToIndices;
+      for (const auto& kv : d_clauses)
+      {
+        cannonicalClausesToIndices.emplace(
+            std::unordered_set<prop::SatLiteral, prop::SatLiteralHashFunction>{
+                kv.second.begin(), kv.second.end()},
+            kv.first);
+      }
+
+      d_coreClauseIndices.clear();
+      std::unordered_set<prop::SatLiteral, prop::SatLiteralHashFunction>
+          coreClauseCanonical;
+      for (const prop::SatClause& coreClause : core)
+      {
+        coreClauseCanonical.insert(coreClause.begin(), coreClause.end());
+        d_coreClauseIndices.push_back(
+            cannonicalClausesToIndices.at(coreClauseCanonical));
+        coreClauseCanonical.clear();
+      }
+      Debug("bv::clausal") << "Optimizing the DRAT proof and the formula"
+                           << std::endl;
+    }
+    else
+    {
+      Debug("bv::clausal") << "Optimizing the DRAT proof but not the formula"
+                           << std::endl;
+      d_coreClauseIndices = d_originalClauseIndices;
+    }
+
+    Assert(d_coreClauseIndices.size() > 0);
+    remove(formulaFilename);
+    remove(dratFilename);
+    remove(optDratFilename);
+    remove(optFormulaFilename);
+    Debug("bv::clausal") << "Optimized DRAT" << std::endl;
+  }
+  else
+  {
+    Debug("bv::clausal") << "Not optimizing the formula or the DRAT proof"
+                         << std::endl;
+    d_coreClauseIndices = d_originalClauseIndices;
   }
 }
 
@@ -120,10 +249,9 @@ void LfscClausalBitVectorProof::printBBDeclarationAndCnf(std::ostream& os,
   d_cnfProof->printAtomMapping(d_atomsInBitblastingProof, os, paren, letMap);
 
   os << "\n;; BB-CNF proofs\n";
-  for (const auto& idAndClause : d_usedClauses)
+  for (const ClauseId id : d_coreClauseIndices)
   {
-    d_cnfProof->printCnfProofForClause(
-        idAndClause.first, &idAndClause.second, os, paren);
+    d_cnfProof->printCnfProofForClause(id, &d_clauses.at(id), os, paren);
   }
 }
 
@@ -137,13 +265,8 @@ void LfscDratBitVectorProof::printEmptyClauseProof(std::ostream& os,
   os << "\n;; Proof of input to SAT solver\n";
   os << "(@ proofOfSatInput ";
   paren << ")";
-  std::vector<ClauseId> usedIds;
-  usedIds.reserve(d_usedClauses.size());
-  for (const auto& idAnd : d_usedClauses)
-  {
-    usedIds.push_back(idAnd.first);
-  };
-  LFSCProofPrinter::printSatInputProof(usedIds, os, "bb");
+
+  LFSCProofPrinter::printSatInputProof(d_coreClauseIndices, os, "bb");
 
   os << "\n;; DRAT Proof Value\n";
   os << "(@ dratProof ";
@@ -166,19 +289,13 @@ void LfscLratBitVectorProof::printEmptyClauseProof(std::ostream& os,
   os << "\n;; Proof of input to SAT solver\n";
   os << "(@ proofOfCMap ";
   paren << ")";
-  std::vector<ClauseId> usedIds;
-  usedIds.reserve(d_usedClauses.size());
-  for (const auto& idAnd : d_usedClauses)
-  {
-    usedIds.push_back(idAnd.first);
-  };
-  LFSCProofPrinter::printCMapProof(usedIds, os, "bb");
+  LFSCProofPrinter::printCMapProof(d_coreClauseIndices, os, "bb");
 
   os << "\n;; DRAT Proof Value\n";
   os << "(@ lratProof ";
   paren << ")";
-  lrat::LratProof pf =
-      lrat::LratProof::fromDratProof(d_usedClauses, d_binaryDratProof.str());
+  lrat::LratProof pf = lrat::LratProof::fromDratProof(
+      d_clauses, d_coreClauseIndices, d_binaryDratProof.str());
   pf.outputAsLfsc(os);
   os << "\n";
 
@@ -194,8 +311,8 @@ void LfscErBitVectorProof::printEmptyClauseProof(std::ostream& os,
          "the BV theory should only be proving bottom directly in the eager "
          "bitblasting mode");
 
-  er::ErProof pf =
-      er::ErProof::fromBinaryDratProof(d_usedClauses, d_binaryDratProof.str());
+  er::ErProof pf = er::ErProof::fromBinaryDratProof(
+      d_clauses, d_coreClauseIndices, d_binaryDratProof.str());
 
   pf.outputAsLfsc(os);
 }