2020-01-29 16:54:48 (GMT)
Dear all,
I have found some time to start working on Avatar-lite or however we are going to call it in E.
I guess mostly Stephan, Simon and me will be discussing it, but if anyone wants they are more than free to join (especially people like @Gabriel Ebner who already implemented Avatar).
2020-01-29 17:5:0 (GMT)
ALright, that's very exciting.
Is Stephan around here?
Evatar (Abatar, like @Jasmin Blanchette likes to say) is a simplified version of Avatar where the SAT model isn't used to activate/freeze clauses, but simply to help make clauses shorter (and make branch simplifications easy, etc.). The SAT proofs of individual literals (literals assigned at level 0) can still be used to prune the superposition prover's set of boolean assumptions, so that's quite useful.
2020-01-29 17:5:51 (GMT)
@Simon Cruanes , I have quickly went through your implementation and it is quite impressive. So, if I understand correctly, your avatar works as in the following description.
Clause is augmented to be , where is a conjunction of labels (called assertion), each label standing for a component of (another) clause.
Whenever a generating inference is performed, resultant clause is asserted under the union of the assertions of parents.
Whenever a simplifying inference is performed : (1) if it is a unary simplifying inference then no changes are done to the assertion of the resulting clause; (2) if it is a binary simplifying inference, the clause that is simplifying the active (given) clause must have assertion for which
In the background, there is a SAT solver that contains all assertions.
Whenever we determine that clausal (non-assertion) part of a clause can be split into , where assertions assigned to each part are , we add the following boolean clauses to SAT state:
,
...
.
2020-01-29 17:7:26 (GMT)
Finally, when clause is created, we add the following boolean clause to the SAT state:
2020-01-29 17:14:39 (GMT)
That seems like an accurate description.
The whole thing is worth doing because the boolean assumptions are much cheaper to handle than anything else, even in the subsumption test for simplifying using (it's OCaml's set's subset check which is afaik).
This is also useful for custom case splitting (e.g. in the implementation of induction) because you can directly create a branch for each case by allocating a boolean literal for it, and if branches are incompatible you simply have to assert to the SAT solver.
2020-01-29 17:19:16 (GMT)
Now, I have two questions:
When a new clause which is unsplittable, do you check if is a component?
If is an component, then we can create boolean clause
where is a boolean encoding of C.
2020-01-29 17:23:22 (GMT)
2020-01-29 17:24:46 (GMT)
2020-01-29 17:29:11 (GMT)
2020-01-29 17:31:17 (GMT)
2020-01-29 17:33:20 (GMT)
2020-01-30 17:39:23 (GMT)
Some things we want to consider:
E already has code for detecting and reasonably efficiently maintain splittable clauses with boolean definitions for the "splitting without backtracking" that never really took off for me.
2020-01-30 17:46:23 (GMT)
To me that's still all in one proof state. Evatar/Evader(?) really is just a better alternative to "splitting without backtracking" (with, imho, no downside, only upsides, as you get small clauses and branch simplification for a reasonably low implementation cost and 0 calculus-level cost).
2020-01-30 18:26:7 (GMT)
@Simon Cruanes I have an addendum to that:)
Right now I am running some evaluations and there are cases where configuration with Evatar performs worse than configuration without Evatar
(I hate the name Evader)
Few remarks about Avatar:
Rewriting on incompatible assertions is not implemented. Question: Can we achieve something by staining the demodulated clause with the assertions from the demodulator?
If is derived, shall we add where is the negated (and skolemized) original component?
2020-01-30 18:27:19 (GMT)
Some things we want to consider:
- Is Avatar one strategy to pick, or should each branch use its own strategy?
- Do we run this of one proof state, or does each branch get one proof state and we have a new master proof state/proofobject?
@Stephan Schulz There are no real branches here since we have no model. This avatar is really more like native treatment of names as in splitting with backtracking (or better yet support for interpretation of the names)
2020-01-30 18:31:40 (GMT)
Rewriting on incompatible assertions is not implemented. Question: Can we achieve something by staining the demodulated clause with the assertions from the demodulator?
I don't think so, unless you make it an inference step and not a simplification step. The simplification is incorrect because you removed a clause that is not redundant in some models. Or am I missing something you said?
Are there cases where Evatar is clearly worse than E with splitting without backtracking ?
2020-01-31 18:3:50 (GMT)
What I meant was more like if you have f(X) = g(X) <- 1 and a clause ...f(a)... <- 2, you can rewrite it into ....g(a)... <- 1,2
2020-01-31 18:4:39 (GMT)
I am waiting for the latest results when I will get back with the list of problems on which Zip is clearly slower, but on smoketest of 1000 problems
there are ~10 on which no avatar is beter than avatar
2020-01-31 18:31:4 (GMT)
OK, the results are:
avatar solves 37 problems non-avatar cannot
non-avatar solves 43 avatar cannot
2020-01-31 19:8:33 (GMT)
What I meant was more like if you have f(X) = g(X) <- 1 and a clause ...f(a)... <- 2, you can rewrite it into ....g(a)... <- 1,2
But then you lose completeness.
2020-01-31 19:10:54 (GMT)
Is there a way to compare with forced splitting-without-backtracking? Evatar could be enabled or disabled depending on heuristics, the same way as old school splitting.
2020-01-31 19:11:4 (GMT)
(Also, that was a quick implementation!! Did you reuse picosat?)
2020-01-31 19:16:57 (GMT)
Is there a way to compare with forced splitting-without-backtracking? Evatar could be enabled or disabled depending on heuristics, the same way as old school splitting.
Do you have Splitting without backtracking in Zip?
2020-01-31 19:17:21 (GMT)
(Also, that was a quick implementation!! Did you reuse picosat?)
I am talking about implementation in Zip. I am still experimenting with it.
2020-01-31 19:17:43 (GMT)
What I meant was more like if you have f(X) = g(X) <- 1 and a clause ...f(a)... <- 2, you can rewrite it into ....g(a)... <- 1,2
But then you lose completeness.
Are you sure? Can you explain me how?
2020-01-31 22:44:35 (GMT)
Do you have Splitting without backtracking in Zip?
No, because it'd also require more work to make sure the ordering picks splitting atoms last (otherwise you get back to non-split clauses). Never got much use for that.
Re: completeness: by simplifying with a clause f(x)=g(x) <- 1 the clause C <- 2, you basically remove the clause C <- 2, ¬1 from the search space without good reason. In the boolean model ¬1, 2, this clause may have been needed to prove and now it's gone.
2020-02-01 12:0:21 (GMT)
Do you have Splitting without backtracking in Zip?
Re: completeness: by simplifying with a clause
f(x)=g(x) <- 1the clauseC <- 2, you basically remove the clauseC <- 2, ¬1from the search space without good reason. In the boolean model¬1, 2, this clause may have been needed to prove and now it's gone.
Yeah, I realized that a little bit after I wrote the message :(
Here are my findings from a test of running no-Evatar Zip on all TPTP for 10s vs Evatar Zip on TPTP for 10s:
Some ideas:
SplitName *i* where i is some fresh integer. Now Builtins are already smaller than everything else and when two Builtins are compared we compare them by their i. This would avoid all complications I think.2020-02-01 18:31:8 (GMT)
Yeah, I realized that a little bit after I wrote the messaquote
That happens :-).
Evatar solves 10 more than no-Evatar Zip (counting only refutations).
- Many times no-Evatar saturates, whereas Evatar does not.
- It seems like in some cases, when you split a clause your heuristics get drawn by smaller and smaller clauses created by that branch. Now, if it is hard to refute that branch, you might get stuck there for a long time. Usually that branch is not important for a refutation, and you lose :( When me and Jasmin were working on AVATAR we realized that there Vampire sometihs switches models to prevent this behavior. In AVATAR there is no remedy for this.
- A solution that I found is to create a clause priority function called PreferShortTrails which will prefer clauses with short trails and get Zipperposition out of rabbit hole maybe.
That's very interesting. I've never been too interested in saturations (they're too unpredictable and just adding something like X=a|X=b will make E or Zipper easily diverge). However it'd still be nice to not be worse than Zip-no-evatar.
The switching model behavior is interesting, I didn't know about that. Intuitively, Evatar splits one problem into lots of smaller problems that should be easier to refute (smaller clauses), but it does indeed mean that you have to saturate all of them for a SAT input! So we could think of ways to detect if one branch is saturated (with a satisfiable trail as well) and stop there. The heuristic to favor short trails sounds good to me, indeed, there should be some fairness in branches that are processed. Or maybe penalize branches that are obtained after a lot of case splits (which is not exactly the same, since trails can shrink too).
- What if we spilt clauses when they are infered and forward simplified rather than when they are chosen? This has the advantage that if many propositional clauses are derived SAT solver will know about them and possibly solve the problem without waiting for the clauses to get picked?
it's doable, it's just a matter of doing the multi-clause simplifications more eagerly (without switching full Otter loop on the subsumption side). Worth trying imho.
- Simon Cruanes : How about we implement spliting-without-backtracking like this: Names will be Builtins constructed by
SplitName *i*where i is some fresh integer. Now Builtins are already smaller than everything else and when two Builtins are compared we compare them by their i. This would avoid all complications I think.
Sure, you can try that as well. I'm not a big fan of splitting-without-backtracking as it seems no-one is enthusiastic about them, and Vampire just removed them entirely afaik. But for experimenting it's a good idea.
2020-02-07 23:18:4 (GMT)
Another interesting case where Evatar isn't that optimal: ./zipperposition.exe examples/don_t_use_omega.p with and without --no-avatar. Without evatar, 21 iterations with a short proof; with it, 5075 iterations with a big proof (but note that this is not a complete calculus anyway).