2019-08-01 14:17:22 (GMT)
Hi @Simon Cruanes ,
I've started to work on proof checking CNF. One problem is the quantifiers, because the low-level prover can only prove ground problems. Your idea to use the variable names to identify what has to be instantiated with what is very cool, but it seems that Skolemization renames variables (search for Var.copy in Cnf.ml). Do you know why variables get renamed there?
2019-08-01 14:54:16 (GMT)
This copy might be unnecessary; it's there because it's cleaner to rename variables on the fly when performing a substitution. You can try to remove it to see how it goes :p
Another possibility is to track skolemization as a separate proof step which also records the variable mappings explicitly (so it's easy to remember what corresponds to what).
2019-08-01 14:54:53 (GMT)
Semi-related: sidekick compiles again so we could try to use it for the LLProver at some point (to see if it'd help with performance, at least), on a branch.
2019-08-01 15:05:32 (GMT)
Ok, I'll try to remove it. Concerning sidekick, I'll first try to improve the coverage of proof checking, and work on efficiency later...
2019-08-01 15:06:26 (GMT)
Of course, I'm just mentioning it as a potential thing to explore in the future (it should be more efficient than the naive Tableau), but ofc make it correct first, fast second.
2019-08-01 16:20:34 (GMT)
@Alexander Bentkamp are you working from master on that?
2019-08-01 16:23:15 (GMT)
No, from lzip
2019-08-01 16:27:02 (GMT)
Alright! Good luck!
2019-08-06 17:47:51 (GMT)
Hi @Simon,
I have started to work on proof checking for CNF-transformation. I currently focus on solving the problem of shifting scopes of quantifiers. My example is:
q ∨ (∀ x. p x)
which CNF-transforms into
∀ x. (p x ∨ q)
My approach is the following: When converting a proof step from Proof to LLProof, you implemented a mechanism to ground all formulas using fresh skolem constants. I switched that off completely. Instead, I go through all subterms of the step's conjecture that start with a quantifier and add a corresponding instance of the axiom of choice as a premise to the proof step. In the example that would be
(p sk ∨ q) ⇒ (∀ x. p x ∨ q)
Then, I go through all subterms of the step's premises that start with a quantifier and add a corresponding instance of the axiom of instantiation as a premise to the proof step. In the example that would be
(∀ x. p x) ⇒ p sk
Using the variable names, I can identify which skolem to use. With these additional premises, the ground prover can prove the step correct.
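In schema form (just generalizing the example above), for a quantified subterm ∀ x. φ and the skolem constant sk tied to it via the variable name, the two extra premises are:
φ[sk/x] ⇒ (∀ x. φ)   (choice, for quantifiers in the conclusion)
(∀ x. φ) ⇒ φ[sk/x]   (instantiation, for quantifiers in the premises)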
Now I am tempted to use this approach not only in CNF steps, but everywhere. I would remove your implementation of grounding with fresh skolems completely and replace it by this more general approach. What do you think? It's a trade-off between the amount of trusted code and efficiency.
2019-08-06 18:12:49 (GMT)
I'd rather not disable it for other steps, intuitively, for now. First, I'm not sure you can rely on the same trick of using the variable name in Term.t-based steps (where variables are nameless). Second, efficiency matters, we're not trying to have as small a kernel as possible imho [1].
A side note for efficiency: for some problems the current checker is a bottleneck, as I already mentioned. For example, see examples/data/mult_300.zf. I used the profiling script with perf and noticed that a lot of the overhead seems to be in term conversions (going from large terms to large TypedSTerm.t to large LLTerm.t). It might be very interesting to cut there and convert directly to LLTerm.t without the intermediate step, once the checking of CNF and HO is complete.
[1] rather, if you want more trust, the checker should have the capability to output Isabelle/Coq/Dedukti traces that it reconstructs from its lower level, more precise proofs (Proof.t is far too high level anyway).
2019-08-06 18:39:19 (GMT)
Note: I might have missed some part of your idea. If you think this mechanism of instantiating choice could also work for normal steps, I'm open to hear it. It'd be particularly interesting, performance-wise, if it could occur after the step is a LLTerm.t (i.e. first convert, then instantiate DB vars with skolems).
2019-08-06 18:40:53 (GMT)
Other note: I have a branch with sidekick replacing the LLProver. It's faster but there's still a lot of overhead on mult_300.zf because of term conversion. I think it should be the way to go eventually as it would handle theories, boolean subterms, etc. properly.
Otherwise I can help to design and write a less naive Tableau+CC prover when the time comes.
2019-08-07 10:14:18 (GMT)
I'm not sure you can rely on the same trick of using the variable name in Term.t-based steps (where variables are nameless)
The free variables (i.e. implicitly universally quantified) of a clause have names. Your current implementation also uses those names to know which variables to ground with the same skolem symbol.
It'd be particularly interesting, performance-wise, if it could occur after the step is a LLTerm.t (i.e. first convert, then instantiate DB vars with skolems).
Why would that be better?
Other note: I have a branch with sidekick replacing the LLProver. It's faster but there's still a lot of overhead on mult_300.zf because of term conversion. I think it should be the way to go eventually as it would handle theories, boolean subterms, etc. properly.
Otherwise I can help to design and write a less naive Tableau+CC prover when the time comes.
Very cool! Maybe I should switch to that branch immediately? Would you then want to handle the instantiations/grounding as a theory, too?
2019-08-07 12:40:34 (GMT)
BTW, what is local_intros for? I don't see it used anywhere?
2019-08-07 22:43:09 (GMT)
BTW, what is local_intros for? I don't see it used anywhere?
That's a good question. It might be a remnant of some previous version of the code. Try to delete it and see if zipperposition --check returns the same results :grinning:
2019-08-07 23:04:25 (GMT)
Converting to LLTerm.t before instantiating would avoid instantiating twice (with and without the substitution) I think. Might be a detail.
2019-08-07 23:05:02 (GMT)
Not sure sidekick is particularly relevant for instantiation steps. You can try it whenever you want, it only modifies dune and the LLProver part.
2019-08-14 11:02:25 (GMT)
I have basic CNF checking now (advanced features like definition unfolding don't work yet). I've also just tried sidekick. Unfortunately, sidekick does not do the ∃ to ¬∀¬ conversion that we had previously. @Simon Cruanes Any tips on how to add that?
2019-08-14 11:24:01 (GMT)
This works, but is that how it's supposed to be done? I added in LLProver.ml, in cc_view:
| T.Bind {body;binder=Binder.Exists;ty_var} -> cc_view (T.Form.not_ (T.bind ~ty:T.bool Binder.Forall ~ty_var (T.Form.not_ body)))
2019-08-14 13:34:24 (GMT)
I think this could be done in the conversion to LLTerm.t instead. We don't really need to distinguish exists and forall in the low level proof terms (a more canonical form should do).
2019-08-14 14:50:04 (GMT)
(I mean, converting an ∃ to LLTerm.t would directly give the ∀ version + 2 negations)
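Something like this toy sketch, with a made-up term type (not the real TypedSTerm.t/LLTerm.t conversion, just the idea):

type term =
  | Const of string
  | App of term * term
  | Not of term
  | Forall of string * term
  | Exists of string * term  (* only occurs on the source side *)

(* during conversion, ∃x. φ becomes ¬(∀x. ¬φ), so the low-level terms
   only ever contain ∀ *)
let rec to_ll (t : term) : term =
  match t with
  | Const _ -> t
  | App (f, a) -> App (to_ll f, to_ll a)
  | Not u -> Not (to_ll u)
  | Forall (x, body) -> Forall (x, to_ll body)
  | Exists (x, body) -> Not (Forall (x, Not (to_ll body)))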
2019-08-14 20:10:33 (GMT)
Yes, I prefer that, too. Done!
2019-08-15 13:59:45 (GMT)
Sidekick seems to be able to beta-reduce and also to rewrite under quantifiers, but it does not beta-reduce under quantifiers. How can I change that?
2019-08-15 14:16:24 (GMT)
I think I understand how rewriting under binders is possible, but how does beta-reduction work?
2019-08-15 14:45:56 (GMT)
I believe the β-reduction is done as part of the translation to LLTerm.t (or possibly as part of the instantiation). To get something truly robust we need a theory of β-reduction _inside_ the SMT solver, which would work modulo congruence closure (ie. from f = λx. u and f a it would add f a = u[a/x]).
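A rough sketch of that rule, with a toy term type and invented helper names (not sidekick's actual API; the real thing would hook into the CC):

type term =
  | Var of string
  | Const of string
  | App of term * term
  | Lam of string * term

(* naive, capture-unaware substitution; good enough when the argument
   is a skolem constant *)
let rec subst x by t =
  match t with
  | Var y -> if y = x then by else t
  | Const _ -> t
  | App (f, a) -> App (subst x by f, subst x by a)
  | Lam (y, u) -> if y = x then t else Lam (y, subst x by u)

(* [repr] stands for the congruence-closure representative of a term;
   whenever an application's head is known equal to a lambda, emit the
   equation f a = u[a/x] back to the CC *)
let beta_equations ~(repr : term -> term) (t : term) : (term * term) list =
  match t with
  | App (f, a) ->
    (match repr f with
     | Lam (x, u) -> [ (t, subst x a u) ]
     | _ -> [])
  | _ -> []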
2019-08-15 16:24:40 (GMT)
So that would be a second theory besides the boolean theory that you've already added, right?
2019-08-15 16:27:50 (GMT)
Yes indeed. The boolean theory is responsible for turning connectives into clauses. For a proper full checker we'd need something for β-reduction (modulo congruence closure), and possibly LRA/LIA.
2019-08-15 18:44:27 (GMT)
Ah, and datatypes.
An upside to this is that we could also use the resulting SMT for doing AVATAR modulo theories. But I don't suppose you're very interested in that :-)
2019-08-16 09:39:02 (GMT)
The theory of beta-eta-equivalence is not as easy to do as I thought. Apparently, a simplifier that beta-eta-reduces is not enough. Am I right that congruence closure only rewrites in one direction? For example, if (λ x y, f y x) = g and g a b ≠ f b a are given, CC might not find the contradiction because it decided to rewrite (λ x y, f y x) → g and not g → (λ x y, f y x).
2019-08-16 9:42:52 (GMT)
A second problem is that Sidekick currently adds equations of DB indices to the congruence closure. For example, when it encounters the term (λx y, x = y), it adds DB0 = DB1 to the congruence closure.
2019-08-16 13:27:06 (GMT)
A second problem is that Sidekick currently adds equations of DB indices to the congruence closure. For example, when it encounters the term (λx y, x = y), it adds DB0 = DB1 to the congruence closure.
Hmm right, we need to think that through. Typically we could make non-closed terms opaque to the CC, I think.
As for the theory of β-reduction it's indeed not that easy; like a lot of syntactic theories it can be reduced, in theory, to E-matching the axiom (cc @Haniel Barbosa :stuck_out_tongue_wink:) or to writing custom code that does the same. Also see the classic SMT theory of arrays, which is basically that (arrays are restricted functions, select is application). So it's doable but a bit of work.
2019-08-16 13:50:44 (GMT)
boy does @Simon Cruanes love trigger-based instantiation
2019-08-16 13:51:20 (GMT)
(it's a very good way to just have something working rather than a custom procedure, anyways)
2019-08-16 13:51:22 (GMT)
:joy: well sometimes it's a complete decision procedure, right?
2019-08-16 13:51:48 (GMT)
modulo a loooot of engineering challenges, yes :P
2019-08-16 13:52:17 (GMT)
Although it's probably still simpler, right now, to have a small custom theory of arrays, or of β-reduction. So I think we should do that (I need to add actual theories to sidekick anyway).
2019-08-16 13:52:47 (GMT)
theory of β-reduction? I did not follow the long discussion above
2019-08-16 13:53:02 (GMT)
you want to avoid getting rid of it at pre-processing?
2019-08-16 13:53:54 (GMT)
Basically, have sidekick be able to reduce/equate/merge f a to u[a/x] if f = λx. u is in the congruence closure. It's not enough to do it at preprocessing because it might be branch-dependent.
2019-08-16 13:54:45 (GMT)
right
2019-08-16 13:55:13 (GMT)
if you lift the lambda and have a proper instantiation it amounts to the same
2019-08-16 13:55:21 (GMT)
but that is what you were saying...?
2019-08-16 13:55:59 (GMT)
"amounts to the same", efficiency notwithstanding, I mean
2019-08-16 14:20:41 (GMT)
Typically we could make non-closed terms opaque to the CC, I think.
Do you mean this?
let cc_view (t:T.t) : _ V.t =
  match T.view t with
  [...]
  | T.Bind {body;binder;ty_var} -> V.Opaque t
That would indeed affect only a few rules, for which we might want to refer to the extensionality axiom anyway.
Alternatively, we could modify LLTerm.equals to make all Variables artificially different, even if they are identical. What do you think?
2019-08-16 14:30:12 (GMT)
@Alexander Bentkamp that's indeed what I mean. Congruence closure will not enter an Opaque term, it's like a constant (ie. only syntactic equality applies).
2019-08-16 14:32:48 (GMT)
But wait, how can the equality DB0 = DB1 end up in CC at all?
2019-08-16 14:33:25 (GMT)
Just because there is a term (λx y, x = y) doesn't mean that DB0 = DB1, right?
2019-08-16 14:33:47 (GMT)
I'll investigate how that can happen.
2019-08-16 14:42:26 (GMT)
Right, it's indeed weird. I can give you tips on how to debug what happens in sidekick if you want.
2019-08-16 14:45:46 (GMT)
Can I get sidekick's debug output when running zipperpos?
2019-08-16 14:46:14 (GMT)
Currently, I add a theory and use the hooks to get debug output.
2019-08-16 14:53:41 (GMT)
Currently, I add a theory and use the hooks to get debug output.
That's an interesting way of doing it! I hadn't thought about that too much, but it could be useful to have a module in sidekick for doing that more systematically.
Also you can enable sidekick's own debug level (see the executable part of the project, which has a debug option or something).
2019-08-19 08:45:13 (GMT)
@Simon Cruanes I think I found out why the equality DB0 = DB1 is added to the CC. The function add_new_term_ is called on the term DB0 = DB1, which I think is correct. But then compute_sig0 determines that it is an equation and in the end push_pending is called to queue the equality DB0 = DB1 to be added to the CC. Why is compute_sig0 there at all? Shouldn't this be the task of the Boolean theory?
2019-08-19 12:05:36 (GMT)
I have a theory for beta-equivalence now. I used triggers as we discussed. I am not sure if I used the "explanation" correctly though. I guess the explanation why I want to merge f a with g a a should be Expl.mk_merge "f" "λx. g x x", right?
2019-08-19 13:41:20 (GMT)
About compute_sig0: it's because equality is special to the congruence closure. But that's why we should probably wrap any non closed term in Opaque anyway. The content of Avatar atoms or lambda terms shouldn't matter too much to the CC, right?
Or we could flag some equalities as inactive, I don't know.
Re the beta-equivalence: awesome. Is it custom to LLTerm.t? The explanation should be correct, indeed, you just have to say "if f = λx. u then f a = u[a/x]".
2019-08-19 14:42:48 (GMT)
Hm, I still don't understand why compute_sig0 is sound, even if we make non-closed terms opaque. I haven't tested this, but wouldn't a term like f (c = d) also cause c and d to be merged in the CC?
2019-08-19 14:45:59 (GMT)
No, adding the term a = b doesn't merge a and b. It's when you assert it to be true (as a literal) that something happens.
2019-08-19 14:48:58 (GMT)
Is it custom to LLTerm.t?
Yes, sidekick does not know about lambdas, so it needs to be custom to LLTerm.
It's here, if you'd like to see it: https://github.com/benti/zipperposition/blob/esa_proofs_sidekick/src/proofs/LLProver.ml#L156
2019-08-19 14:52:20 (GMT)
I'll look, thanks. If you look at sidekick's codebase, you'll see that theories are functors over a module that defines terms, and a theory-specific view on these terms.
If you're ok with it I might port your theory into a functor :)
2019-08-19 14:53:48 (GMT)
sure
2019-08-19 14:56:47 (GMT)
No, adding the term a = b doesn't merge a and b. It's when you assert it to be true (as a literal) that something happens.
I think the function add_new_term_ accidentally asserts things. The function add_new_term_ should only add the term to the CC and not assert anything, right? But still, it calls push_pending, which asserts things, if I understand it correctly.
2019-08-19 14:58:39 (GMT)
That's weird that it behaves likes that then. It means propagation is meaningless (since things are true already)‽ If you could find a small repro you could file an issue somewhere :-)
Or a PR to sidekick, I'd love that. I can give you tips on how to work on sidekick (not too different from zipperposition).
2019-08-19 15:01:21 (GMT)
My fix would be to remove compute_sig0 :D
2019-08-19 15:02:19 (GMT)
Or maybe it should be called when a literal is asserted. I will have a look.
2019-08-19 15:03:24 (GMT)
I think compute_sig0 is fine, and that it's add_term_rec_ that shouldn't assert anything. It's in assert_lit and the likes that merges should happen.
2019-08-19 15:06:14 (GMT)
But assert_lit already looks for equalities and merges them.
2019-08-19 15:10:33 (GMT)
Which is why it's likely a bug if equalities lead to merges anywhere else (and certainly it shouldn't happen in compute_sig0)
2019-08-19 15:33:29 (GMT)
so what are these signatures for?
2019-08-19 15:36:28 (GMT)
The equality signature is used to propagate a=b whenever the classes of a and b are merged. It's also the main mechanism for conflicts: if you assert a ≠ b, it's actually (a=b) ~ false; then if a and b are merged, a=b and true are merged, so true ~ false, which has a builtin case for triggering a conflict.
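Concretely: asserting a ≠ b stores (a=b) ~ false; if you then assert a = c and c = b, the classes of a and b merge, the equality signature propagates (a=b) ~ true, and true ~ false triggers the conflict.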
2019-08-20 13:14:20 (GMT)
I know now why literals like DB0 = DB1 show up. They are added as atoms to the SAT solver in Sidekick_msat_solver.ml/add_bool_subterms_.
2019-08-20 13:15:33 (GMT)
Ah, that's it indeed. We need to modify that so it only does it outside of binders.
2019-08-20 13:15:36 (GMT)
We should probably make all non-ground terms opaque. But maybe we could add a theory that can handle non-ground terms?
2019-08-20 13:16:34 (GMT)
We need to modify that so it only does it outside of binders.
That would be nice, but this part of the code does not know about binders.
2019-08-20 13:20:11 (GMT)
Good point. Maybe I need to rethink slightly the add_bool_subterms so it can take a function to tell it to stop and not enter a subterm (when the subterm isn't closed).
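Something like this toy version of the traversal (invented types, not the actual add_bool_subterms_):

type term =
  | DB of int              (* de Bruijn variable *)
  | Const of string
  | App of term * term
  | Bind of term           (* any binder; the body refers to DB 0 *)

(* closed = every DB index is bound by an enclosing binder *)
let is_closed (t : term) : bool =
  let rec go depth = function
    | DB i -> i < depth
    | Const _ -> true
    | App (f, a) -> go depth f && go depth a
    | Bind body -> go (depth + 1) body
  in
  go 0 t

(* collect subterms, but never enter a subterm rejected by [enter] *)
let rec subterms ~(enter : term -> bool) (t : term) : term list =
  if not (enter t) then []
  else
    t ::
    (match t with
     | DB _ | Const _ -> []
     | App (f, a) -> subterms ~enter f @ subterms ~enter a
     | Bind body -> subterms ~enter body)

(* only closed subterms would become candidate boolean atoms *)
let candidate_atoms t = subterms ~enter:is_closed t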
2019-08-21 10:47:31 (GMT)
What part of the code is shared between Zipperposition and its proof checker? How independent is it? What kind of bugs would not be detected?
2019-08-21 13:05:31 (GMT)
The proof checker is relatively independent (that's why I decided to have a separate term representation, LLTerm.t, rather than reusing InnerTerm.t). Of course it's not 100% autonomous; possible hidden bugs could include:
- term translation bugs
- bugs in Id.t
- parsing bugs
2019-08-21 14:40:34 (GMT)
What part of the code is shared between Zipperposition and its proof checker?
Are you asking how trustworthy the checker is, @Alexander Bentkamp ? :slight_smile: I think it should be good for catching most of the bugs. If you want real independently checked certificates (which will probably take time) I think llprover could output some LFSC/dedukti/coq/… from its reconstructed proof, there's no other simple way.
2019-08-21 14:44:47 (GMT)
I am just trying to be prepared for the questions people will ask me :D