2019-08-01 14:17:22 (GMT)
Hi @Simon Cruanes ,
I've started to work on proof checking CNF. One problem is the quantifiers, because the low-level prover can only prove ground problems. Your idea to use the variable names to identify what has to be instantiated with what is very cool, but it seems that Skolemization renames variables (search for Var.copy in Cnf.ml). Do you know why variables get renamed there?
2019-08-01 14:54:16 (GMT)
This copy might be unnecessary; it's there because it's cleaner to rename variables on the fly when performing a substitution. You can try to remove it to see how it goes :p
Another possibility is to track skolemization as a separate proof step which also records the variable mappings explicitly (so it's easy to remember what corresponds to what).
2019-08-01 14:54:53 (GMT)
Semi-related: sidekick compiles again so we could try to use it for the LLProver at some point (to see if it'd help with performance, at least), on a branch.
2019-08-01 15:05:32 (GMT)
Ok, I'll try to remove it. Concerning sidekick, I'll first try to improve the coverage of proof checking, and work on efficiency later...
2019-08-01 15:06:26 (GMT)
Of course, I'm just mentioning it as a potential thing to explore in the future (it should be more efficient than the naive Tableau), but ofc make it correct first, fast second.
2019-08-01 16:20:34 (GMT)
@Alexander Bentkamp are you working from master on that?
2019-08-01 16:23:15 (GMT)
No, from lzip
2019-08-01 16:27:02 (GMT)
Alright! Good luck!
2019-08-06 17:47:51 (GMT)
Hi @Simon,
I have started to work on proof checking for CNF-transformation. I currently focus on solving the problem of shifting scopes of quantifiers. My example is:
q ∨ (∀ x. p x)
which CNF-transforms into
∀ x. (p x ∨ q)
My approach is the following: When converting a proof step from Proof to LLProof, you implemented a mechanism to ground all formulas using fresh skolem constants. I switched that off completely. Instead, I go through all subterms of the step's conjecture that start with a quantifier and add a corresponding instance of the axiom of choice as a premise to the proof step. In the example that would be
(p sk ∨ q) ⇒ (∀ x. p x ∨ q)
Then, I go through all subterms of the step's premises that start with a quantifier and add a corresponding instance of the axiom of instantiation as a premise to the proof step. In the example that would be
(∀ x. p x) ⇒ p sk
Using the variable names, I can identify which skolem to use. With these additional premises, the ground prover can prove the step correct.
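In schema form (just generalizing the example above), for a quantified subterm ∀ x. φ and the skolem constant sk tied to it via the variable name, the two extra premises are:
φ[sk/x] ⇒ (∀ x. φ)   (choice, for quantifiers in the conclusion)
(∀ x. φ) ⇒ φ[sk/x]   (instantiation, for quantifiers in the premises)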
Now I am tempted to use this approach not only in CNF steps, but everywhere. I would remove your implementation of grounding with fresh skolems completely and replace it by this more general approach. What do you think? It's a trade-off between the amount of trusted code and efficiency.
2019-08-06 18:12:49 (GMT)
I'd rather not disable it for other steps, intuitively, for now. First, I'm not sure you can rely on the same trick of using the variable name in Term.t-based steps (where variables are nameless). Second, efficiency matters, we're not trying to have as small a kernel as possible imho [1].
A side note for efficiency: for some problems the current checker is a bottleneck, as I already mentioned. For example, see examples/data/mult_300.zf. I used the profiling script with perf and noticed that a lot of the overhead seems to be in term conversions (going from large terms to large TypedSTerm.t to large LLTerm.t). It might be very interesting to cut there and convert directly to LLTerm.t without the intermediate step, once the checking of CNF and HO is complete.
[1] rather, if you want more trust, the checker should have the capability to output Isabelle/Coq/Dedukti traces that it reconstructs from its lower level, more precise proofs (Proof.t is far too high level anyway).
2019-08-06 18:39:19 (GMT)
Note: I might have missed some part of your idea. If you think this mechanism of instantiating choice could also work for normal steps, I'm open to hear it. It'd be particularly interesting, performance-wise, if it could occur after the step is a LLTerm.t (i.e. first convert, then instantiate DB vars with skolems).
2019-08-06 18:40:53 (GMT)
Other note: I have a branch with sidekick replacing the LLProver. It's faster but there's still a lot of overhead on mult_300.zf because of term conversion. I think it should be the way to go eventually as it would handle theories, boolean subterms, etc. properly.
Otherwise I can help to design and write a less naive Tableau+CC prover when the time comes.
2019-08-07 10:14:18 (GMT)
I'm not sure you can rely on the same trick of using the variable name in Term.t-based steps (where variables are nameless)
The free variables (i.e. implicitly universally quantified) of a clause have names. Your current implementation also uses those names to know which variables to ground with the same skolem symbol.
It'd be particularly interesting, performance-wise, if it could occur after the step is a LLTerm.t (i.e. first convert, then instantiate DB vars with skolems).
Why would that be better?
Other note: I have a branch with sidekick replacing the LLProver. It's faster but there's still a lot of overhead on mult_300.zf because of term conversion. I think it should be the way to go eventually as it would handle theories, boolean subterms, etc. properly.
Otherwise I can help to design and write a less naive Tableau+CC prover when the time comes.
Very cool! Maybe I should switch to that branch immediately? Would you then want to handle the instantiations/grounding as a theory, too?
2019-08-07 12:40:34 (GMT)
BTW, what is local_intros for? I don't see it used anywhere?
2019-08-07 22:43:09 (GMT)
BTW, what is local_intros for? I don't see it used anywhere?
That's a good question. It might be a remnant of some previous version of the code. Try to delete it and see if zipperposition --check returns the same results :grinning:
2019-08-07 23:04:25 (GMT)
Converting to LLTerm.t before instantiating would avoid instantiating twice (with and without the substitution) I think. Might be a detail.
2019-08-07 23:05:02 (GMT)
Not sure sidekick is particularly relevant for instantiation steps. You can try it whenever you want, it only modifies dune and the LLProver part.
2019-08-14 11:02:25 (GMT)
I have basic CNF checking now (advanced features like definition unfolding don't work yet). I've also just tried sidekick. Unfortunately, sidekick does not do the ∃ to ¬∀¬ conversion that we had previously. @Simon Cruanes Any tips on how to add that?
2019-08-14 11:24:01 (GMT)
This works, but is that how it's supposed to be done? I added in LLProver.ml, in cc_view:
| T.Bind {body;binder=Binder.Exists;ty_var} -> cc_view (T.Form.not_ (T.bind ~ty:T.bool Binder.Forall ~ty_var (T.Form.not_ body)))
2019-08-14 13:34:24 (GMT)
I think this could be done in the conversion to LLTerm.t instead. We don't really need to distinguish exists and forall in the low level proof terms (a more canonical form should do).
2019-08-14 14:50:04 (GMT)
(I mean, converting an ∃ to LLTerm.t would directly give the ∀ version + 2 negations)
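Something like this toy sketch, with a made-up term type (not the real TypedSTerm.t/LLTerm.t conversion, just the idea):

type term =
  | Const of string
  | App of term * term
  | Not of term
  | Forall of string * term
  | Exists of string * term  (* only occurs on the source side *)

(* during conversion, ∃x. φ becomes ¬(∀x. ¬φ), so the low-level terms
   only ever contain ∀ *)
let rec to_ll (t : term) : term =
  match t with
  | Const _ -> t
  | App (f, a) -> App (to_ll f, to_ll a)
  | Not u -> Not (to_ll u)
  | Forall (x, body) -> Forall (x, to_ll body)
  | Exists (x, body) -> Not (Forall (x, Not (to_ll body)))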
2019-08-14 20:10:33 (GMT)
Yes, I prefer that, too. Done!
2019-08-15 13:59:45 (GMT)
Sidekick seems to be able to beta-reduce and also to rewrite under quantifiers, but it does not beta-reduce under quantifiers. How can I change that?
2019-08-15 14:16:24 (GMT)
I think I understand how rewriting under binders is possible, but how does beta-reduction work?
2019-08-15 14:45:56 (GMT)
I believe the β-reduction is done as part of the translation to LLTerm.t (or possibly as part of the instantiation). To get something truly robust we need a theory of β-reduction _inside_ the SMT solver, which would work modulo congruence closure (ie. from f = λx. u and f a it would add f a = u[a/x]).
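A rough sketch of that rule, with a toy term type and invented helper names (not sidekick's actual API; the real thing would hook into the CC):

type term =
  | Var of string
  | Const of string
  | App of term * term
  | Lam of string * term

(* naive, capture-unaware substitution; good enough when the argument
   is a skolem constant *)
let rec subst x by t =
  match t with
  | Var y -> if y = x then by else t
  | Const _ -> t
  | App (f, a) -> App (subst x by f, subst x by a)
  | Lam (y, u) -> if y = x then t else Lam (y, subst x by u)

(* [repr] stands for the congruence-closure representative of a term;
   whenever an application's head is known equal to a lambda, emit the
   equation f a = u[a/x] back to the CC *)
let beta_equations ~(repr : term -> term) (t : term) : (term * term) list =
  match t with
  | App (f, a) ->
    (match repr f with
     | Lam (x, u) -> [ (t, subst x a u) ]
     | _ -> [])
  | _ -> []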
2019-08-15 16:24:40 (GMT)
So that would be a second theory besides the boolean theory that you've already added, right?
2019-08-15 16:27:50 (GMT)
Yes indeed. The boolean theory is responsible for turning connectives into clauses. For a proper full checker we'd need something for β-reduction (modulo congruence closure), and possibly LRA/LIA.
2019-08-15 18:44:27 (GMT)
Ah, and datatypes.
An upside to this is that we could also use the resulting SMT for doing AVATAR modulo theories. But I don't suppose you're very interested in that :-)
2019-08-16 09:39:02 (GMT)
The theory of beta-eta-equivalence is not as easy to do as I thought. Apparently, a simplifier that beta-eta-reduces is not enough. Am I right that congruence closure only rewrites in one direction? For example, if (λ x y, f y x) = g and g a b ≠ f b a are given, CC might not find the contradiction because it decided to rewrite (λ x y, f y x) → g and not g → (λ x y, f y x).
2019-08-16 9:42:52 (GMT)
A second problem is that Sidekick currently adds equations of DB indices to the congruence closure. For example, when it encounters the term (λx y, x = y), it adds DB0 = DB1 to the congruence closure.
2019-08-16 13:27:06 (GMT)
A second problem is that Sidekick currently adds equations of DB indices to the congruence closure. For example, when it encounters the term (λx y, x = y), it adds DB0 = DB1 to the congruence closure.
Hmm right, we need to think that through. Typically we could make non-closed terms opaque to the CC, I think.
As for the theory of β-reduction it's indeed not that easy; like a lot of syntactic theories it can be reduced, in theory, to E-matching the axiom (cc @Haniel Barbosa :stuck_out_tongue_wink:) or to writing custom code that does the same. Also see the classic SMT theory of arrays, which is basically that (arrays are restricted functions, select is application). So it's doable but a bit of work.
2019-08-16 13:50:44 (GMT)
boy does @Simon Cruanes love trigger-based instantiation
2019-08-16 13:51:20 (GMT)
(it's a very good way to just have something working rather than a custom procedure, anyways)
2019-08-16 13:51:22 (GMT)
:joy: well sometimes it's a complete decision procedure, right?
2019-08-16 13:51:48 (GMT)
modulo a loooot of engineering challenges, yes :P
2019-08-16 13:52:17 (GMT)
Although it's probably still simpler, right now, to have a small custom theory of arrays, or of β-reduction. So I think we should do that (I need to add actual theories to sidekick anyway).
2019-08-16 13:52:47 (GMT)
theory of β-reduction? I did not follow the long discussion above
2019-08-16 13:53:02 (GMT)
you want to avoid getting rid of it at pre-processing?
2019-08-16 13:53:54 (GMT)
Basically, have sidekick be able to reduce/equate/merge f a to u[a/x] if f = λx. u is in the congruence closure. It's not enough to do it at preprocessing because it might be branch-dependent.
2019-08-16 13:54:45 (GMT)
right
2019-08-16 13:55:13 (GMT)
if you lift the lambda and have a proper instantiation it amounts to the same
2019-08-16 13:55:21 (GMT)
but that is what you were saying...?
2019-08-16 13:55:59 (GMT)
"amounts to the same", efficiency notwithstanding, I mean
2019-08-16 14:20:41 (GMT)
Typically we could make non-closed terms opaque to the CC, I think.
Do you mean this?
let cc_view (t:T.t) : _ V.t =
  match T.view t with
  [...]
  | T.Bind {body;binder;ty_var} -> V.Opaque t
That would indeed affect only a few rules, for which we might want to refer to the extensionality axiom anyway.
Alternatively, we could modify LLTerm.equals to make all Variables artificially different, even if they are identical. What do you think?
2019-08-16 14:30:12 (GMT)
@Alexander Bentkamp that's indeed what I mean. Congruence closure will not enter an Opaque term, it's like a constant (ie. only syntactic equality applies).
2019-08-16 14:32:48 (GMT)
But wait, how can the equality DB0 = DB1 end up in CC at all?
2019-08-16 14:33:25 (GMT)
Just because there is a term (λx y, x = y) doesn't mean that DB0 = DB1, right?
2019-08-16 14:33:47 (GMT)
I'll investigate how that can happen.
2019-08-16 14:42:26 (GMT)
Right, it's indeed weird. I can give you tips on how to debug what happens in sidekick if you want.
2019-08-16 14:45:46 (GMT)
Can I get sidekick's debug output when running zipperpos?
2019-08-16 14:46:14 (GMT)
Currently, I add a theory and use the hooks to get debug output.
2019-08-16 14:53:41 (GMT)
Currently, I add a theory and use the hooks to get debug output.
That's an interesting way of doing it! I hadn't thought about that too much, but it could be useful to have a module in sidekick for doing that more systematically.
Also you can enable sidekick's own debug level (see the executable part of the project, which has a debug option or something).
2019-08-19 08:45:13 (GMT)
@Simon Cruanes I think I found out why the equality DB0 = DB1 is added to the CC. The function add_new_term_ is called on the term DB0 = DB1, which I think is correct. But then compute_sig0 determines that it is an equation and in the end push_pending is called to queue the equality DB0 = DB1 to be added to the CC. Why is compute_sig0 there at all? Shouldn't this be the task of the Boolean theory?
2019-08-19 12:05:36 (GMT)
I have a theory for beta-equivalence now. I used triggers as we discussed. I am not sure if I used the "explanation" correctly though. I guess the explanation why I want to merge f a with g a a should be Expl.mk_merge "f" "λx. g x x", right?
2019-08-19 13:41:20 (GMT)
About compute_sig0: it's because equality is special to the congruence closure. But that's why we should probably wrap any non closed term in Opaque anyway. The content of Avatar atoms or lambda terms shouldn't matter too much to the CC, right?
Or we could flag some equalities as inactive, I don't know.
Re the beta-equivalence: awesome. Is it custom to LLTerm.t? The explanation should be correct, indeed, you just have to say "if f = λx. u then f a = u[a/x]".
2019-08-19 14:42:48 (GMT)
Hm, I still don't understand why compute_sig0 is sound, even if we make non-closed terms opaque. I haven't tested this, but wouldn't a term like f (c = d) also cause c and d to be merged in the CC?
2019-08-19 14:45:59 (GMT)
No, adding the term a = b doesn't merge a and b. It's when you assert it to be true (as a literal) that something happens.
2019-08-19 14:48:58 (GMT)
Is it custom to LLTerm.t?
Yes, sidekick does not know about lambdas, so it needs to be custom to LLTerm.
It's here, if you'd like to see it: https://github.com/benti/zipperposition/blob/esa_proofs_sidekick/src/proofs/LLProver.ml#L156
2019-08-19 14:52:20 (GMT)
I'll look, thanks. If you look at sidekick's codebase, you'll see that theories are functors over a module that defines terms, and a theory-specific view on these terms.
If you're ok with it I might port your theory into a functor :)
2019-08-19 14:53:48 (GMT)
sure
2019-08-19 14:56:47 (GMT)
No, adding the term a = b doesn't merge a and b. It's when you assert it to be true (as a literal) that something happens.
I think the function add_new_term_ accidentally asserts things. The function add_new_term_ should only add the term to the CC and not assert anything, right? But still, it calls push_pending, which asserts things, if I understand it correctly.
2019-08-19 14:58:39 (GMT)
That's weird that it behaves likes that then. It means propagation is meaningless (since things are true already)‽ If you could find a small repro you could file an issue somewhere :-)
Or a PR to sidekick, I'd love that. I can give you tips on how to work on sidekick (not too different from zipperposition).
2019-08-19 15:01:21 (GMT)
My fix would be to remove compute_sig0 :D
2019-08-19 15:02:19 (GMT)
Or maybe it should be called when a literal is asserted. I will have a look.
2019-08-19 15:03:24 (GMT)
I think compute_sig0 is fine, and that it's add_term_rec_ that shouldn't assert anything. It's in assert_lit and the likes that merges should happen.
2019-08-19 15:06:14 (GMT)
But assert_lit already looks for equalities and merges them.
2019-08-19 15:10:33 (GMT)
Which is why it's likely a bug if equalities lead to merges anywhere else (and certainly it shouldn't happen in compute_sig0)
2019-08-19 15:33:29 (GMT)
so what are these signatures for?
2019-08-19 15:36:28 (GMT)
The equality signature is used to propagate a=b whenever the classes of a and b are merged. It's also the main mechanism for conflicts: if you assert a ≠ b, it's actually (a=b) ~ false; then if a and b are merged, a=b and true are merged, so true ~ false, which has a builtin case for triggering a conflict.
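Concretely: asserting a ≠ b stores (a=b) ~ false; if you then assert a = c and c = b, the classes of a and b merge, the equality signature propagates (a=b) ~ true, and true ~ false triggers the conflict.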
2019-08-20 13:14:20 (GMT)
I know now why literals like DB0 = DB1 show up. They are added as atoms to the SAT solver in Sidekick_msat_solver.ml/add_bool_subterms_.
2019-08-20 13:15:33 (GMT)
Ah, that's it indeed. We need to modify that so it only does it outside of binders.
2019-08-20 13:15:36 (GMT)
We should probably make all non-ground terms opaque. But maybe we could add a theory that can handle non-ground terms?
2019-08-20 13:16:34 (GMT)
We need to modify that so it only does it outside of binders.
That would be nice, but this part of the code does not know about binders.
2019-08-20 13:20:11 (GMT)
Good point. Maybe I need to rethink slightly the add_bool_subterms so it can take a function to tell it to stop and not enter a subterm (when the subterm isn't closed).
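Something like this toy version of the traversal (invented types, not the actual add_bool_subterms_):

type term =
  | DB of int              (* de Bruijn variable *)
  | Const of string
  | App of term * term
  | Bind of term           (* any binder; the body refers to DB 0 *)

(* closed = every DB index is bound by an enclosing binder *)
let is_closed (t : term) : bool =
  let rec go depth = function
    | DB i -> i < depth
    | Const _ -> true
    | App (f, a) -> go depth f && go depth a
    | Bind body -> go (depth + 1) body
  in
  go 0 t

(* collect subterms, but never enter a subterm rejected by [enter] *)
let rec subterms ~(enter : term -> bool) (t : term) : term list =
  if not (enter t) then []
  else
    t ::
    (match t with
     | DB _ | Const _ -> []
     | App (f, a) -> subterms ~enter f @ subterms ~enter a
     | Bind body -> subterms ~enter body)

(* only closed subterms would become candidate boolean atoms *)
let candidate_atoms t = subterms ~enter:is_closed t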
2019-08-21 10:47:31 (GMT)
What part of the code is shared between Zipperposition and its proof checker? How independent is it? What kind of bugs would not be detected?
2019-08-21 13:05:31 (GMT)
The proof checker is relatively independent (that's why I decided to have a separate term representation, LLTerm.t, rather than reusing InnerTerm.t). Of course it's not 100% autonomous; possible hidden bugs could include:
- term translation bugs
- bugs in Id.t
- parsing bugs
2019-08-21 14:40:34 (GMT)
What part of the code is shared between Zipperposition and its proof checker?
Are you asking how trustworthy the checker is, @Alexander Bentkamp ? :slight_smile: I think it should be good for catching most of the bugs. If you want real independently checked certificates (which will probably take time) I think llprover could output some LFSC/dedukti/coq/… from its reconstructed proof, there's no other simple way.
2019-08-21 14:44:47 (GMT)
I am just trying to be prepared for the questions people will ask me :D