2020-08-14 13:08:19 (GMT)
I'm trying to make sense of the enumerative QI paper by Reynolds et al., and one aspect that's not clear to me is skolemization and clausification. Is it the case that CVC4 and veriT first put a formula in Skolem normal form (as hinted by the comment on p. 2) and then apply miniscoping (not mentioned in the paper) to move the quantifiers as far in as possible and get a split? I think my problem stems from the fact that the authors, on p. 2, assume familiarity with the notion of "quantified formula", but their notion seems to be different from mine, and their substitution operators seem to magically remove the leading foralls. Anyway, any reference or explanation that would help me understand what actually happens in theory and in practice (so I can make sure to model it faithfully) would be greatly appreciated.
2020-08-14 13:9:42 (GMT)
@Pascal Fontaine maybe you can help me too. ;)
2020-08-14 13:22:53 (GMT)
so that's the next bookclub, heh? :)
2020-08-14 13:25:09 (GMT)
I don't quite understand what you are not understanding. :)
Skolemization is done a priori, generally only on top-level skolemizable quantifiers
2020-08-14 13:25:30 (GMT)
as in, skolemizable quantifiers that are not in the scope of non-skolemizable quantifiers
2020-08-14 13:26:02 (GMT)
thus the quantified formulas that will later on be instantiated may contain quantifiers in their scope
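A minimal Python sketch of that shallow skolemization, on a toy NNF formula AST (all names here are hypothetical, nothing from the CVC4 or veriT code bases): only existentials not below any universal are skolemized, so each Skolem symbol is a constant, and quantifiers may remain inside the result.

```python
import itertools

# Toy NNF formula AST (hypothetical): formulas are nested tuples,
#   ("and", f, g), ("or", f, g), ("forall", x, f), ("exists", x, f),
#   ("lit", pred, arg, ...) for literals; variables are strings.
_fresh = itertools.count()

def subst(f, x, t):
    """Replace free occurrences of variable x by term t (capture naively ignored)."""
    if f[0] == "lit":
        return ("lit",) + tuple(t if a == x else a for a in f[1:])
    if f[0] in ("and", "or"):
        return (f[0], subst(f[1], x, t), subst(f[2], x, t))
    if f[0] in ("forall", "exists"):
        return f if f[1] == x else (f[0], f[1], subst(f[2], x, t))

def shallow_skolemize(f):
    """Skolemize only existentials not below any universal, so each
    introduced Skolem symbol is a constant, not a function."""
    if f[0] == "exists":
        c = "sk%d" % next(_fresh)          # fresh Skolem constant
        return shallow_skolemize(subst(f[2], f[1], c))
    if f[0] in ("and", "or"):
        return (f[0], shallow_skolemize(f[1]), shallow_skolemize(f[2]))
    return f                                # stop at foralls and literals

# exists y. (p(y) and forall x. exists z. q(x, z))
f = ("exists", "y", ("and", ("lit", "p", "y"),
                     ("forall", "x", ("exists", "z", ("lit", "q", "x", "z")))))
g = shallow_skolemize(f)
print(g)
# the outer exists becomes a constant sk0; the inner exists, being under
# the forall, is left untouched inside the quantified formula
```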
2020-08-14 13:26:55 (GMT)
To put it simply, I don't understand anything.
2020-08-14 13:27:17 (GMT)
I don't understand fully what's the initial format, what transformations are applied in what order, and so on.
2020-08-14 13:27:22 (GMT)
Your answer helps a bit.
2020-08-14 13:28:08 (GMT)
But seeing how you apply to , which is "a quantified formula" (could it be ??), that doesn't reassure me. :S I have to infer what you mean.
2020-08-14 13:28:14 (GMT)
What I said is how both CVC4 and veriT handle quantifiers. IIRC in the paper we assume that the formula is preprocessed so that prenexing/miniscoping has been done
2020-08-14 13:28:39 (GMT)
The paper never even mentions the phrase miniscoping. You see why I'm confused?
2020-08-14 13:28:49 (GMT)
miniscoping is an optimization
2020-08-14 13:28:53 (GMT)
prenexing suffices
2020-08-14 13:29:7 (GMT)
Sure, but then everything is universal and you don't need G, just Q.
2020-08-14 13:29:27 (GMT)
Or E, whatever. ;)
2020-08-14 13:29:29 (GMT)
true :)
2020-08-14 13:29:47 (GMT)
let me look what we wrote :)
2020-08-14 13:29:49 (GMT)
I'm trying to understand and model a realistic SMT solver, not something that merely works in principle.
2020-08-14 13:30:34 (GMT)
Seriously, in the real CVC4 or veriT:
2020-08-14 13:30:50 (GMT)
If I have ,
2020-08-14 13:31:23 (GMT)
Does that immediately get massaged in such a way that Skolems are introduced and the forall is on top?
2020-08-14 13:32:30 (GMT)
I'm not sure what CVC4 will do. What veriT will do: it will eliminate the iff operator so that the negative occurrence of the forall can be skolemized
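To make the polarity point concrete, here is a hedged worked example (the formula is mine, not from the paper). Take a formula where a forall occurs under an iff:

```
(forall x. p(x)) <-> q
==> ((forall x. p(x)) -> q) /\ (q -> forall x. p(x))     [iff elimination]
==> (p(c) -> q)            /\ (q -> forall x. p(x))      [skolemization]
```

In the first conjunct the forall occurs negatively, i.e. it behaves like exists x. (p(x) -> q), so it can be skolemized with a fresh constant c; the positive occurrence in the second conjunct is left for instantiation.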
2020-08-14 13:32:33 (GMT)
And in Theorem 3, when you write "finite sets of instances of Q", you mean ground instances, right?
2020-08-14 13:32:39 (GMT)
yes
2020-08-14 13:32:53 (GMT)
In the paper we say:
"We assume that all quantified formulas in ψ are of the form ∀x̄. φ with φ quantifier-free. This can be achieved by prenex form transformation and Skolemization."
2020-08-14 13:33:01 (GMT)
OK, so there's still a huge disconnect between the paper and the implementation.
2020-08-14 13:33:35 (GMT)
Where is that written?
2020-08-14 13:33:43 (GMT)
the difference between that and what happens in practice is that \phi is not quantifier-free, because skolemization is done only when it would generate constant (nullary) Skolem functions, and because prenexing is not generally done
2020-08-14 13:33:56 (GMT)
I'm starting to wonder if the version I got from the Matryoshka web site is up to date :S
2020-08-14 13:34:05 (GMT)
that is not written because it does not matter to the scope of the paper :)
2020-08-14 13:34:38 (GMT)
I can't parse your last 3 sentences.
2020-08-14 13:34:41 (GMT)
(funny enough the one question about this paper when I presented it at TACAS was exactly about this...)
2020-08-14 13:34:51 (GMT)
I'm not surprised.
2020-08-14 13:34:54 (GMT)
hahaha
2020-08-14 13:34:57 (GMT)
I don't get it, really
2020-08-14 13:35:10 (GMT)
You quoted from your own paper. I cannot find the quote.
2020-08-14 13:35:14 (GMT)
On which page is it?
2020-08-14 13:35:19 (GMT)
But I'd like you to understand this, so please let me help
2020-08-14 13:35:23 (GMT)
I quoted from page 3
2020-08-14 13:35:40 (GMT)
in the "Instantiation-based SMT solvers"
2020-08-14 13:35:54 (GMT)
Ah, I'm reading "Revisiting Enumerative Instantiation".
2020-08-14 13:36:11 (GMT)
So am I. That is the name of the subsection
2020-08-14 13:36:13 (GMT)
I thought that much was clear from the title of this thread. :S
2020-08-14 13:36:38 (GMT)
As I said, I think I'm reading an obsolete (e.g. submitted) version.
2020-08-14 13:36:52 (GMT)
I repeat my claim that the version I got from the Matryoshka web site is probably not up to date.
2020-08-14 13:37:57 (GMT)
Indeed, that was it. Thanks for clearing that up.
2020-08-14 13:38:15 (GMT)
Note to self to update the PDF next week (after migration to the new web site).
2020-08-14 13:39:32 (GMT)
OK, so you do agree there's a quite nasty gap between the theory & implem.
2020-08-14 13:39:49 (GMT)
I.e. even if the theory is complete and we strongly suspect the implem is, there's no easy way to bridge the two.
2020-08-14 13:40:21 (GMT)
you mean "complete" as in the completeness of FOL?
2020-08-14 13:40:29 (GMT)
Yes, refutationally complete.
2020-08-14 13:40:39 (GMT)
OK. I do not think the gap is nasty
2020-08-14 13:40:42 (GMT)
An SMT+QI solver is basically a first-order prover.
2020-08-14 13:40:44 (GMT)
Yes, there is a gap
2020-08-14 13:40:45 (GMT)
Why not?
2020-08-14 13:40:50 (GMT)
How would you fill it?
2020-08-14 13:41:02 (GMT)
because one can prove (nobody has) that the gap can be bridged
2020-08-14 13:41:9 (GMT)
what the implementation is doing
2020-08-14 13:41:18 (GMT)
Yes but the implementation might be incomplete.
2020-08-14 13:41:24 (GMT)
This just begs the question.
2020-08-14 13:41:30 (GMT)
is equivalent to what the paper assumes. But it is done incrementally, as the formulas are instantiated
2020-08-14 13:41:47 (GMT)
Yes, but that changes everything.
2020-08-14 13:41:56 (GMT)
That is, of course, assuming the implementation is applying the technique as described in the paper. CVC4 and veriT do
2020-08-14 13:41:58 (GMT)
It changes the order in which the models could be generated, etc.
2020-08-14 13:42:10 (GMT)
(veriT kinda actually, there are some issues)
2020-08-14 13:42:18 (GMT)
Exactly.
2020-08-14 13:42:32 (GMT)
My point is, you haven't proved a thing until you've proved a thing.
2020-08-14 13:42:42 (GMT)
I agree with you. I'm conjecturing
2020-08-14 13:42:48 (GMT)
I'm confident enough to not care though
2020-08-14 13:42:55 (GMT)
As I explained to Pascal last time, it's not even clear a priori that enum. QI + SMT is refut. complete.
2020-08-14 13:43:18 (GMT)
Theorem 3 in this paper, the main one, is something that nobody was surprised to see, because everybody already believed it
2020-08-14 13:43:19 (GMT)
I am convinced that it is, and one of the reasons for this is related, obscurely, to the existence of limit points in topological spaces.
2020-08-14 13:43:30 (GMT)
but I think it's quite nice it exists
2020-08-14 13:43:47 (GMT)
But one could in principle imagine a scenario where the ground SMT solver and QI spin forever.
2020-08-14 13:43:59 (GMT)
And Herbrand couldn't help because new formulas and models would keep appearing.
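A toy sketch of the loop in question (everything here is hypothetical and vastly simplified; the "ground solver" is just a unit-literal conflict check, nothing like a real SMT core): ground checking alternates with instantiating the universals by enumerated Herbrand terms, and termination on unsat is exactly the property under discussion.

```python
from itertools import count

# Ground literals are (sign, atom) pairs; atoms and terms are nested tuples.

def terms(consts, funs, depth):
    """Enumerate Herbrand terms up to the given nesting depth."""
    layer = list(consts)
    for _ in range(depth):
        layer = layer + [(f, t) for f in funs for t in layer]
    return layer

def run(ground, quantified, consts, funs, max_depth=3):
    """Alternate a trivial ground check with enumerative instantiation.
    `quantified` is a list of functions mapping a term to a unit literal
    (each standing for one universally quantified formula)."""
    lits = set(ground)
    for d in count(0):
        if d > max_depth:
            return "unknown"          # in general the loop may spin forever
        # instantiate every universal with every term enumerated so far
        for inst in quantified:
            for t in terms(consts, funs, d):
                lits.add(inst(t))
        # trivial "ground solver": conflict iff some literal occurs both ways
        if any((not s, a) in lits for (s, a) in lits):
            return "unsat"

# ground fact p(f(a)); quantified axiom: forall x. not p(f(x))
result = run(ground={(True, ("p", ("f", "a")))},
             quantified=[lambda t: (False, ("p", ("f", t)))],
             consts=["a"], funs=["f"])
print(result)  # the instance with t = "a" conflicts with the ground fact
```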
2020-08-14 13:44:11 (GMT)
in pure equational FOL?
2020-08-14 13:44:13 (GMT)
E.g. if you instantiate a Bool with a formula.
2020-08-14 13:44:30 (GMT)
All of this is "a priori". I'm going to prove that it cannot happen.
2020-08-14 13:44:38 (GMT)
For some reasonable logics, strategies, etc.
2020-08-14 13:46:01 (GMT)
My point is just, again, until we've proved it, we haven't proved it, and there's a substantial gap between Herbrand++ (from the EQI paper) and actual refutational completeness of a solver based on that, let alone CVC4 and veriT.
2020-08-14 13:46:39 (GMT)
Thanks for your help. This was extremely useful. :)
2020-08-14 13:47:4 (GMT)
Yes, you're not wrong
2020-08-14 13:47:25 (GMT)
the nastiest thing might be how the quantifier instantiation techniques interact
2020-08-14 13:48:46 (GMT)
Hm, I'm not too worried about that, actually. From my point of view, it's always OK to throw more formulas into the mix.
2020-08-14 13:49:11 (GMT)
But what I'm seeing is that this whole topic is too big to be a section or a subsection in a paper on AVATAR. I'll have to come back to it later.
2020-08-14 13:50:30 (GMT)
Jasmin Blanchette said:
Hm, I'm not too worried about that, actually. From my point of view, it's always OK to throw more formulas into the mix.
Ah, right. I was thinking of "not doing one but doing the other", but what actually happens in the solver is that if one technique fails, another is applied in a completely independent manner.
2020-08-14 14:23:17 (GMT)
I'm confident enough to not care though
My conversations with Giles and some theoretical incompletenesses of AVATAR led to very similar comments. ;)
2020-08-14 14:23:29 (GMT)
I don't mind if people don't care. I do care, though. ;)
2020-08-14 14:24:12 (GMT)
These issues become increasingly important as the techniques get generalized (e.g. to HOL), adapted, or reimplemented.
2020-08-14 14:24:31 (GMT)
What I'm trying to say is, one can see it as a matter of self-respect to care. :)
2020-08-14 14:28:20 (GMT)
hahaha yeah, I get it. I completely understand other people caring and I'd be happy to see such results, as I am with the theoretical part of this EQI paper.
2020-08-14 14:28:55 (GMT)
Does that mean authors of SMT solvers (or similar systems) who are writing papers about their methods should be more precise about the gap between theoretical presentation and implementation and prove that the results for the theoretical method can be lifted to the practical implementation?
2020-08-14 14:29:10 (GMT)
I just don't believe much in proving the exact behavior of implementations correct. I like such results modulo nice assumptions. For the nastiest stuff I'm happy with checking certificates from the systems. :)
2020-08-14 14:29:34 (GMT)
There are two types of gaps.
2020-08-14 14:29:45 (GMT)
You have those that can be closed by refinement and those that can't.
2020-08-14 14:30:17 (GMT)
The first type is fine. You're bound to have some gap between a 15-page paper and a 30 000 line implementation.
2020-08-14 14:30:31 (GMT)
It's the second type of gap that I find worrying.
2020-08-14 14:31:5 (GMT)
Basically, when the theory doesn't quite capture accurately what happens in practice and couldn't be used to prove the implementation correct.
2020-08-14 14:31:51 (GMT)
Distinguishing the two types of gaps isn't trivial. Sometimes, a clever idea (taking the form of a sophisticated refinement relation) can be used to bridge the gap, unexpectedly.
2020-08-14 14:33:2 (GMT)
Looking at EQI, one would be tempted to generalize the theory (the Herbrand theorem), so that it also works with nonground terms.
2020-08-14 14:33:15 (GMT)
But maybe the general case can be reduced to the ground case.
2020-08-14 14:33:50 (GMT)
(By "refinement", above, I meant specifically "step-wise refinement", a la @Mathias Fleury 's PhD thesis.)
2020-08-14 14:35:35 (GMT)
Sometimes the gap is just that the theory considers a smaller set of connectives. In theory, there is a gap, and one must generalize the theory, but in practice, this step is accepted "by analogy" by most people (but, mind you, not by proof assistants). I don't see this as problematic; if I did, I'd have to reject the majority of math and TCS papers.
2020-08-14 14:37:20 (GMT)
@Haniel Barbosa you might find the following review extract (fresh off the press) funny/sad:
A first big point concerns the reliability of this work. As things get more
and more difficult and involved, how can you guarantee that no mistake
has been made? Or at least, can you try to provide evidence or partial
support in this direction?
This issue can be ignored to some extent for clearcut basic algorithms
-- this was in particular the foundation of the LCF architecture by
Milner in the 1970s. But it can be less and less ignored as things get
more complicated. This is why even the Fields Medalist Voevodsky, or at
a modest level, researchers submitting their work to conferences such
as POPL or LICS formalize their work using a well-known proof assistant
such as Agda, Coq or Isabelle. Sure, it involves hard work on top
of the already hard conceptual work performed.
My first suggestion would then be to provide a formalization of the work in a
reliable proof assistant. This formalization should contain clear statements,
instructions for replaying the proof and check that no hypothesis or axiom is
left under the carpet. This way many people can get convinced that
the theorems hold without needing to follow all technical details.
As a compensation, a high-level description of what is going on is enough.
If this is out of reach in a reasonable amount of time, then the authors could
at least explain what their TCB (trusted code base) is and prepare a roadmap
to reduce it.
2020-08-14 14:37:37 (GMT)
The context is E.
2020-08-14 14:38:15 (GMT)
I think I see what you're getting at
2020-08-14 14:42:14 (GMT)
@Jasmin Blanchette I find it naive...
2020-08-14 14:43:2 (GMT)
And apparently experimental evaluation was not taken into account at all in the confidence part for the results?
2020-08-14 14:58:01 (GMT)
The reviewer was a self-confessed ITP-Mensch. They read the paper with their ITP goggles on, which means a focus on 100.0000% certainty.
2020-08-14 14:58:27 (GMT)
The answer is easy: We'll sketch both approaches (verifying vs. certificates) and discuss the merits of both.
2020-08-14 14:58:43 (GMT)
Regardless, so that the formalizers have work to do, somebody's got to develop the theory in the first place. ;)
2020-08-14 15:43:29 (GMT)
isn't the answer "we'll have a proof checker" anyway? :upside_down:
2020-08-14 15:47:56 (GMT)
Well, that won't help you with completeness.
2020-08-14 15:55:44 (GMT)
Indeed not. It depends on what the goal is. But personally, when it comes to automatic theorem provers, in practice I'm mostly concerned about them being correct.
2020-08-14 16:00:21 (GMT)
I think they're worried about soundness. When I wrote "sketch both approaches", I meant we'll explain that a proof checker is the way to go in practice (cf. Heule).
2020-08-14 16:1:26 (GMT)
Completeness is still not to be disregarded in practice -- for various decidable fragments. But even there, you can have independent checkers of saturations.
2020-08-14 16:5:12 (GMT)
Culturally, completeness seems to matter. Most of the time one starts with a calculus that one argues is complete and then might restrict it to an incomplete one.
2020-08-14 16:43:48 (GMT)
Apart from soundness, completeness is one of the very few theoretical things we can prove about a prover. That's what distinguishes some-program-that-does-some-stuff from a good prover. I guess it has so much focus because there's not much else we can prove and because it's hard.
2020-08-14 16:46:13 (GMT)
But we still only prove it modulo a million simplifying assumptions
2020-08-14 16:46:30 (GMT)
some more serious than others, as we discussed :)
2020-08-15 13:48:54 (GMT)