2020-05-15 14:59:2 (GMT)
creating topic
2020-05-15 16:44:25 (GMT)
I think I fixed the problem now without the fallback to the old option. It's in the master branch of verit-rmx. I also merged the la-coefficients branch. I will fix mainline veriT in the next few days.
2020-05-15 17:14:57 (GMT)
Danke!
2020-06-09 16:7:15 (GMT)
I implemented the removal of the implicit equality in the branch no-implicit-equality-order. First tests show now problems, but I want to do some performance regression tests before merging.
2020-06-12 11:6:34 (GMT)
I fixed a bug with this that lead to a performance regression, but I still have some weird bugs that disappear when compiling in debugging mode. Hence I will have to do a bit of debugging before merging. But if you want to play around with this you can use that branch.
2020-07-07 22:23:24 (GMT)
@Mathias Fleury @Hans-Jörg : is the experimental data you used in your PxTP paper available somewhere? I'd like to have a set of benchmarks produced by Isabelle and for which veriT produces proofs that are successfully reconstructed as a reference for the output of veriT proofs in CVC4
2020-07-08 4:17:51 (GMT)
No. I used mirabelle, which does not generate them (at least by default). However, I have to rerun mirabelle soon, so I can also try to generate SMT files if you are interested.
2020-07-08 6:7:53 (GMT)
Indeed, there's a 'keep' option or something in Mirabelle you can use to generate the SMT-LIB files.
2020-07-08 6:11:40 (GMT)
Thanks!
2020-07-08 12:46:7 (GMT)
That'd be great, thanks @Mathias Fleury !
2020-08-31 8:57:38 (GMT)
$ grep "veriT_smt" ../mirabelle_{cvc4,z3}_HOLLibrary/*.log | grep -v "timed out" | grep -v "failed" | wc -l
243
$ grep "[^_]smt" ../mirabelle_{cvc4,z3}_HOLLibrary/*.log | grep -v "timed out" | grep -v "failed" | grep -v sledgehammer | wc -l
939
2020-08-31 8:58:11 (GMT)
25.9% veriT :P
2020-08-31 8:59:29 (GMT)
$ grep "veriT_smt (best)" ../mirabelle_{cvc4,z3}_HOLLibrary/*.log | grep -v "timed out" | grep -v "failed" | grep -v sledgehammer | wc -l
35
$ grep "veriT_smt (default)" ../mirabelle_{cvc4,z3}_HOLLibrary/*.log | grep -v "timed out" | grep -v "failed" | grep -v sledgehammer | wc -l
73
$ grep "veriT_smt (ccfv_SIG)" ../mirabelle_{cvc4,z3}_HOLLibrary/*.log | grep -v "timed out" | grep -v "failed" | grep -v sledgehammer | wc -l
25
$ grep "veriT_smt (ccfv_threshold)" ../mirabelle_{cvc4,z3}_HOLLibrary/*.log | grep -v "timed out" | grep -v "failed" | grep -v sledgehammer | wc -l
58
$ grep "veriT_smt (del_insts)" ../mirabelle_{cvc4,z3}_HOLLibrary/*.log | grep -v "timed out" | grep -v "failed" | grep -v sledgehammer | wc -l
52
2020-09-15 19:4:46 (GMT)
@Jasmin Blanchette Do you know why Sascha never used the coefficients given by Z3 to improve reconstruction?
2020-09-15 19:15:44 (GMT)
Laziness? ;)
2020-09-15 19:21:49 (GMT)
Or he got scared when he saw the code of linarith.
2020-09-15 19:24:17 (GMT)
Trying to replace my code by that is a very interesting experience.
2020-09-15 19:24:59 (GMT)
I already learnt more than I ever wanted to know about Drule.implies_intr_hyps (and it is still not working)
2020-09-15 19:29:54 (GMT)
2020-10-06 16:53:47 (GMT)
@Jasmin Blanchette @Pascal Fontaine @Hans-Jörg @Martin Desharnais
Jasmin had the good point that veriT_smt is a bad name for an Isabelle tactic because it is too long to type.
To continue in the Isabelle tradition of naming things after rivers, I like the namelethe (the mythological river [Léthé in French] https://en.wikipedia.org/wiki/Lethe whose name appears in the greek word meaning verity), but the veriT branding gets lost in the name.
Does anyone have a better idea?
For context, shorter name of other tactics (the one that come to my mind): try, smt , auto, simp, best, fast, try0 ; equal: force, metis, arith; longer names: fastforce, sledgehammer, presburger, linarith.
2020-10-06 17:15:28 (GMT)
There's also moura. How about fontaine?
2020-10-06 17:16:0 (GMT)
Like Jean de La Fontaine, of course.
2020-10-06 17:18:52 (GMT)
So it's really going to be a separate tactic, not smt (verit) or something? Because I thought a priori smt can handle multiple SMT solvers. It's just that the syntax for switching is currently too cumbersome.
2020-10-06 17:19:19 (GMT)
My main objection to veriT_smt was in the paper. I didn't know if it was a code name or the real thing.
2020-10-06 17:19:47 (GMT)
The tactic is only a shortcut for the switching syntax
2020-10-06 17:19:49 (GMT)
For a code name, it's not so much length as the redundancy (veriT is an SMT solver, why add _smt).
2020-10-06 17:20:32 (GMT)
Ah, I see. Then, to be honest, I'd go with smt_veriT or, even better, just have a lightweight switching syntax smt (verit) or smt (veriT).
2020-10-06 17:20:37 (GMT)
I don't know if I like the smt (verit) (best) syntax
2020-10-06 17:20:57 (GMT)
I mean by (smt (verit) foo bar). Like metis etc.
2020-10-06 17:21:17 (GMT)
The problem with lethe etc. is that it adds more aliases that people have to learn. That's almost always bad.
2020-10-06 17:21:21 (GMT)
Yeah, I meant for switching the strategy used by veriT
2020-10-06 17:21:58 (GMT)
Oh, you can write by (smt (verit, best) one_plus_one_eq_zero). Again like metis.
2020-10-06 17:22:13 (GMT)
smt would just take a list of options, given in any order.
2020-10-06 17:22:44 (GMT)
I'm not saying the metis syntax is perfect, but it works and it's been working for years and people have never complained
2020-10-06 17:23:17 (GMT)
except Tobias once, but he complained more about the fact that sometimes the options given by Sledgehammer are unnecessarily complicated than about the syntax per se.
2020-10-06 17:24:2 (GMT)
However, in your paper, you can use a code name. For an example, corec etc. is called AmiCo in our paper "Friends with Benefits".
2020-10-06 17:24:5 (GMT)
Right I will change the smt tactic then
2020-10-06 17:24:21 (GMT)
Cool. And you have some weeks to think about the paper.
2020-10-06 17:24:45 (GMT)
Lethe is cool. Or any other name related to water. ;)
2020-10-06 17:24:50 (GMT)
just out of curiosity: how often do people type "smt" by hand vs. it being suggested by Sledgehammer?
2020-10-06 17:25:3 (GMT)
1 vs 999.
2020-10-06 17:25:13 (GMT)
Possibly with a fourth 9. ;)
2020-10-06 17:26:56 (GMT)
The paper "Extending Sledgehammer with SMT" is one of my most cited. It took about three weeks to do the research -- mostly the evaluation. Suddenly, people started using smt. Nobody (or hardly anybody) was doing that before.
2020-10-06 17:31:58 (GMT)
I like the smt + options solution. At least at one place smt will no longer be synonymous with z3 :P
2020-10-06 17:34:34 (GMT)
but Lethe is a really cool name @Mathias Fleury we should find something to name Lethe
2020-10-07 7:59:2 (GMT)
Hi, I do not feel strongly about the name, so feel free to do what you want. I do not dislike credits, but it is not mandatory that veriT appears clearly in the tactic name, or cryptically ;-) . I know this is a joke but to be sure: do not call the tactic fontaine. Among many arguments against this, veriT is the work of several people.
Thanks!
2020-10-07 8:33:37 (GMT)
You never know if it's a joke or not. There is a tactic called moura in Isabelle, no joke. :)
2020-10-17 13:55:36 (GMT)
2020-10-17 14:7:27 (GMT)
I can't wait for everything breaking during the RC phase :P
2020-10-17 15:4:35 (GMT)
The next step, I guess, is for @Martin Desharnais to enable veriT by default in Sledgehammer for people with enough cores (to find proofs) and for reconstruction?
2020-10-17 15:6:8 (GMT)
@Hans-Jörg From what I understand, your work lives in a branch of veriT. What is your and Pascal's plan for integrating it into the main branch? Because I'm hoping the plan is not to let your code "rot" in some nonstandard branch forever. :smiley_cat:
2020-10-17 15:6:25 (GMT)
Anything else that separates us from greatness?
2020-10-17 15:58:2 (GMT)
@Jasmin Blanchette In my understanding Pascal want's to perform some deeper changes to the veriT code base. Essentially fixing some of the rough patches in the code that makes it hard to develop it further. During this my (and Daniels) work would naturally converge into one codebase.
2020-10-17 15:58:20 (GMT)
What's the time frame?
2020-10-17 15:59:25 (GMT)
If this all works, this would be great, but sometimes the better is the enemy of the good.
2020-10-17 16:2:49 (GMT)
See e.g. Joel on Software: https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/
2020-10-17 16:14:28 (GMT)
oh wait just to ensure the context is correct: are you asking about my code in relation to the Isabelle 2021 release, or the overall development arch of veriT?
2020-10-17 16:14:48 (GMT)
Latter.
2020-10-17 16:15:14 (GMT)
In general, I hate code that rots in a branch, like I've seen so many times with SPASS. Ask Pascal.
2020-10-17 16:50:34 (GMT)
Jasmin Blanchette said:
The next step, I guess, is for Martin Desharnais to enable veriT by default in Sledgehammer for people with enough cores (to find proofs) and for reconstruction?
I have some design questions on that:
- SH tests the smt method if smt_proofs is true. Do we add an extra verit_smt_proofs options or do we keep only one?
- Testing only one strategy for veriT or all?
- Maybe scheduling of the strategies on veriT in Sledgehammer?
2020-10-17 17:26:36 (GMT)
2020-10-17 17:27:54 (GMT)
2020-10-17 17:28:49 (GMT)
2020-10-17 17:30:39 (GMT)
See also tasks #7 and #8 in the SH bug tracker. ;)
2020-10-17 17:34:25 (GMT)
@Hans-Jörg I should say I'm not in a position to throw the first stone. The code of my LICS 2017 paper (https://ieeexplore.ieee.org/document/8005071) is unfinished and rots in a branch. If I could go back in time, I'd refused to submit until we've finished the job. Now it's virtually too late. Isabelle will likely never have nonuniform datatypes. The topic can't even be given to a PhD student, because the publication is already out.
2020-10-19 17:6:3 (GMT)
Here is a patch that fulfils only the first points raised by Jasmin, as base for discussion: http://isabelle.in.tum.de/repos/testboard/rev/1cca528a8b46
Jasmin Blanchette said:
- Testing takes user time. A good target would be to keep the same upper bound as today for Z3, but now for Z3 and veriT. Shorter slices and less Z3 should do it.
@Jasmin Blanchette : you mean when checking each method, so your idea is to divide the timeout by 2 in the function raw_preplay_step_for_method for SMT-based methods, right?
2020-10-19 17:44:27 (GMT)
Minor:
Why not imitate Metis_Method and have the verit option stored in a string list? As discussed last time, there's value in imitating existing idioms instead of doing things differently.
SMT_VERIT_... -> SMT_Verit... or SMT_VeriT_... or SMT_veriT_ (I'd go with the first one since the tool is called verit in Isabelle.
As for division by 2, I don't know; that's for you to find out. All I know is that it should take the same time as before overall in the worst case (failure of reconstruction), not twice or thrice as much time.
2020-10-19 18:0:32 (GMT)
is replaying methods threaded or not?
2020-10-21 17:34:26 (GMT)
@Jasmin Blanchette the algorithm in Sledgehammer used to redirect proofs does not take into account variable appearing in the steps, right? (this is relevant for translating skolems from veriT to something SH can deal with. If not, I have to make sure that steps involving them appear only after their introduction… )
2020-10-21 17:52:58 (GMT)
I think the idea is that we always made sure so far you'd have two steps: before and after skolemization, with an arrow in between. Then it worked.
2020-10-21 17:54:0 (GMT)
I suspect whatever veriT does cannot be so different from all the other provers, for which this worked. There should be a way to massage the veriT proof if necessary -- I'd guess.
2020-10-21 17:59:7 (GMT)
Okay so far I replaced:
step1:
by
step1:
but I guess, I need to do multi-step replacements to have this arrow, namely by replacing
step1:
step2:
by
step2: .
Thanks!
2020-10-21 18:0:31 (GMT)
Replacing = by arrows is something I tried with Z3. It ended with tears.
2020-10-21 18:0:51 (GMT)
I'm unfortunately completely swamped this week but I'd be happy to discuss next week. ;)
2020-10-21 18:5:48 (GMT)
Damn. That will get super ugly when the skolemization happen in subproofs, because step 2 can be implicit.