zulip log viewer

Release 1.6

Simon Cruanes

2019-03-22 20:32:7 (GMT)

I'd like to make a release soon-ish, withdune and iter and msat 0.8 in it. But I heard Petar has improved the heuristics for first-order so maybe we should wait a bit to integrate that?

Alexander Bentkamp

2019-03-22 20:38:18 (GMT)

Petar's heuristic changes and my bug fixes are all based on the version that contains the new HO mode. So if we integrate them, we should probably integrate everything.

Alexander Bentkamp

2019-03-22 20:40:8 (GMT)

I am not sure how much work that is. Depends on how much you changed since I pulled from dev (around November i guess).

Simon Cruanes

2019-03-22 20:54:50 (GMT)

hmm most changes are on master, but a lot of it is s/Sequence/Iter, and some changes in the SAT solver's interface. I haven't touched to the rest (or barely).

Simon Cruanes

2019-03-22 20:55:16 (GMT)

Still, if it's based on the HO branch then it can wait until your work is merged, which is non trivial.

Alexander Bentkamp

2019-03-22 20:56:30 (GMT)

We should ask Petar if he really based his stuff on my branch, but I would guess so.

Simon Cruanes

2019-03-22 20:57:37 (GMT)

The question if/when we merge is what to enable by default. Should the first-order calculus be preserved, or should the prover try at least lfhol unconditionnally?

Alexander Bentkamp

2019-03-22 21:0:56 (GMT)

lfhol?

Alexander Bentkamp

2019-03-22 21:1:19 (GMT)

I removed lambda-free from our branch :D

Alexander Bentkamp

2019-03-22 21:2:52 (GMT)

Do you mean our new calculus?

Alexander Bentkamp

2019-03-22 21:3:27 (GMT)

Let's call it lamsup, because it can handle lambdas now...

Simon Cruanes

2019-03-22 21:12:43 (GMT)

Alright. Anyway I suppose that if it's conditional on detecting that the input is HO (or on --ho being passed), it wouldn't affect first-order at all, so it could simply live in the master branch.

Alexander Bentkamp

2019-03-22 21:14:19 (GMT)

Yes, it's activated by passing an option only

Alexander Bentkamp

2019-03-22 21:14:56 (GMT)

The default mode should be more or less the same (modulo bug fixes and heuristic improvements by Petar)

Alexander Bentkamp

2019-03-22 21:16:15 (GMT)

I've implemented an option --mode that we could use to make the different modes easier to access

Simon Cruanes

2019-04-16 22:30:20 (GMT)

Is this on master? I'd like to release soon ^^

Alexander Bentkamp

2019-04-16 23:59:33 (GMT)

I don't think so. I think Petar is working on the MPI repo. @Petar Vukmirovic What branch are you working on currently?

Alexander Bentkamp

2019-04-17 0:1:27 (GMT)

I copied Petar's implementation a while ago from the MPI to GitHub. It's on lzip: https://github.com/c-cube/zipperposition/tree/lzip.

Alexander Bentkamp

2019-04-17 0:24:51 (GMT)

Most merge conflicts seem to come from Sequence vs Iter. I'll have a closer look.

Simon Cruanes

2019-04-17 4:14:23 (GMT)

you may want to also do sed -i 's/Sequence/Iter/g' on the source files on the lzip branch, it could help with the merge.

Alexander Bentkamp

2019-04-17 15:50:48 (GMT)

I think Petar is on holiday. I am not sure which branch we should merge into master. Maybe we should wait for him to come back.

Simon Cruanes

2019-04-17 18:45:17 (GMT)

Sure, it can wait.

Alexander Bentkamp

2019-04-19 18:12:12 (GMT)

I tried to replace Sequence/Iter and added iter in the .opam files. iter is installed on opam, too. But still I get Error: Unbound module Iter. Any ideas?

Simon Cruanes

2019-04-19 18:19:35 (GMT)

You also need to update the dune files (see the similar commit on master)

Alexander Bentkamp

2019-04-19 18:44:32 (GMT)

Ok, thanks. It still doesn't compile, but I think I'll merge first and repair afterwards.

Simon Cruanes

2019-04-19 18:47:11 (GMT)

have you opam install iter?
does it still complain about missing module Iter after updating the dune files?

Alexander Bentkamp

2019-04-19 19:3:32 (GMT)

No, it was complaining about another module. But it doesn't matter because after merging it now compiles.

Simon Cruanes

2019-04-19 19:6:32 (GMT)

It'd be optimal if you could now run the prover on as many tests as possible :)

Alexander Bentkamp

2019-04-19 19:7:26 (GMT)

https://github.com/c-cube/zipperposition/compare/supext_fix_merge

Alexander Bentkamp

2019-04-19 19:8:24 (GMT)

It compiles, but some tests fail. But some of them also failed before merging. Let's wait for Petar...

Simon Cruanes

2019-04-19 19:13:38 (GMT)

I meant a set of benchmarks, but local tests are important too, indeed.

Petar Vukmirovic

2019-04-22 19:24:20 (GMT)

Hi guys!

Petar Vukmirovic

2019-04-22 19:24:28 (GMT)

Can you get me up to speed on tests?

Simon Cruanes

2019-04-22 19:28:5 (GMT)

I'm no expert but it seems like it may be related to unification…?

Alexander Bentkamp

2019-04-22 19:28:30 (GMT)

Sorry, what I posted there won't help at all :D. Let's start from the beginning.

I merged master and supext_fix. I think already on supext_fix, there are a few tests that fail. Do you know why, Petar?

Alexander Bentkamp

2019-04-22 19:31:55 (GMT)

On supext_fix, it currently looks like this:

Testing all.
This run has ID `027DD200-AFB0-474C-9964-E36FFB0DE049`.
[OK]                units           0   subst.rename.
[OK]                units           1   subst.unify.
[OK]                units           2   multiset.max.
[OK]                units           3   multiset.compare.
[OK]                units           4   multiset.size.
[OK]                units           5   ordering.rpo.
[OK]                units           6   ordering.kbo.
[OK]                units           7   db shift.
[OK]                units           8   db unshift.
[OK]                units           9   is_appvar.
[OK]                units          10   whnf1.
[OK]                units          11   patterns.
[OK]                units          12   poly app.
[OK]                units          13   eta reduce.
[OK]                units          14   eta qreduce.
[OK]                units          15   eta expand.
[OK]                units          16   check unifiable.
[OK]                units          17   check unifiable.
[OK]                units          18   check unifiable.
[OK]                units          19   check unifiable.
[OK]                units          20   check unifiable.
[OK]                units          21   check unifiable.
[OK]                units          22   check unifiable.
[OK]                units          23   check unifiable.
[OK]                units          24   check unifiable.
[OK]                units          25   check unifiable.
[OK]                units          26   check unifiable.
[OK]                units          27   check unifiable.
[ERROR]             units          28   check matchable.
[OK]                units          29   regression matching.
[OK]                units          30   JP unification.
 ...                units          31   JP-unif check count.^C

And then it doesn't terminate because I didn't put an upper bound on the steps that JP unif is allowed to make...

Alexander Bentkamp

2019-04-22 19:33:7 (GMT)

After my merge, I get even more errors:

[ERROR]             units           0   subst.rename.
[ERROR]             units           1   subst.unify.
[OK]                units           2   multiset.max.
[OK]                units           3   multiset.compare.
[OK]                units           4   multiset.size.
[OK]                units           5   ordering.rpo.
[OK]                units           6   ordering.kbo.
[OK]                units           7   db shift.
[OK]                units           8   db unshift.
[OK]                units           9   is_appvar.
[OK]                units          10   whnf1.
[OK]                units          11   patterns.
[OK]                units          12   poly app.
[OK]                units          13   eta reduce.
[OK]                units          14   eta qreduce.
[OK]                units          15   eta expand.
[ERROR]             units          16   check unifiable.
[ERROR]             units          17   check unifiable.
[ERROR]             units          18   check unifiable.
[ERROR]             units          19   check unifiable.
[ERROR]             units          20   check unifiable.
[ERROR]             units          21   check unifiable.
[ERROR]             units          22   check unifiable.
[ERROR]             units          23   check unifiable.
[OK]                units          24   check unifiable.
[OK]                units          25   check unifiable.
[OK]                units          26   check unifiable.
[ERROR]             units          27   check unifiable.
[ERROR]             units          28   check matchable.
[OK]                units          29   regression matching.
[ERROR]             units          30   JP unification.
 ...                units          31   JP-unif check count.^C

Alexander Bentkamp

2019-04-22 19:35:14 (GMT)

Should I look into the errors that I introduced while merging and you (Petar) look into the errors that were already present on the supext_fix branch?

Simon Cruanes

2019-04-22 19:38:52 (GMT)

I forgot how, but there's a way to see the actual error message for the test, of course. That could be useful.

Alexander Bentkamp

2019-04-22 19:40:1 (GMT)

I think if you wait long enough there is a timeout and you'll get the message

Alexander Bentkamp

2019-04-22 19:40:22 (GMT)

But I already found a big problem with my merge. I'll try to fix that first :D

Alexander Bentkamp

2019-04-22 19:52:38 (GMT)

It's weird that I get a lot of errors displayed in my editor that do not occur if I compile on the terminal...

Simon Cruanes

2019-04-22 19:56:13 (GMT)

it depends on what profile you use, I wonder (dune build is much more strict than dune build --profile-release)

Alexander Bentkamp

2019-04-22 19:56:47 (GMT)

I don't know what I did, but I just got rid of them :D

Simon Cruanes

2019-04-22 19:57:31 (GMT)

Possibly some things were not compiled yet and merlin was complaining?

Alexander Bentkamp

2019-04-22 19:57:54 (GMT)

Maybe it's just VSCode that's behaving weirdly

Alexander Bentkamp

2019-04-22 20:18:46 (GMT)

Ok, found the error that was introduced by the merge. Now the tests look like on Petar's branch. It would be cool if you could look into this, @Petar Vukmirovic , as you probably know better where they are coming from.

Petar Vukmirovic

2019-04-22 20:19:44 (GMT)

Sorry guys

Petar Vukmirovic

2019-04-22 20:19:56 (GMT)

Was not getting notifications on my phone for Zulip

Petar Vukmirovic

2019-04-22 20:20:18 (GMT)

Thanks for merging!

Petar Vukmirovic

2019-04-22 20:20:23 (GMT)

I will do the check tomorrow

Petar Vukmirovic

2019-04-22 20:20:32 (GMT)

Let you know as soon as I fix 8t

Simon Cruanes

2019-04-22 20:22:2 (GMT)

I think it'd be nice if @Alexander Bentkamp could run the branch's Zipperposition on TPTP, as a form of integration test, to ensure there's no new unsoundness?

Simon Cruanes

2019-04-22 20:29:20 (GMT)

Note: I'm also fixing a tiny issue that pops up in older OCaml versions on https://travis-ci.org/c-cube/zipperposition/

Petar Vukmirovic

2019-04-23 8:9:22 (GMT)

I can also do that :) I have a feeling Alex has a lot in his hands

Petar Vukmirovic

2019-04-23 16:22:12 (GMT)

I have successfuly merged, and made all tests pass

Simon Cruanes

2019-04-23 16:23:10 (GMT)

Now we need to run this on TPTP or something similar, then I'll merge into master and make a release :)

Petar Vukmirovic

2019-04-23 16:23:15 (GMT)

Few notes: The matching test failed because pattern/LFHO higher-order unifciation algorithm implementened previously
did some smart stuff with lifting constants of variables, but dealt with DeBruijn indices incorrectly

Petar Vukmirovic

2019-04-23 16:23:45 (GMT)

3 tests in JP unification hanged because it just became too slow when implemented with full iteration :(

Petar Vukmirovic

2019-04-23 16:24:2 (GMT)

Now we need to run this on TPTP or something similar, then I'll merge into master and make a release :)

The task is running :)

Simon Cruanes

2019-04-23 16:24:14 (GMT)

Ah, you mean going from f a b to (λx y. f x y) a b, hmm?

Petar Vukmirovic

2019-04-23 16:27:1 (GMT)

Yup

Petar Vukmirovic

2019-04-23 16:27:23 (GMT)

The problem is that it would do the same with terms that have no vars
but do have loosely bound variables

Petar Vukmirovic

2019-04-23 16:27:42 (GMT)

and then because no shifting occurs when you apply substitution --
all kinds of funky things happen.

Petar Vukmirovic

2019-04-23 16:28:17 (GMT)

Right now I disabled the lifting if the term is not ground and _closed_ + removed the test case since now it is out of the fragment.

Simon Cruanes

2019-04-23 16:28:31 (GMT)

Well that's worrying indeed. It was an experiment, I'm glad we now have a more robust calculus thanks to y'all's hard work!

Alexander Bentkamp

2019-04-23 16:28:33 (GMT)

3 tests in JP unification hanged because it just became too slow when implemented with full iteration :(

Didn't I implement full iteration? I remember those tests passed before. What has changed exactly?

Petar Vukmirovic

2019-04-23 16:29:16 (GMT)

You remember how we found out that you also need to iterate on non-functional arguments.

Petar Vukmirovic

2019-04-23 16:30:10 (GMT)

I implemented literally only that change (few lines) and now it hangs (in the first case because the unifier is very particular
and in the others because now there are more than 1 or 2 unifiers.

Petar Vukmirovic

2019-04-23 16:30:22 (GMT)

most of the test cases still pass though

Petar Vukmirovic

2019-04-23 16:31:1 (GMT)

Ah, you mean going from f a b to (λx y. f x y) a b, hmm?

It is still smart and useful idea that we in the group all independently came up with :)

Alexander Bentkamp

2019-04-23 16:32:18 (GMT)

Ah, right! I remember now. Hopefully we can improve JP a bit when I am back.

Alexander Bentkamp

2019-04-23 16:33:5 (GMT)

What are your thoughts about --mode? What modes are there now and what modes should there be?

Petar Vukmirovic

2019-04-23 16:33:18 (GMT)

Actually, because of architectural reasons I think improving pure JP is hopeless.
If we want something pragmatic we should work on the architecture I have in PragHOUnif.
WEe can discuss this on one ocassion.

Alexander Bentkamp

2019-04-23 16:34:16 (GMT)

Even if we restrict ourselves to second order unification?

Petar Vukmirovic

2019-04-23 16:35:27 (GMT)

What are your thoughts about --mode? What modes are there now and what modes should there be?

I wanted to talk exactly about that. In my opinion we need 3:
As fast as possible first-order
As complete as possible higher-order
Best performing higher-order (some compromise between performance on FO and completeness)

Petar Vukmirovic

2019-04-23 16:40:11 (GMT)

Even if we restrict ourselves to second order unification?

I think so. The main problem is when beta-reduce and apply substitution too much. That is common to any order of higher-order logic.

Petar Vukmirovic

2019-04-23 16:40:53 (GMT)

While unifying you are mostly interested in what is the head of the term, and for that you only need to dereference the head of the term and get terms in whnf. From my experience, this brings huge speed improvement.

Petar Vukmirovic

2019-04-23 16:41:3 (GMT)

But of course, we can discuss this further.

Alexander Bentkamp

2019-04-23 16:43:58 (GMT)

About the modes: So far, I have always switched off features like avatar, induction & arithmetic for the evaluation because I thought that is not what I want to evaluate. That's why I gave the current modes the suffix "-basic". But maybe we don't need extra modes for this. And maybe we actually want to enable these features in the eval. What do you think?

Petar Vukmirovic

2019-04-23 16:48:28 (GMT)

I don't know on top of my mind. I think we need to evaluate all that to have the clear picture.

We need to do some fairly extensive evaluation of combination of parameters and then see
what happens.

Alexander Bentkamp

2019-04-23 16:49:32 (GMT)

Ok, but we don't really need a mode for every configuration that we want to evaluate, right?

Alexander Bentkamp

2019-04-23 16:50:43 (GMT)

So I think the three that you proposed sound good.

Alexander Bentkamp

2019-04-23 16:53:46 (GMT)

Do we want a HO mode that uses the old architecture without streams of unifiers? And what should be the default mode? Should the default mode also get a name? If yes, what effect is --mode default supposed to have?

Simon Cruanes

2019-04-23 17:25:43 (GMT)

I'd like the default mode to be automatic, if possible (iirc there are some basic feature detections for arithmetic or induction, to be confirmed). It should try to use whatever is needed for the problem at hand.
If you want to evaluate a particular (set of) features then --mode ho or similar is the way to go?

Simon Cruanes

2019-04-23 17:27:14 (GMT)

I don't know how Avatar (or rather its simplified version) fits into your calculi but I think it generally helps in the benchmarks, it should be on in default modes, I think. Arithmetic is less robust, induction as well (but they should at least keep working if required).

Petar Vukmirovic

2019-04-23 17:55:56 (GMT)

Ok, but we don't really need a mode for every configuration that we want to evaluate, right?

No but we have to know which features should go in there (in each of the non-complete modes)

Simon Cruanes

2019-04-23 17:58:22 (GMT)

Ok, but we don't really need a mode for every configuration that we want to evaluate, right?

Actually that sounds like a pretty good idea to me, for more "specialized" modes. Could also be useful as ± snapshots of parameters (like --mode casc-tff-2019).

Petar Vukmirovic

2019-04-24 8:32:3 (GMT)

OK. Good news. I tested Zip on 1000 problems, and it performs quite nice, with no discovered bugs so far.

Full TPTP is still running (and will be until tomorrow I think).

Alexander Bentkamp

2019-05-01 15:2:42 (GMT)

Hi @Simon Cruanes , why do you want to make the release now? Petar and I think that it would be better to wait 1-2 months until we (probably mostly Petar) have tested properly, tuned the heuristics and cleaned up.

Simon Cruanes

2019-05-01 15:25:15 (GMT)

Sure, I thought heuristics were up to date, but it's not that urgent. I just like to have significant changes released but maybe I thought things were more stable than they are!