2020-05-20 11:45:16 (GMT)
The rule and_simplify is used to prove that
((a ∧ b ∨ c ∧ d) ∧ ¬ (a ∧ b ∨ c ∧ d)) = False
However, I don't see how and_simplify is supposed to do that, or is distribution of the negation part of what the rule does? Additionally is there a list of up-to-date rule? I only found your email (@Hans-Jörg ) from last year.
What and_simplify does:
and_simplify; captures:
and with only true ↔ true
and ↔ and with all true removed
and ↔ and with repeated literals removed
and a1 ‥ false ‥ an ↔ false
and a1 ‥ ai ‥ (not ai) ‥ an ↔ false
also applied when (not ai) ‥ ai
The problem and the proof:
; --proof-with-sharing --index-fresh-sorts --proof-define-skolems --proof-prune --proof-merge --disable-print-success --disable-banner --max-time=20
(set-option :produce-proofs true)
(set-logic AUFLIA)
(declare-fun a$ () Bool)
(declare-fun b$ () Bool)
(declare-fun c$ () Bool)
(declare-fun d$ () Bool)
(assert (! (not (=> (or (and a$ b$) (and c$ d$)) (or (and a$ b$) (and c$ d$)))) :named a0))
(check-sat)
(get-proof)
SMT: Invoking SMT solver "verit" ...
SMT: Solver:
SMT: Result:
unsat
(assume a0 (! (not (! (=> (! (or (and a$ b$) (and c$ d$)) :named @p_1) @p_1) :named @p_4)) :named @p_2))
(step t2 (cl (! (= @p_2 (! (and @p_1 (not @p_1)) :named @p_5)) :named @p_3)) :rule bool_simplify)
(step t3 (cl (not @p_3) @p_4 @p_5) :rule equiv_pos2)
(step t4 (cl @p_5) :rule th_resolution :premises (a0 t2 t3))
(step t5 (cl (! (= @p_5 false) :named @p_6)) :rule and_simplify)
(step t6 (cl (not @p_6) (not @p_5) false) :rule equiv_pos2)
(step t7 (cl false) :rule th_resolution :premises (t4 t5 t6))
(step t8 (cl (not false)) :rule false)
(step t9 (cl) :rule resolution :premises (t7 t8))
2020-05-20 13:5:13 (GMT)
isn't that a case of the last part in my "what it does" case distinction? (if you read (a ∧ b ∨ c ∧ d) as ai)
2020-05-25 14:25:29 (GMT)
regarding the la_generic rule: the missing part is probably the strengthening that is happening. If you have an linear sum with integer variables and a strict constraint you can strengthen to a weak constraint. (e.g. $x > 0$ becomes $x >= 1$).
We can either handle this on the reconstruction side with a first pass that strengthens the literals, or generate a proof. Generating a proof would be a bit of work (generating tautologies weak => strong, then doing resolution with those), but should be doable.
I do think, however, that it would be more sensible to first implement other improvements that are harder to fix on the reconstruction side (such as equality reordering)
2020-05-25 14:32:12 (GMT)
Wouldn't it make more sense to do this as a postprocessing in veriT before getting the proof out?
2020-05-25 14:33:10 (GMT)
(which is the second thing you suggest, I think)
2020-05-25 14:34:56 (GMT)
@Hans-Jörg If you replace the inequalities by hand in the smt2 (the one on skype), is your checker able to check the proof?
2020-05-25 14:37:43 (GMT)
In that case, I tend to see that as part of what la_generic does and, therefore, that the reconstruction should take care of that…
2020-05-25 14:42:31 (GMT)
Assume we have the rule eq1 \/ eq2 \/ eq3 with the coefficients c1, c2, and c3. The reconstruction must do c1 * eq1 + c2 * eq2 + c3 * eq3. Therefore, you need some knowledge about arithmetic to combine >=, >, +, -, and co. Hence, I don't see it as an issue to also assume that the reconstruction knows that combining (A >= B) + (C > D) yields A + B >= B + D +1 on integers.
2020-05-25 14:44:20 (GMT)
In particular, the trick of +1 does not work on real numbers, so the reconstruction is more regular in a certain sense…
2020-06-02 14:52:45 (GMT)
comming back to this: yeah manually moving from a strict to a non-strict inequality makes the steps check able. Sadly I can't automatize this since I don't know the types of the variables from the proof alone.
2020-06-05 11:41:15 (GMT)
@Haniel Barbosa is CVC4 now able to produce proofs with quantifiers? Output veriT format?
2020-06-05 12:31:31 (GMT)
We are in the middle of a from-scratch rewrite of the CVC4 proof module. We can print veriT proofs only for a fragment of QF_UF for now. Will soon extend that for arithmetic and strings. After I get support for all that I'll look into quantifiers
2020-06-05 12:31:49 (GMT)
I was using the old veriT format in the printer, will change it to the new version you did
2020-06-05 12:40:52 (GMT)
When we get to the point that things are more stable I'll want to discuss with you guys how to better present the proofs
2020-06-15 9:50:53 (GMT)
The rule "bool_simplify" can transform (((P ⟶ Q) ⟶ P) ⟶ P) into ((P ⟶ Q) ∨ P): is a rule missing in bool_simplify or am I missing something? (the transformation makes sense, it is just a documentation question)
bool_simplify; captures:
!(P -> Q) ↔ P & !Q
!(P | Q) ↔ !P & !Q
!(P & Q) ↔ !P | !Q
P -> (Q -> R) ↔ (P & Q) -> R
(P -> Q) -> Q ↔ P | Q
P & (P -> Q) ↔ P & Q
(P -> Q) & P ↔ P & Q
The problem:
SMT: Problem:
; --proof-with-sharing --index-fresh-sorts --proof-define-skolems --proof-prune --proof-merge --disable-print-success --disable-banner --max-time=20
(set-option :produce-proofs true)
(set-logic AUFLIA)
(declare-fun p$ () Bool)
(declare-fun q$ () Bool)
(assert (! (not (=> (=> (=> p$ q$) p$) p$)) :named a0))
(check-sat)
(get-proof)
SMT: Invoking SMT solver "verit" ...
SMT: Solver:
SMT: Result:
unsat
(assume a0 (! (not (! (=> (=> (! (=> p$ q$) :named @p_2) p$) p$) :named @p_1)) :named @p_3))
(step t2 (cl (= @p_1 (! (or @p_2 p$) :named @p_4))) :rule bool_simplify)
(step t3 (cl (! (= @p_3 (! (not @p_4) :named @p_6)) :named @p_5)) :rule cong :premises (t2))
(step t4 (cl (not @p_5) @p_1 @p_6) :rule equiv_pos2)
(step t5 (cl @p_6) :rule th_resolution :premises (a0 t3 t4))
(step t6 (cl (not @p_2)) :rule not_or :premises (t5))
(step t7 (cl p$) :rule not_implies1 :premises (t6))
(step t8 (cl (not p$)) :rule not_or :premises (t5))
(step t9 (cl) :rule resolution :premises (t8 t7))
2020-06-15 14:28:13 (GMT)
where are these rules documented? Is there a document or paper listing them?
2020-06-15 14:36:18 (GMT)
First, this is not the standard veriT, but Hans-Jörg's own branch. Second, as far as I know, there is only an email listing them...
2020-06-15 14:56:53 (GMT)
Second question: is lia_generic also supposed to give coefficients or what is its intended meaning compared to la_generic? Here is an example where no args is returned:
SMT: Problem:
; --proof-with-sharing --index-fresh-sorts --proof-define-skolems --proof-prune --proof-merge --disable-print-success --disable-banner --max-time=20
(set-option :produce-proofs true)
(set-logic AUFLIA)
(declare-sort Nat$ 0)
(declare-fun x$ () Nat$)
(declare-fun of_nat$ (Nat$) Int)
(assert (! (not (not (= (* 2 (of_nat$ x$)) 1))) :named a0))
(assert (! (<= 0 (of_nat$ x$)) :named a1))
(check-sat)
(get-proof)
SMT: Invoking SMT solver "verit" ...
SMT: Solver:
SMT: Result:
unsat
(assume a0 (! (= 1 (* 2 (of_nat$ x$))) :named @p_1))
(step t2 (cl (not @p_1)) :rule lia_generic)
(step t3 (cl) :rule resolution :premises (t2 a0))
SMT: Time (ms):
2020-06-15 15:56:57 (GMT)
Third question: Is 4 * dec_10 4 - 10 < 10 ∨ 4 ≠ dec_10 4 an intend goal for la_generic?
SMT: Problem:
; --proof-with-sharing --index-fresh-sorts --proof-define-skolems --proof-prune --proof-merge --disable-print-success --disable-banner --max-time=20
(set-option :produce-proofs true)
(set-logic AUFLIA)
(declare-fun dec_10$ (Int) Int)
(assert (! (forall ((?v0 Int)) (= (dec_10$ ?v0) (ite (< ?v0 10) ?v0 (dec_10$ (- ?v0 10))))) :named a0))
(assert (! (not (= (dec_10$ (* 4 (dec_10$ 4))) 6)) :named a1))
(check-sat)
(get-proof)
SMT: Invoking SMT solver "verit" ...
SMT: Solver:
SMT: Result:
unsat
(assume a0 (! (forall ((?v0 Int)) (! (= (! (dec_10$ ?v0) :named @p_2) (! (ite (! (< ?v0 10) :named @p_5) ?v0 (! (dec_10$ (! (- ?v0 10) :named @p_7)) :named @p_9)) :named @p_11)) :named @p_13)) :named @p_1))
(assume a1 (not (! (= (! (dec_10$ (! (* 4 (! (dec_10$ 4) :named @p_37)) :named @p_27)) :named @p_26) 6) :named @p_73)))
(anchor :step t3 :args ((:= ?v0 veriT_vr0)))
(step t3.t1 (cl (! (= ?v0 veriT_vr0) :named @p_4)) :rule refl)
(step t3.t2 (cl (= @p_2 (! (dec_10$ veriT_vr0) :named @p_3))) :rule cong :premises (t3.t1))
(step t3.t3 (cl @p_4) :rule refl)
(step t3.t4 (cl (= @p_5 (! (< veriT_vr0 10) :named @p_6))) :rule cong :premises (t3.t3))
(step t3.t5 (cl @p_4) :rule refl)
(step t3.t6 (cl @p_4) :rule refl)
(step t3.t7 (cl (= @p_7 (! (- veriT_vr0 10) :named @p_8))) :rule cong :premises (t3.t6))
(step t3.t8 (cl (= @p_9 (! (dec_10$ @p_8) :named @p_10))) :rule cong :premises (t3.t7))
(step t3.t9 (cl (= @p_11 (! (ite @p_6 veriT_vr0 @p_10) :named @p_12))) :rule cong :premises (t3.t4 t3.t5 t3.t8))
(step t3.t10 (cl (= @p_13 (! (= @p_3 @p_12) :named @p_14))) :rule cong :premises (t3.t2 t3.t9))
(step t3 (cl (! (= @p_1 (! (forall ((veriT_vr0 Int)) @p_14) :named @p_16)) :named @p_15)) :rule bind)
(step t4 (cl (not @p_15) (not @p_1) @p_16) :rule equiv_pos2)
(step t5 (cl @p_16) :rule th_resolution :premises (a0 t3 t4))
(anchor :step t6 :args ((:= veriT_vr0 veriT_vr1)))
(step t6.t1 (cl (! (= veriT_vr0 veriT_vr1) :named @p_18)) :rule refl)
(step t6.t2 (cl (= @p_3 (! (dec_10$ veriT_vr1) :named @p_17))) :rule cong :premises (t6.t1))
(step t6.t3 (cl @p_18) :rule refl)
(step t6.t4 (cl (= @p_6 (! (< veriT_vr1 10) :named @p_19))) :rule cong :premises (t6.t3))
(step t6.t5 (cl @p_18) :rule refl)
(step t6.t6 (cl @p_18) :rule refl)
(step t6.t7 (cl (= @p_8 (! (- veriT_vr1 10) :named @p_20))) :rule cong :premises (t6.t6))
(step t6.t8 (cl (= @p_10 (! (dec_10$ @p_20) :named @p_21))) :rule cong :premises (t6.t7))
(step t6.t9 (cl (= @p_12 (! (ite @p_19 veriT_vr1 @p_21) :named @p_22))) :rule cong :premises (t6.t4 t6.t5 t6.t8))
(step t6.t10 (cl (= @p_14 (! (= @p_17 @p_22) :named @p_23))) :rule cong :premises (t6.t2 t6.t9))
(step t6 (cl (! (= @p_16 (! (forall ((veriT_vr1 Int)) @p_23) :named @p_25)) :named @p_24)) :rule bind)
(step t7 (cl (not @p_24) (not @p_16) @p_25) :rule equiv_pos2)
(step t8 (cl @p_25) :rule th_resolution :premises (t5 t6 t7))
(step t9 (cl (or (! (not @p_25) :named @p_35) (! (= @p_26 (! (ite (! (< @p_27 10) :named @p_30) @p_27 (! (dec_10$ (! (- @p_27 10) :named @p_53)) :named @p_31)) :named @p_29)) :named @p_28))) :rule forall_inst :args ((:= veriT_vr1 @p_27)))
(anchor :step t10)
(assume t10.h1 @p_28)
(step t10.t2 (cl (! (= @p_28 (! (and (! (= @p_26 @p_29) :named @p_50) (! (ite @p_30 (= @p_27 @p_29) (! (= @p_31 @p_29) :named @p_52)) :named @p_51)) :named @p_32)) :named @p_33)) :rule ite_intro)
(step t10.t3 (cl (not @p_33) (! (not @p_28) :named @p_34) @p_32) :rule equiv_pos2)
(step t10.t4 (cl @p_32) :rule th_resolution :premises (t10.h1 t10.t2 t10.t3))
(step t10 (cl @p_34 @p_32) :rule subproof :discharge (h1))
(step t11 (cl @p_35 @p_28) :rule or :premises (t9))
(step t12 (cl (! (or @p_35 @p_32) :named @p_36) @p_25) :rule or_neg)
(step t13 (cl @p_36 (! (not @p_32) :named @p_49)) :rule or_neg)
(step t14 (cl @p_36) :rule th_resolution :premises (t11 t10 t12 t13))
(step t15 (cl (or @p_35 (! (= @p_37 (! (ite (! (< 4 10) :named @p_39) 4 (! (dec_10$ (! (- 4 10) :named @p_40)) :named @p_41)) :named @p_42)) :named @p_38))) :rule forall_inst :args ((:= veriT_vr1 4)))
(anchor :step t16)
(assume t16.h1 @p_38)
(step t16.t2 (cl (= @p_39 true)) :rule comp_simplify)
(step t16.t3 (cl (= @p_40 (- 6))) :rule minus_simplify)
(step t16.t4 (cl (= @p_41 (! (dec_10$ (- 6)) :named @p_43))) :rule cong :premises (t16.t3))
(step t16.t5 (cl (= @p_42 (! (ite true 4 @p_43) :named @p_44))) :rule cong :premises (t16.t2 t16.t4))
(step t16.t6 (cl (= 4 @p_44)) :rule ite_simplify)
(step t16.t7 (cl (= 4 @p_42)) :rule trans :premises (t16.t5 t16.t6))
(step t16.t8 (cl (! (= @p_38 (! (= 4 @p_37) :named @p_45)) :named @p_46)) :rule cong :premises (t16.t7))
(step t16.t9 (cl (not @p_46) (! (not @p_38) :named @p_47) @p_45) :rule equiv_pos2)
(step t16.t10 (cl @p_45) :rule th_resolution :premises (t16.h1 t16.t8 t16.t9))
(step t16 (cl @p_47 @p_45) :rule subproof :discharge (h1))
(step t17 (cl @p_35 @p_38) :rule or :premises (t15))
(step t18 (cl (! (or @p_35 @p_45) :named @p_48) @p_25) :rule or_neg)
(step t19 (cl @p_48 (! (not @p_45) :named @p_66)) :rule or_neg)
(step t20 (cl @p_48) :rule th_resolution :premises (t17 t16 t18 t19))
(step t21 (cl @p_49 @p_50) :rule and_pos)
(step t22 (cl (not @p_51) @p_30 @p_52) :rule ite_pos1)
(step t23 (cl @p_49 @p_51) :rule and_pos)
(step t24 (cl @p_35 @p_32) :rule or :premises (t14))
(step t25 (cl @p_32) :rule resolution :premises (t24 t8))
(step t26 (cl @p_50) :rule resolution :premises (t21 t25))
(step t27 (cl @p_51) :rule resolution :premises (t23 t25))
(step t28 (cl @p_35 @p_45) :rule or :premises (t20))
(step t29 (cl @p_45) :rule resolution :premises (t28 t8))
(step t30 (cl (or @p_35 (! (= @p_31 (! (ite (! (< @p_53 10) :named @p_56) @p_53 (! (dec_10$ (- @p_53 10)) :named @p_57)) :named @p_55)) :named @p_54))) :rule forall_inst :args ((:= veriT_vr1 @p_53)))
(anchor :step t31)
(assume t31.h1 @p_54)
(step t31.t2 (cl (! (= @p_54 (! (and (! (= @p_31 @p_55) :named @p_63) (! (ite @p_56 (! (= @p_53 @p_55) :named @p_65) (= @p_57 @p_55)) :named @p_64)) :named @p_58)) :named @p_59)) :rule ite_intro)
(step t31.t3 (cl (not @p_59) (! (not @p_54) :named @p_60) @p_58) :rule equiv_pos2)
(step t31.t4 (cl @p_58) :rule th_resolution :premises (t31.h1 t31.t2 t31.t3))
(step t31 (cl @p_60 @p_58) :rule subproof :discharge (h1))
(step t32 (cl @p_35 @p_54) :rule or :premises (t30))
(step t33 (cl (! (or @p_35 @p_58) :named @p_61) @p_25) :rule or_neg)
(step t34 (cl @p_61 (! (not @p_58) :named @p_62)) :rule or_neg)
(step t35 (cl @p_61) :rule th_resolution :premises (t32 t31 t33 t34))
(step t36 (cl @p_62 @p_63) :rule and_pos)
(step t37 (cl (not @p_64) (not @p_56) @p_65) :rule ite_pos2)
(step t38 (cl @p_62 @p_64) :rule and_pos)
(step t39 (cl @p_35 @p_58) :rule or :premises (t35))
(step t40 (cl @p_58) :rule resolution :premises (t39 t8))
(step t41 (cl @p_63) :rule resolution :premises (t36 t40))
(step t42 (cl @p_64) :rule resolution :premises (t38 t40))
(step t43 (cl @p_56 @p_66) :rule la_generic :args (1 4))
(step t44 (cl @p_56) :rule resolution :premises (t43 t29))
(step t45 (cl @p_65) :rule resolution :premises (t37 t44 t42))
(step t46 (cl (or (! (= 6 @p_53) :named @p_67) (! (not (! (<= 6 @p_53) :named @p_71)) :named @p_68) (! (not (! (<= @p_53 6) :named @p_70)) :named @p_69))) :rule la_disequality)
(step t47 (cl @p_67 @p_68 @p_69) :rule or :premises (t46))
(step t48 (cl @p_70 @p_66) :rule la_generic :args (1 4))
(step t49 (cl @p_70) :rule resolution :premises (t48 t29))
(step t50 (cl @p_71 @p_66) :rule la_generic :args (1 4))
(step t51 (cl @p_71) :rule resolution :premises (t50 t29))
(step t52 (cl @p_67) :rule resolution :premises (t47 t51 t49))
(step t53 (cl (! (not @p_63) :named @p_76) (! (not @p_65) :named @p_77) (! (not @p_67) :named @p_78) (! (= 6 @p_31) :named @p_72)) :rule eq_transitive)
(step t5
2020-06-15 16:13:15 (GMT)
@Simon Cruanes I started a document to collect documentation, but it didn't make a lot of progress so far. Should put more time into it.
2020-06-15 17:52:18 (GMT)
Mathias Fleury said:
Third question: Is
4 * dec_10 4 - 10 < 10 ∨ 4 ≠ dec_10 4an intend goal for la_generic?
and that is causing trouble due to the order of the equalities…
2020-06-16 7:2:12 (GMT)
To give an example of the ordering problem:
SMT: Problem:
; --proof-with-sharing --index-fresh-sorts --proof-define-skolems --proof-prune --proof-merge --disable-print-success --disable-banner --max-time=20
(set-option :produce-proofs true)
(set-logic AUFLIA)
(declare-fun dec_10$ (Int) Int)
(assert (! (forall ((?v0 Int)) (= (dec_10$ ?v0) (ite (< ?v0 10) ?v0 (dec_10$ (- ?v0 10))))) :named a0))
(assert (! (not (= (dec_10$ (* 4 (dec_10$ 4))) 6)) :named a1))
(check-sat)
(get-proof)
The proof contains four la_generic steps:
(step t43 (cl (< (- (* 4 (dec_10$ 4)) 10) 10) (not (= 4 (dec_10$ 4)))) :rule la_generic :args (1 4))
(step t48 (cl (<= (- (* 4 (dec_10$ 4)) 10) 6) (not (= 4 (dec_10$ 4)))) :rule la_generic :args (1 4))
(step t50 (cl (<= 6 (- (* 4 (dec_10$ 4)) 10)) (not (= 4 (dec_10$ 4)))) :rule la_generic :args (1 4))
(step t58 (cl (not (< (* 4 (dec_10$ 4)) 10)) (not (= 4 (dec_10$ 4)))) :rule la_generic :args (1 4))
Two of them in more standard notation:
- t43: 4 * dec_10 4 - 10 < 10 ∨ 4 ≠ dec_10 4 with args 1 4
- t50: 6 ≤ 4 * dec_10 4 - 10 ∨ 4 ≠ dec_10 4with args 1 4
In t50, the coefficient should be -4, not 4. This is annoying to reconstruct (at least, I don't have a good idea yet to do so).
2020-06-17 8:58:17 (GMT)
Regarding lia_generic: I don't know when that is issued, I only worked onla_generic so far and hoped lia_generic would not pop up before that is polished. I will check later if I can see what's happening there.
2020-06-17 19:1:22 (GMT)
(((P ⟶ Q) ⟶ P) ⟶ P) to ((P ⟶ Q) ∨ P) is an instance of (P -> Q) -> Q ↔ P | Q (where P is "P ⟶ Q" and Q is P)
2020-06-18 8:27:55 (GMT)
I had forgotten to add that rule to my bool_simplify rules…
2020-06-24 17:58:12 (GMT)
It took me two days of work for a 5 line change -.-, but I think the coefficients for equalities should now be correct.
2020-06-25 18:13:21 (GMT)
@Mathias Fleury: just a heads up: I rebased the no-implicit-equality-orderbranch on master. I still have some surprising performance regressions that I need to look at.
2020-06-30 15:52:52 (GMT)
for what it's worth: I'm still working on those performance regressions. They are quite subtle. They seem to be related on how the preprocessing techniques interact with each other. Esp. symmetry breaking (which isn't even on in proof mode).
2020-07-01 19:55:25 (GMT)
Haniel Barbosa said:
When we get to the point that things are more stable I'll want to discuss with you guys how to better present the proofs
I'm glad to say we're in this point now hehe
2020-07-01 19:56:51 (GMT)
although now that I come to think of it, to have any kind of testing related with Isabelle we need to have quantifier reasoning in the proof as well
2020-07-01 19:57:28 (GMT)
I cannot just use Isabelle as a checker for a quantifier-free proof, for example
2020-07-01 20:19:24 (GMT)
Well, you can, it's just that it's not very interesting from the point of view of Isabelle users.
2020-07-01 20:20:10 (GMT)
It'd be helpful for the SMT solver developer though while in WIP mode until supporting what is interesting for the Isabelle users
2020-07-02 7:28:3 (GMT)
The trouble is this: We have a wishlist and bug list for Sledgehammer that would easily keep 3 PhD students busy for 4 years, just trying to cater to our users. Right now, CVC4 proof reconstruction with quantifiers is one of the 20+ items on that list (for the future, obviously). Implementing proof reconstruction without quantifiers for WIP would take even more time and divert us from our objectives -- and perhaps by the time quantifiers are added, the proof output would have changed again and we'd have to redo half of it. Notice that all of this is engineering with little hope of leading to a publication.
Above I wrote "we", but this is basically me, who has had in the past 4-5 years about 1% of his time to work on that, and since recently Martin Desharnais, who's just starting doing it as a side project, his main PhD topic behing on the user side of Isabelle. So we're seriously understaffed.
To solve your particular problem, in your shoes I would simply use other SMT solvers to validate CVC4's proof steps for those theories they have in common. That's less work than full proof reconstruction and would be enough in practice for debugging. Otherwise, well, I can help you get started with Isabelle programming, and you can always do it yourself. ;)
2020-07-02 7:42:10 (GMT)
There is currently no Isabelle checker that takes an SMT file and a proof as argument, you can only check goals generated by Isabelle. The one used for Z3 was lost.
2020-07-02 7:42:18 (GMT)
I should try to convince myself or someone else to do that (@Martin Desharnais we should talk about that, you know a lot more about the internals that I do…)
2020-07-02 8:23:11 (GMT)
BTW Mathias: @Martin Desharnais can give you access to our bug list (in rotten French). Since there's no clear division between SMT and Sledgehammer, it's probably the best place to coordinate all of this work.
2020-07-02 13:32:45 (GMT)
Sorry, I was not clear. First what I was talking about is for CVC4 producing proofs in the veriT proof format. For debugging the proof production infrastructure itself we have an internal proof checker which already gets us really far. What would help in the checker I mentioned, were it possible, would be for debugging the proof production in the veriT proof format
2020-07-02 13:33:11 (GMT)
I have done what you suggested in learning the language and doing a checker myself for Lean, but to do it for Isabelle as well is too much for me :)
2020-07-02 13:35:19 (GMT)
Mathias Fleury said:
There is currently no Isabelle checker that takes an SMT file and a proof as argument, you can only check goals generated by Isabelle. The one used for Z3 was lost.
Yes, that is the thing that would have done what I was thinking of. Since this is not possible, I'll come back when you can directly integrate CVC4 into the proof reconstruction machinery with the veriT proof format
2020-07-22 15:30:40 (GMT)
@Mathias Fleury I feel like I'm stuck with the eq order thing. Too many things break that I don't understand (esp. in the non-proof version). I'm not sure if the time investment is worth it. Maybe I should focus on other things (lia_generic, the other implicit things). Maybe we can find a way to work around this on the Isabelle side? (some normalization on a syntactic criterion?)
2020-07-22 15:37:20 (GMT)
I think I can live with that (but please, continue giving the '=' in LIA in the right order)
2020-09-03 18:3:48 (GMT)
I found something unexpected when trying to understand why cong is sometimes so slow
( @Hans-Jörg I let you triage my questions)
(anchor :step t4 :args ((:= ?v1 (c$ ?v0))))
...
(step t4 (cl (= (forall ((?v0 Body$) (?v1 Nat$) (?v2 Point_Point_nat_fun_fun$)) (=> (and (iOb$ ?v0) (= ?v1 (c$ ?v0))) (and (less$ zero$ ?v1) (forall ((?v3 Point$) (?v4 Point$)) (= (exists ((?v5 Body$)) (and (ph$ ?v5) (and (w$ ?v0 ?v5 ?v3) (w$ ?v0 ?v5 ?v4)))) (= (fun_app$ (fun_app$a space2$ ?v3) ?v4) (times$ (times$ (c$ ?v0) (c$ ?v0)) (fun_app$ (fun_app$a ?v2 ?v3) ?v4)))))))) (forall ((?v0 Body$) (?v2 Point_Point_nat_fun_fun$)) (=> (and (iOb$ ?v0) (= (c$ ?v0) (c$ ?v0))) (and (less$ zero$ (c$ ?v0)) (forall ((?v3 Point$) (?v4 Point$)) (= (exists ((?v5 Body$)) (and (ph$ ?v5) (and (w$ ?v0 ?v5 ?v3) (w$ ?v0 ?v5 ?v4)))) (= (fun_app$ (fun_app$a space2$ ?v3) ?v4) (times$ (times$ (c$ ?v0) (c$ ?v0)) (fun_app$ (fun_app$a ?v2 ?v3) ?v4)))))))))) :rule qnt_simplify)
Is that expected that v1 is mapped to a function application with a free variable?
2020-09-03 18:8:11 (GMT)
The reconstruction works currently, but this is mostly an accident
2020-09-03 18:19:1 (GMT)
Is that expected that v1 is mapped to a function application with a free variable?
No, whatever the context. :)
2020-09-03 18:22:10 (GMT)
What is the semantics of anchor in the new proof format again?
2020-09-03 18:58:31 (GMT)
anchor is the beginning a subproof
2020-09-03 19:3:49 (GMT)
I see. So in what you posted above a context is being defined in which v1 is mapped to c(v0), with v0 free?
2020-09-03 19:4:40 (GMT)
Exactly. I guess it should be ((?v0) (:= ?v1 (c$ ?v0)))but I am not certain.
2020-09-03 19:5:48 (GMT)
In proof version 2, it is the same:
(set .c4 (qnt_simplify :args ((:= ?v1 (c$ ?v0)))
(set .c1 (refl :conclusion ((= ?v1 (c$ ?v0)))))
2020-09-03 19:6:38 (GMT)
Yeah, your suggestion makes sense. When eliminating a variable depending on previous ones these previous ones should be part of the context as well
2020-09-03 19:56:49 (GMT)
Ah right. This should be not to difficult to triangulate when looking at the code that generates qnt_simplify
2020-09-04 5:10:33 (GMT)
Could you also put the forall_inst clausification in a new rule? I would like to see if I can optimize the reconstruction for it.
2020-09-04 8:21:3 (GMT)
yeah I will do that
2020-09-04 8:57:21 (GMT)
Can someone confirm that la_disequality is always ‹a = b ∨ ¬a ≤ b ∨ ¬b ≤ a›? And if so fix the --proof-format-and-exit?
2020-09-04 9:1:19 (GMT)
it is (I documented this yesterday for the proofonomicon ^^) should fix it in in --proof-format-and-exit too
2020-09-04 9:3:8 (GMT)
can you resend the link of that? I always forget where it is...
2020-09-04 9:7:17 (GMT)
it's in the proof-documentation branch. I plan to merge that branch as soon as I have documented all the rules. It is built as part of the CI pipeline too: https://gitlab.inria.fr/hschurr/verit-rmx/-/jobs/746049/artifacts/file/doc/proofonomicon/proofonomicon.pdf. I should add some deploy step that uploads this stuff somewhere at some point.
2020-09-04 9:10:29 (GMT)
I added a dedicated tmp_quantifier_cnf rule
2020-09-07 12:44:54 (GMT)
I think this should be fixed with 098169c31e76b868b185ebc5f6594799c76e1676
2020-09-07 12:45:5 (GMT)
(this being the qnt_simplify context.)
2020-09-07 13:9:19 (GMT)
My parser does support Not sure actually.((v0) (:= x y)) but not (v0 (:= x y)). Not sure what is intended here
2020-09-07 13:36:29 (GMT)
hm the code producec the second form, but I didn't change that for this commit
2020-09-07 13:38:20 (GMT)
I forgot to call some code that replaces v0 by (:= v0 v0)
2020-09-07 13:38:34 (GMT)
I am now fighting with the coefficients...
2020-09-07 13:41:47 (GMT)
oh no :( Maybe I'm wrong? I found it hard to get my head around this and even now I'm in doubt.
2020-09-07 13:45:33 (GMT)
currently it is strange because some cases work and not others
2020-09-07 13:59:16 (GMT)
hm do you have a small example that doesn't work?
2020-09-07 14:7:55 (GMT)
Either I misunderstand you change, or you are randomly changing coefficients
2020-09-07 14:56:22 (GMT)
Fixed now (I was missing some simp rules)