zulip log viewer

Jasmin Blanchette

2019-04-03 16:20:43 (GMT)

Some potentially interesting sources of benchmarks, beyond the TPTP library and Sledgehammer (/ Judgment Day), all produced by Chad Brown and sometimes some colleagues of his:

GRUNGE:
https://arxiv.org/pdf/1903.02539.pdf

Grundlagen:
http://grid01.ciirc.cvut.cz/~chad/grundlagen-thf-ltb.tgz
http://grid01.ciirc.cvut.cz/~chad/grundlagen-thf.tar.7z

Mizar:
http://grid01.ciirc.cvut.cz/~chad/mizar-thf-ltb.tgz
http://grid01.ciirc.cvut.cz/~chad/mptp_thf.tgz

Please make sure to acknowledge/thank Chad (and, for GRUNGE, his coauthors) if you use these.

P.S. Unfortunately Chad is hard to contact, but he can be reached among others through snail mail.

Haniel Barbosa

2019-04-03 16:24:57 (GMT)

Do you know if these are going to be included into TPTP? On that note, did you or do you plan to submit the JD benchmarks into TPTP? It'd be good to diversify the THF library

Jasmin Blanchette

2019-04-03 16:26:23 (GMT)

We submitted some JD problems on several occasions but Geoff typically takes only a small subset of what I send him. He's trying to keep some balance or something.

Jasmin Blanchette

2019-04-03 16:26:57 (GMT)

But I was thinking of releasing a new, much larger and more diverse set of benchmarks and call it Halloween maybe, to replace Judgment Day.

Haniel Barbosa

2019-04-03 16:27:33 (GMT)

These would include that you generated fro Benzmueller?

Simon Cruanes

2019-04-03 16:27:37 (GMT)

HOLowWins? :thinking:

Jasmin Blanchette

2019-04-03 16:29:5 (GMT)

They would not include de Benz ones. They would be selected purely randomly from all the AFP (Archive of Formal Proof), the library of user contributions. And they would exist in several versions (SMT-LIB 2, SMT-LIB 3, TH1, TH0, TF1, TF0).

Haniel Barbosa

2019-04-03 16:29:5 (GMT)

This new library could contain versions in THF and in HOSMT, that'd be good

Jasmin Blanchette

2019-04-03 16:29:16 (GMT)

Yes exactly.

Haniel Barbosa

2019-04-03 16:30:16 (GMT)

Selecting from AFP is different than selecting from the Isabelle theories?

Haniel Barbosa

2019-04-03 16:30:27 (GMT)

I don't know much about all that :)

Jasmin Blanchette

2019-04-03 16:30:38 (GMT)

Simon: I live in HOL-land. And I'm an alco-HOL-ic. Compared with that, any HOL pun is going to be lame. ;)

HOLy soit qui mal y pense.

Jasmin Blanchette

2019-04-03 16:32:18 (GMT)

The Benz benchmark were also from the AFP, but I don't know what I did to select them (pure randomness, or followed some hunch).

I want a true representative sample of the AFP, and avoid having too many problems coming from the same theory file (for more variety). Anyway, it doesn't matter; the point is simply that I should at some point sit down and do this, or better find somebody who's willing to do it. ;)

Haniel Barbosa

2019-04-03 16:40:34 (GMT)

I see

Benchmark sets

Jasmin Blanchette

Haniel Barbosa

Jasmin Blanchette

Jasmin Blanchette

Haniel Barbosa

Simon Cruanes

Jasmin Blanchette

Haniel Barbosa

Jasmin Blanchette

Haniel Barbosa

Haniel Barbosa

Jasmin Blanchette

Jasmin Blanchette

Haniel Barbosa