Abstract
The considerable mathematical knowledge encoded by the Flyspeck project is combined with external automated theorem provers (ATPs) and machine-learning premise selection methods trained on the Flyspeck proofs, producing an AI system capable of proving a wide range of mathematical conjectures automatically. The performance of this architecture is evaluated in a bootstrapping scenario emulating the development of Flyspeck from axioms to the last theorem, each time using only the previous theorems and proofs. It is shown that 39 % of the 14185 theorems could be proved in a push-button mode (without any high-level advice and user interaction) in 30 seconds of real time on a fourteen-CPU workstation. The necessary work involves: (i) an implementation of sound translations of the HOL Light logic to ATP formalisms: untyped first-order, polymorphic typed first-order, and typed higher-order, (ii) export of the dependency information from HOL Light and ATP proofs for the machine learners, and (iii) choice of suitable representations and methods for learning from previous proofs, and their integration as advisors with HOL Light. This work is described and discussed here, and an initial analysis of the body of proofs that were found fully automatically is provided.
Article PDF
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
References
Adams, M.: Introducing HOL Zero- (extended abstract). In: Fukuda, K., van der Hoeven, J., Joswig, M., Takayama, N. (eds.) ICMS (LNCS), vol. 6327, pp 142–143. LNCS, Springer (2010)
Akbarpour, B., Paulson, L.C.: MetiTarski: an automatic theorem prover for real-valued special functions. J. Autom. Reason. 44(3), 175–205 (2010)
Alama, J., Brink, K., Mamane, L., Urban, J.: Large formal wikis: issues and solutions. In: Davenport, J.H., Farmer, W.M., Urban, J., Rabe, F. (eds.) Calculemus/MKM, vol. 6824, pp 133–148. LNCS, Springer (2011)
Alama, J., Heskes, T., Kühlwein, D., Tsivtsivadze, E., Urban, J.: Premise selection for mathematics by corpus analysis and kernel methods. J. Autom. Reason. 52(2), 191–213 (2014)
Alama, J., Kühlwein, D., Urban, J.: Automated and human proofs in general mathematics: an initial comparison. Bjrner, N., Voronkov, A. (eds.) Logic for Programming, Artificial Intelligence, and Reasoning - 18th International Conference, LPAR-18, Merida, Venezuela, March 11-15, 2012. Proceedings, vol. 7180, pp. 37–45. LNCS, Springer
Armando, A., Baumgartner, P., Dowek, G.: Automated Reasoning, 4th International Joint Conference, IJCAR 2008, Sydney, Australia, 12–15 August 2008. Proceedings, vol. 5195. LNCS, Springer (2008)
Autexier, S., Calmet, J., Delahaye, D., Ion, P.D.F., Rideau, L., Rioboo, R., Sexton, A.P. (eds.): Intelligent Computer Mathematics, 10th International Conference, AISC 2010, 17th Symposium, Calculemus 2010, and 9th International Conference, MKM 2010, Paris, France, July 5-10, 2010. Proceedings, vol. 6167. LNCS, Springer (2010)
Beeson, M.: Automatic derivation of the irrationality of e. J. Symb. Comput. 32(4), 333–349 (2001)
Belinfante, J. G. F.: On computer-assisted proofs in ordinal number theory. J. Autom. Reason. 22(2), 341–378 (1999)
Benzmüller, C., Paulson, L.C., Theiss, F., Fietzke, A.: LEO-II - a cooperative automatic theorem prover for classical higher-order logic (system description). In: Armando, A., Baumgartner, P., Dowek, G., (eds.) Automated Reasoning, 4th International Joint Conference, IJCAR 2008, Sydney, Australia, August 12-15, 2008. Proceedings, vol. 5195, pp. 162–170. LNCS, Springer (2008)
Bjørner, N., Sofronie-Stokkermans, V. (eds.): Automated Deduction - CADE-23 - 23rd International Conference on Automated Deduction, Wroclaw, Poland, 31 July–5 August 2011. Proceedings, vol. 6803. LNCS, Springer (2011)
Bjørner, N., Voronkov, A. (eds.): Logic for Programming, Artificial Intelligence, and Reasoning - 18th International Conference, LPAR-18, Mérida, Venezuela, 11–15 March 2012. Proceedings, vol. 7180. LNCS, Springer (2012)
Blanchette, J. C.: Automatic Proofs and Refutations for Higher-Order Logic. PhD thesis, Fakultät für Informatik, Technische Universität München (2012). http://www21.in.tum.de/blanchet/phdthesis.pdf.
Blanchette, J.C., Böhme, S., Paulson, L.C.: Extending Sledgehammer with SMT solvers. In: Bjørner, N., Sofronie-Stokkermans, V. (eds.) Automated Deduction - CADE-23 - 23rd International Conference on Automated Deduction, Wroclaw, Poland, July 31 - August 5, 2011. Proceedings, vol. 6803, pp. 116–130. LNCS, Springer (2011)
Blanchette, J.C., Böhme, S., Popescu, A., Smallbone, N.: Encoding monomorphic and polymorphic types. In: Piterman, N., Smolka, S.A. (eds.) TACAS, Lecture Notes in Computer Science, vol. 7795, pp. 493–507. Springer (2013)
Blanchette, J.C., Bulwahn, L., Nipkow, T.: Automatic proof and disproof in Isabelle/HOL. In: Tinelli, C., Sofronie-Stokkermans, V. (eds.) FroCoS, vol. 6989, pp. 12–27. LNCS, Springer (2011)
Blanchette, J.C., Paskevich, A.: TFF1: the TPTP typed first-order form with rank-1 polymorphism. In: Bonacina, M.P. (ed.) CADE, Lecture Notes in Computer Science, vol. 7898, pp. 414–420. Springer (2013)
Blanchette, J.C., Popescu, A., Wand, D., Weidenbach, C.: More SPASS with Isabelle - superposition with hard sorts and configurable simplification. In: Beringer, L., Felty, A.P. (eds.) ITP, vol. 7406, pp. 345–360. LNCS, Springer (2012)
Böhme, S., Nipkow, T.: Sledgehammer: judgement day. In: Giesl, J., Hähnle, R. (eds.) IJCAR, vol. 6173, pp. 107–121. LNCS, Springer (2010)
Brown, C.E.: Satallax: an automatic higher-order prover. In: Gramlich, B., Miller, D., Sattler, U. (eds.) Automated Reasoning - 6th International Joint Conference, IJCAR 2012, Manchester, UK, June 26-29, 2012. Proceedings, vol. 7364, pp. 111–117. LNCS, Springer (2012)
Carlson, A., Cumby, C., Rosen, J., Roth, D.: The SNoW Learning Architecture, vol. 5 Technical Report UIUCDCS-R-99-2101, UIUC Computer Science Department (1999)
Church, A.: A formulation of the simple theory of types. J. Symb. Log. 5, 56–68 (1940)
Dahn, I.: Interpretation of a Mizar-like logic in first-order logic. In: Caferra, R., Salzer, G. (eds.) FTP (LNCS Selection), vol. 1761, pp. 137–151. LNCS, Springer (1998)
Dahn, I., Wernhard, C.: First order proof problems extracted from an article in the MIZAR Mathematical Library. In: Bonacina, M.P., Furbach, U. (eds.) International Workshop on First-Order Theorem Proving (FTP’97), RISC-Linz Report Series No. 97-50, pp. 58–62. Johannes Kepler Universität, Linz (Austria) (1997)
de Moura, L.M., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS, vol. 4963, pp. 337–340. LNCS, Springer (2008)
Filliâtre, J.-C., Marché, C.: The Why/Krakatoa/Caduceus platform for deductive program verification. In: Damm, W., Hermanns, H. (eds.) CAV, vol. 4590, pp. 173–177. LNCS, Springer (2007)
Furbach, U., Shankar, N. (eds.): Automated Reasoning, Third International Joint Conference, IJCAR 2006, Seattle, WA, USA, August 17-20 2006. Proceedings, vol. 4130. LNCS, Springer (2006)
Van Gelder, A., Sutcliffe, G.: Extending the TPTP language to higher-order logic with automated parser generation. In: Furbach, U., Shankar, N. (eds.) Automated Reasoning, Third International Joint Conference, IJCAR 2006, Seattle, WA, USA, August 17-20, 2006, Proceedings, vol. 4130, pp. 156–161. LNCS, Springer (2006)
Genesereth, M.R., Fikes, R. E.: Knowledge Interchange Format, Version 3.0 Reference Manual. Technical Report Logic-92-1, Computer Science Department, Stanford University (1992)
Gramlich, B., Miller, D., Sattler, U. (eds.): Automated Reasoning - 6th International Joint Conference, IJCAR 2012, Manchester, UK, 26–29 June 2012. Proceedings, vol. 7364. LNCS, Springer (2012)
Hähnle, R., Kerber, M., Weidenbach, C.: Common Syntax of the DFGSchwerpunktprogramm Deduction. Technical Report TR 10/96, Fakultät für Informatik, Universität Karlsruhe, Karlsruhe, Germany (1996)
Hales, T.C.: Introduction to the Flyspeck project. In: Coquand, T., Lombardi, H., Roy, M.-F. (eds.) Mathematics, Algorithms, Proofs, number 05021 in Dagstuhl Seminar Proceedings, pp. 1–11. Dagstuhl, Germany (2006). Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI), Schloss Dagstuhl, Germany
Hales, T. C. Mathematics in the age of the Turing machine. Lecture Notes in Logic (2012), to appear. http://www.math.pitt.edu/thales/papers/turing.pdf
Hales, T. C., Harrison, J., McLaughlin, S., Nipkow, T., Obua, S., Zumkeller, R.: A revision of the proof of the Kepler Conjecture. Discret. Comput. Geom. 44(1), 1–34 (2010)
Harrison, J.: HOL light: a tutorial introduction. In: Srivas, M.K., Camilleri, A.J. (eds.) FMCAD, vol. 1166, pp. 265–269. LNCS, Springer (1996)
Harrison, J.: A Mizar mode for HOL. In: von Wright, J., Grundy, J., Harrison, J. (eds.) TPHOLs, vol. 1125, pp. 203–220. LNCS, Springer (1996)
Harrison, J.: Optimizing proof search in model elimination. In: McRobbie, M., Slaney, J.K. (eds.) Proceedings of the 13th International Conference on Automated Deduction, vol. 1104, pp. 313–327. LNAI, Springer-Verlag (1996)
Hindley, R.: The principal type-scheme of an object in combinatory logic. Trans. Am. Math. Soc. 146, 29–60 (1969)
Hoder, K., Voronkov, A.: Sine qua non for large theory reasoning. In: Bjrner, N., Sofronie-Stokkermans, V. (eds.) Automated Deduction - CADE-23 - 23rd International Conference on Automated Deduction, Wroclaw, Poland, July 31 - August 5, 2011. Proceedings, vol. 6803, pp. 299–314. LNCS, Springer (2011)
Hurd, J.: Integrating Gandalf and HOL. In: Bertot, Y., Dowek, G., Hirschowitz, A., Paulin, C., Théry, L. (eds.) TPHOLs, vol. 1690, pp. 311–322. LNCS, Springer (1999)
Hurd, J.: An LCF-style interface between HOL and first-order logic. In: Voronkov, A. (ed.) CADE, Lecture Notes in Computer Science, vol. 2392, pp. 134–138. LNCS, Springer (2002)
Hurd, J.: First-order proof tactics in higher-order logic theorem provers. In: Archer, M., Di Vito, B., Muñoz, C. (eds.) Design and Application of Strategies/Tactics in Higher Order Logics (STRATA 2003), number NASA/CP-2003-212448 in NASA Technical, Reports, pp. 56–68, September (2003)
Kaliszyk, C.: Web interfaces for proof assistants. Electr. Notes Theor. Comput. Sci. 174(2), 49–61 (2007)
Kaliszyk, C., Krauss, A.: Scalable LCF-style proof translation. In: Blazy, S., Paulin-Mohring, C., Pichardie, D. (eds.) Proceedings of the 4th International Conference on Interactive Theorem Proving (ITP’13), vol. 7998, pp. 51–66. LNCS, Springer Verlag (2013)
Kaliszyk, C., Urban, J.: Initial experiments with external provers and premise selection on HOL Light corpora. In: Fontaine, P., Schmidt, R.A., Schulz, S. (eds.) PAAR-2012, EPiC Series, vol. 21, pp. 72–81. EasyChair (2013)
Kepler, J.: Strena seu de nive sexangula. Frankfurt: Gottfried. Tampach, 1611
Kovács, L., Voronkov, A.: First-order theorem proving and Vampire. In: Sharygina, N., Veith, H. (eds.) CAV, of Lecture Notes in Computer Science, vol. 8044, pp. 1–35. Springer (2013)
Kuehlwein, D., Urban, J.: Learning from multiple proofs: first experiments. In: Fontaine, P., Schmidt, R.A., Schulz, S. (eds.) PAAR-2012, volume 21 of EPiC Series, vol. 21, pp. 82–94. EasyChair (2013)
Kühlwein, D., van Laarhoven, T., Tsivtsivadze, E., Urban, J., Heskes, T.: Overview and evaluation of premise selection techniques for large theory mathematics. In: Gramlich, B., Miller, D., Sattler, U. (eds.) Automated Reasoning - 6th International Joint Conference, IJCAR 2012, Manchester, UK, June 26-29, 2012. Proceedings, vol. 7364, pp. 378–392. LNCS, Springer (2012)
Loveland, D. W.: Mechanical theorem proving by model elimination. J. ACM 15(2), 236–251 (1968)
Loveland, D. W.: Automated Theorem Proving: A Logical Basis. North-Holland, Amsterdam (1978)
McAllester, D. A.: Ontic: A Knowledge Representation System for Mathematics. MIT Press, Cambridge (1989)
McCune, W.: Prover9 and Mace4. http://www.cs.unm.edu/mccune/prover9/, (2005–2010)
McCune, W., Matlin, O.S.: Ivy: a preprocessor and proof checker for first-order logic. In: Kaufmann, M., Manolios, P., Strother Moore, J. (eds.) Computer-Aided Reasoning: ACL2 Case Studies, in Advances in Formal Methods, vol. 4, pp. 265–282. Kluwer Academic Publishers (2000)
McCune, W., Wos, L.: Otter: the CADE-13 competition incarnations. J. Autom. Reason. 18(2), 211–220 (1997)
Meier, A.: System description: tramp: transformation of machine-found proofs into nd-proofs at the assertion level. In: McAllester, D.A. (ed.) CADE, vol. 1831, pp. 460–464. LNCS, Springer (2000)
Meng, J., Paulson, L. C.: Translating higher-order clauses to first-order clauses. J. Autom. Reason. 40(1), 35–60 (2008)
Obua, S., Skalberg, S.: Importing hol into Isabelle/HOL. In: Furbach, U., Shankar, N. (eds.) Automated Reasoning, Third International Joint Conference, IJCAR 2006, Seattle, WA, USA, August 17-20, 2006, vol. 4130, pp. 298–302. LNCS, Springer (2006)
Paulson, L. C.: A generic tableau prover and its integration with Isabelle. J. UCS 5(3), 73–87 (1999)
Paulson, L.C., Blanchette, J.: Three years of experience with Sledgehammer, a practical link between automated and interactive theorem provers. In: 8th IWIL (2010). Invited talk.
Paulson, L.C., Susanto, K.W.: Source-level proof reconstruction for interactive theorem proving. In: Schneider, K., Brandt, J. (eds.) TPHOLs, vol. 4732, pp. 232–245. LNCS, Springer (2007)
Pease, A., Sutcliffe, G.: First order reasoning on a large ontology. In: Sutclie, G., Urban, J., Schulz, S. (eds.) Proceedings of the CADE-21 Workshop on Empirically Successful Automated Reasoning in Large Theories, Bremen, Germany, 17th July 2007, CEUR Workshop, vol. 257. CEUR-WS.org (2007)
Pitts, A.: The HOL logic. In: Gordon, M.J.C., Melham, T.F. (eds.) Introduction to HOL: A Theorem Proving Environment for Higher Order Logic. Cambridge University Press (1993)
Poincaré, H.: The Foundations of Science: Science and Hypothesis, The Value of Science, Science and Method. The Science Press, New York (1913)
Prevosto, V., Waldmann, U.: SPASS+T. In: Sutcliffe, G., Schmidt, R., Schulz, S. (eds.) ESCoR 2006, CEUR, vol. 192, pp. 18–33 (2006)
Quaife, A.: Automated Development of Fundamental Mathematical Theories. Kluwer Academic Publishers (1992)
Rudnicki, P., Trybulec, A.: On the integrity of a repository of formalized mathematics. In: Asperti, A., Buchberger, B., Davenport, J.H. (eds.) MKM, vol. 2594, pp. 162–174. LNCS, Springer (2003)
Schulz, S.: E - A Brainiac Theorem Prover. AI Commun. 15(2–3), 111–126 (2002)
Slind, K., Gordon, M.J.C., Boulton, R.J., Bundy, A.: System description: an interface between CLAM and HOL. In: Kirchner, C., Kirchner, H. (eds.) CADE, vol. 1421, pp. 134–138. LNCS, Springer (1998)
Sutcliffe, G., Benzmüller, C.: Automated reasoning in higher-order logic using the TPTP THF infrastructure. J. Formal. Reason. 3(1), 1–27 (2010)
Sutcliffe, G., Schulz, S., Claessen, K., Baumgartner, P.: The TPTP typed first-order form with arithmetic. In: Bjrner, N., Voronkov, A. (eds.) Logic for Programming, Artificial Intelligence, and Reasoning - 18th International Conference, LPAR-18, Merida, Venezuela, March 11-15, 2012. Proceedings, vol. 7180, pp. 406–419. LNCS, Springer (2012)
Sutcliffe, G., Schulz, S., Claessen, K., Van Gelder, A.: Using the TPTP language for writing derivations and finite interpretations. In: Furbach, U., Shankar, N. (eds.) Automated Reasoning, Third International Joint Conference, IJCAR 2006, Seattle, WA, USA, August 17-20, 2006. Proceedings, vol. 4130, pp. 67–81. LNCS, Springer (2006)
Sutcliffe, G., Urban, J., Schulz, S. (eds.): Proceedings of the CADE-21 Workshop on Empirically Successful Automated Reasoning in Large Theories, Bremen, Germany, 17th July 2007, CEUR Workshop Proceedings, vol. 257. CEUR-WS.org (2007)
Urban, J.: Translating Mizar for first order theorem provers. In: MKM, vol. 2594, pp. 203–215. LNCS, Springer (2003)
Urban, J.: MPTP - motivation, implementation, first experiments. J. Autom. Reason. 33(3-4), 319–339 (2004)
Urban, J.: MoMM - fast interreduction and retrieval in large libraries of formalized mathematics. Int. J. Artif. Intell. Tools 15(1), 109–130 (2006)
Urban, J.: MPTP 0.2: design, implementation, and initial experiments. J. Autom. Reason. 37(1–2), 21–43 (2006)
Urban, J.: An overview of methods for large-theory automated theorem proving (Invited Paper). In: Höfner, P., McIver, A., Struth, G. (eds.) ATE Workshop, CEUR Workshop Proceedings, vol. 760, pp. 3–8. CEUR-WS.org (2011)
Urban, J.: BliStr: The Blind Strategymaker. CoRR (2013). arXiv: 1301.2683
Urban, J., Rudnicki, P., Sutcliffe, G.: ATP and presentation service for Mizar formalizations. J. Autom. Reason. 50, 229–241 (2013)
Urban, J., Sutcliffe, G.: Automated reasoning and presentation support for formalizing mathematics in Mizar. In: Autexier, S., Calmet, J., Delahaye, D., Ion, P.D.F., Rideau, L., Rioboo, R., Sexton, A.P. (eds.) Intelligent Computer Mathematics, 10th International Conference, AISC 2010, 17th Symposium, Calculemus 2010, and 9th International Conference, MKM 2010, Paris, France, July 5-10, 2010. Proceedings, vol. 6167, pp. 132–146. LNCS, Springer (2010)
Urban, J., Sutcliffe, G., Pudlák, P., Vyskočil, J.: MaLARea SG1 - Machine Learner for Automated Reasoning with Semantic Guidance. In: Armando, A., Baumgartner, P., Dowek, G. (eds.) Automated Reasoning, 4th International Joint Conference, IJCAR 2008, Sydney, Australia, August 12-15, 2008. Proceedings, vol. 5195, pp. 441–456. LNCS, Springer (2008)
Urban, J., Vyskočil, J.: Theorem proving in large formal mathematics as an emerging AI field. In: Bonacina, M.P., Stickel, M.E. (eds.) Automated reasoning and mathematics: essays in memory of William McCune, vol. 7788, pp. 240–257. LNAI, Springer (2013)
Urban, J., Vyskočil, J., Štěpánek, P.: MaLeCoP: Machine learning connection prover. In: Brünnler, K., Metcalfe, G. (eds.) TABLEAUX, vol. 6793, pp. 263–277. LNCS, Springer (2011)
Wall, L.: Programming Perl, 3rd edn. O’Reilly & Associates, Inc., Sebastopol (2000)
Weidenbach, C., Afshordel, B., Brahm, U., Cohrs, C., Engel, T., Keen, E., Theobalt, C., Topić, D.: System description: SPASS version 1.0.0. In: Automated Deduction - CADE-16, vol. 1632, pp. 378–382. LNCS, Springer Berlin Heidelberg (1999)
Weidenbach, C., Dimova, D., Fietzke, A., Kumar, R., Suda, M., Wischnewski, P.: SPASS Version 3.5. In: Schmidt, R.A. (ed.) CADE, vol. 5663, pp. 140–145. LNCS, Springer (2009)
Wiedijk, F.: Estimating the Cost of a Standard Library for a Mathematical Proof Checker. http://www.cs.ru.nl/freek/notes/mathstdlib2.pdf (2001)
Wiedijk, F.: A synthesis of the procedural and declarative styles of interactive theorem proving, Vol. 8 (2012)
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
About this article
Cite this article
Kaliszyk, C., Urban, J. Learning-Assisted Automated Reasoning with Flyspeck. J Autom Reasoning 53, 173–213 (2014). https://doi.org/10.1007/s10817-014-9303-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10817-014-9303-3