AlphaProof and AlphaGeometry 2 Solve Advanced Math Problems
In a new blog, Google DeepMind introduced two new math-focused AI models to the world: AlphaProof and AlphaGeometry 2. According to Google, these models have successfully solved four out of six problems from this year’s International Mathematical Olympiad, achieving a performance equivalent to a silver medalist.
The IMO: A Prestigious Benchmark
The IMO, held annually since 1959, is the world’s oldest and most prestigious competition for young mathematicians. It presents six exceptionally difficult problems in algebra, combinatorics, geometry, and number theory.
Elite pre-college mathematicians, who often train for thousands of hours, compete in this rigorous contest. Many Fields Medal winners, one of the highest honors for mathematicians, have participated in the IMO.
This competition has recently become a significant challenge for machine learning, serving as a benchmark for evaluating an AI system’s advanced mathematical reasoning capabilities. This year, Google’s combined AI system tackled the competition problems, which were scored by prominent mathematicians, including Prof. Sir Timothy Gowers and Dr. Joseph Myers.
AlphaProof’s Reasoning
AlphaProof is a groundbreaking system that trains itself to prove mathematical statements in the formal language Lean. By integrating a pre-trained language model with the AlphaZero reinforcement learning algorithm, AlphaProof generates solution candidates and proves or disproves them through a search process in Lean.
Each verified proof reinforces the model, enhancing its problem-solving capabilities. Google trained AlphaProof for the IMO by proving or disproving millions of problems, spanning various difficulties and mathematical topics.
This rigorous training, conducted over weeks leading up to the competition, enabled AlphaProof to solve two algebra problems and one number theory problem, including the competition’s most challenging problem.
AlphaGeometry 2
AlphaGeometry 2 is an improved version of our previous geometry-solving system. It is a neuro-symbolic hybrid system based on the Gemini language model, trained on a significantly larger dataset. This allowed AlphaGeometry 2 to tackle more complex geometry problems, including those involving movements of objects and equations of angles, ratios, or distances.
The symbolic engine of AlphaGeometry 2 is two orders of magnitude faster than its predecessor. A novel knowledge-sharing mechanism allows advanced combinations of different search trees to solve more complex problems.
Before this year’s competition, AlphaGeometry 2 had already solved 83% of all historical IMO geometry problems from the past 25 years, a significant improvement from its predecessor’s 53% success rate.
New Frontiers in Mathematical Reasoning
In addition to these models, the team at DeepMind experimented with a natural language reasoning system built upon Gemini. This system does not require the translation of problems into a formal language, offering a promising alternative approach to advanced problem-solving.
Concluding their announcement blog, the team stated that they will continue to explore multiple AI approaches to advance mathematical reasoning. They also announced that they planned on release more technical details on AlphaProof in the near future.