Artificial intelligence
=== Mathematics ===
Large language models, such as [[GPT-4]], [[Gemini (chatbot)|Gemini]], [[Claude (language model)|Claude]], [[Llama (language model)|Llama]] or [[Mistral AI|Mistral]], are increasingly used in mathematics. These probabilistic models are versatile, but can also produce wrong answers in the form of [[Hallucination (artificial intelligence)|hallucinations]]. They often require a large database of mathematical problems to learn from, together with methods such as [[Supervised learning|supervised]] [[Fine-tuning (deep learning)|fine-tuning]]<ref>{{Cite journal |date=2024 |title=ReFT: Representation Finetuning for Language Models |journal=NeurIPS |arxiv=2404.03592 |last1=Wu |first1=Zhengxuan |last2=Arora |first2=Aryaman |last3=Wang |first3=Zheng |last4=Geiger |first4=Atticus |last5=Jurafsky |first5=Dan |last6=Manning |first6=Christopher D. |last7=Potts |first7=Christopher }}</ref> or [[Statistical classification|classifiers]] trained on human-annotated data, to improve answers to new problems and learn from corrections.<ref>{{Cite web |date=2023-05-31 |title=Improving mathematical reasoning with process supervision |url=https://openai.com/index/improving-mathematical-reasoning-with-process-supervision/ |access-date=2025-01-26 |website=OpenAI |language=en-US}}</ref> A February 2024 study showed that some language models performed poorly when solving math problems not included in their training data, even when the problems deviated only slightly from that data.<ref>{{Cite arXiv |eprint=2402.19450 |class=cs.AI |first=Saurabh |last=Srivastava |title=Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap |date=2024-02-29}}</ref> One technique to improve their performance involves training the models to produce correct [[Automated reasoning|reasoning]] steps, rather than just the correct result.<ref>{{cite arXiv |eprint=2305.20050v1 |class=cs.LG |first1=Hunter |last1=Lightman |first2=Vineet |last2=Kosaraju |title=Let's Verify Step by Step |date=2023 |last3=Burda |first3=Yura |last4=Edwards |first4=Harri |last5=Baker |first5=Bowen |last6=Lee |first6=Teddy |last7=Leike |first7=Jan |last8=Schulman |first8=John |last9=Sutskever |first9=Ilya |last10=Cobbe |first10=Karl}}</ref> The [[Alibaba Group]] developed a version of its ''[[Qwen]]'' models called ''Qwen2-Math'', which achieved state-of-the-art performance on several mathematical benchmarks, including 84% accuracy on the MATH dataset of competition mathematics problems.<ref name="VentureBeat 8 August 2024">{{cite web |last1=Franzen |first1=Carl |title=Alibaba claims no. 1 spot in AI math models with Qwen2-Math |url=https://venturebeat.com/ai/alibaba-claims-no-1-spot-in-ai-math-models-with-qwen2-math/ |website=VentureBeat |date=2024-08-08|access-date=2025-02-16}}</ref> In January 2025, Microsoft proposed the technique ''rStar-Math'', which leverages [[Monte Carlo tree search]] and step-by-step reasoning, enabling a relatively small language model like ''Qwen-7B'' to solve 53% of the [[American Invitational Mathematics Examination|AIME]] 2024 problems and 90% of the MATH benchmark problems.<ref>{{Cite web |last=Franzen |first=Carl |date=2025-01-09 |title=Microsoft's new rStar-Math technique upgrades small models to outperform OpenAI's o1-preview at math problems |url=https://venturebeat.com/ai/microsofts-new-rstar-math-technique-upgrades-small-models-to-outperform-openais-o1-preview-at-math-problems/ |access-date=2025-01-26 |website=VentureBeat |language=en-US}}</ref>

Alternatively, dedicated models for mathematical problem solving, which aim for higher precision in their results and include theorem proving, have been developed, such as ''AlphaTensor'', ''[[AlphaGeometry]]'' and ''AlphaProof'', all from [[Google DeepMind]],<ref>{{Cite web |last=Roberts |first=Siobhan |date=July 25, 2024 |title=AI achieves silver-medal standard solving International Mathematical Olympiad problems |url=https://www.nytimes.com/2024/07/25/science/ai-math-alphaproof-deepmind.html |access-date=2024-08-07 |website=[[The New York Times]] |archive-date=26 September 2024 |archive-url=https://web.archive.org/web/20240926131402/https://www.nytimes.com/2024/07/25/science/ai-math-alphaproof-deepmind.html |url-status=live }}</ref> ''Llemma'' from [[EleutherAI]]<ref>{{Cite web |last1=Azerbayev |first1=Zhangir |last2=Schoelkopf |first2=Hailey |last3=Paster |first3=Keiran |last4=Santos |first4=Marco Dos |last5=McAleer |first5=Stephen |last6=Jiang |first6=Albert Q. |last7=Deng |first7=Jia |last8=Biderman |first8=Stella |last9=Welleck |first9=Sean |date=2023-10-16 |title=Llemma: An Open Language Model For Mathematics |url=https://blog.eleuther.ai/llemma/ |access-date=2025-01-26 |website=EleutherAI Blog |language=en}}</ref> and ''Julius''.<ref>{{Cite web |title=Julius AI |url=https://julius.ai/home/ai-math |access-date= |website=julius.ai |language=en}}</ref> When natural language is used to describe mathematical problems, converters can transform such prompts into a formal language such as [[Lean (proof assistant)|Lean]] to define mathematical tasks. Some models have been developed to solve challenging problems and reach good results in benchmark tests, while others serve as educational tools in mathematics.<ref>{{Cite web |last=McFarland |first=Alex |date=2024-07-12 |title=8 Best AI for Math Tools (January 2025) |url=https://www.unite.ai/best-ai-for-math-tools/ |access-date=2025-01-26 |website=Unite.AI |language=en-US}}</ref> [[Topological deep learning]] integrates various [[topology|topological]] approaches.
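
As an illustration of what a translation from natural language into Lean might look like, the sketch below formalizes the informal statement "the sum of two even natural numbers is even" in Lean 4. It is a hand-written example and not the output of any system named above. The theorem name <code>even_add_even</code> and the choice to express evenness as <code>∃ m, a = 2 * m</code> are assumptions made only for this illustration.

```lean
-- Informal prompt: "The sum of two even natural numbers is even."
-- Hypothetical formalization: evenness is written out as an existential
-- statement so that the example needs no external library.
theorem even_add_even (a b : Nat)
    (ha : ∃ m, a = 2 * m) (hb : ∃ n, b = 2 * n) :
    ∃ k, a + b = 2 * k :=
  match ha, hb with
  -- From a = 2 * m and b = 2 * n, the witness for a + b is m + n.
  | ⟨m, hm⟩, ⟨n, hn⟩ => ⟨m + n, by rw [hm, hn, Nat.left_distrib]⟩
```

Once a problem is stated in this form, the proof assistant checks every step mechanically, so an incorrect reasoning step is rejected instead of being accepted as a plausible-sounding answer.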