
Apple researchers question AI’s reasoning ability in mathematics

New Delhi, Oct 12 (IANS) A team of Apple researchers has questioned the formal reasoning capabilities of large language models (LLMs), particularly in mathematics.

They found that LLMs exhibit noticeable variance when responding to different instantiations of the same question.

Literature suggests that the reasoning process in LLMs is probabilistic pattern-matching rather than formal reasoning.

Although LLMs can match more abstract reasoning patterns, they fall short of true logical reasoning. Small changes in input tokens can drastically alter model outputs, indicating a strong token bias and suggesting that these models are highly sensitive and fragile.

“Additionally, in tasks requiring the correct selection of multiple tokens, the probability of arriving at an accurate answer decreases exponentially with the number of tokens or steps involved, underscoring their inherent unreliability in complex reasoning scenarios,” said Apple researchers in their paper titled “GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models.”
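The exponential claim follows from simple probability: if a correct answer requires n independent steps and each step succeeds with probability p, the chance of getting everything right is p to the power n. A minimal illustration (the numbers here are hypothetical, not taken from the paper):

```python
# Illustrative arithmetic only: if each of n independent steps succeeds
# with probability p, the chance of a fully correct answer is p ** n,
# which shrinks exponentially as n grows.
def chance_all_correct(p: float, n: int) -> float:
    return p ** n

for n in (1, 5, 10, 20):
    print(f"p=0.95, steps={n}: {chance_all_correct(0.95, n):.3f}")
# Even at 95% per-step accuracy, 20 steps leave only ~36% overall.
```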

The ‘GSM8K’ benchmark is widely used to assess the mathematical reasoning of models on grade-school level questions.

While the performance of LLMs on GSM8K has significantly improved in recent years, it remains unclear whether their mathematical reasoning capabilities have genuinely advanced, raising questions about the reliability of the reported metrics.

To address these concerns, the researchers conducted a large-scale study on several state-of-the-art open and closed models.

“To overcome the limitations of existing evaluations, we introduce GSM-Symbolic, an improved benchmark created from symbolic templates that allow for the generation of a diverse set of questions,” the authors wrote.

GSM-Symbolic enables more controllable evaluations, providing key insights and more reliable metrics for measuring the reasoning capabilities of models.
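The idea behind symbolic templates can be sketched as follows: the wording of a question is fixed while the names and numbers are re-sampled, so a model can be tested on many instantiations of what is logically the same problem. This is a hypothetical sketch in the spirit of GSM-Symbolic, not code from the paper; the template, names, and values are illustrative.

```python
import random

# Hypothetical symbolic template: placeholders {name}, {a}, {b} are
# re-sampled on each instantiation, while the reasoning required to
# answer stays the same.
TEMPLATE = ("{name} picks {a} apples on Monday and {b} apples on Tuesday. "
            "How many apples does {name} have in total?")

def instantiate(seed: int) -> tuple[str, int]:
    """Generate one (question, ground-truth answer) pair from the template."""
    rng = random.Random(seed)
    name = rng.choice(["Sophie", "Liam", "Ava"])
    a, b = rng.randint(2, 9), rng.randint(2, 9)
    question = TEMPLATE.format(name=name, a=a, b=b)
    return question, a + b  # the answer is computed alongside the question

question, answer = instantiate(seed=0)
print(question)
print("answer:", answer)
```

Because each instantiation carries its own ground-truth answer, a model's accuracy can be compared across variants of the same underlying question, which is how variance between instantiations becomes measurable.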

“Our findings reveal that LLMs exhibit noticeable variance when responding to different instantiations of the same question,” the researchers said, adding that overall, “our work provides a more nuanced understanding of LLMs’ capabilities and limitations in mathematical reasoning”.

–IANS
