Machine Learning Times

ChatGPT’s Performance Is Slipping, New Study Says

 
Originally published in Decrypt, July 19, 2023.

Researchers at Stanford and UC Berkeley found that ChatGPT has not improved over time and may, in fact, have gotten worse.

ChatGPT exploded onto the scene late last year, dazzling people with its human-like conversational abilities, and the release of its latest version prompted a crypto rally and calls for a pause in development. But according to a new study, the leading AI bot’s skills may actually be on the decline.

Researchers at Stanford and UC Berkeley systematically analyzed different versions of ChatGPT from March and June 2023. They developed rigorous benchmarks to evaluate the model’s competency in math, coding, and visual reasoning tasks. The results showed ChatGPT’s performance getting worse over time, not better.

The tests revealed a startling drop-off in performance between versions. On a math challenge that asked the model to determine whether numbers were prime, ChatGPT answered 488 out of 500 questions correctly in March, an accuracy of 97.6%. In June, however, it got only 12 questions right, plunging to 2.4% accuracy.
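As a rough illustration of how such accuracy figures are derived, here is a minimal Python sketch. It is not the researchers' benchmark code: the helper names (`is_prime`, `accuracy`, `score`) and the answer format are assumptions made for illustration only. The last two lines simply reproduce the percentages quoted above from the reported counts.

```python
def is_prime(n: int) -> bool:
    """Ground-truth primality check by trial division."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def accuracy(correct: int, total: int) -> float:
    """Percentage of questions answered correctly."""
    return 100.0 * correct / total


def score(answers: dict[int, bool]) -> float:
    """Score a hypothetical answer sheet mapping each number to the
    model's yes/no claim that the number is prime."""
    correct = sum(1 for n, said_prime in answers.items()
                  if said_prime == is_prime(n))
    return accuracy(correct, len(answers))


# Counts reported in the article (500 questions per version).
print(f"March: {accuracy(488, 500):.1f}%")
print(f"June:  {accuracy(12, 500):.1f}%")
```

A model that answered every question the same way would still score at whatever rate that answer happens to be correct in the question set, so the mix of primes and non-primes in the set matters when reading these numbers.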

