{"id":13483,"date":"2024-04-07T13:37:46","date_gmt":"2024-04-07T17:37:46","guid":{"rendered":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/?p=13483"},"modified":"2024-05-04T11:49:38","modified_gmt":"2024-05-04T15:49:38","slug":"large-language-models-use-a-surprisingly-simple-mechanism-to-retrieve-some-stored-knowledge","status":"publish","type":"post","link":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/large-language-models-use-a-surprisingly-simple-mechanism-to-retrieve-some-stored-knowledge\/13483\/","title":{"rendered":"Large language models use a surprisingly simple mechanism to retrieve some stored knowledge"},"content":{"rendered":"Originally published in MIT News, March 25, 2024 Researchers demonstrate a technique that can be used to probe a model to see what it knows about new subjects. Large language models, such as those that power popular artificial intelligence chatbots like ChatGPT, are incredibly complex. Even though these models are being used as tools in many areas, such as customer support, code generation, and language translation, scientists still don\u2019t fully grasp how they work. They found a surprising result: Large language models (LLMs) often use a very simple linear function to recover and decode stored facts. Moreover, the <a href=\"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/large-language-models-use-a-surprisingly-simple-mechanism-to-retrieve-some-stored-knowledge\/13483\/\" class=\"more-link\">(more&hellip;)<\/a>","protected":false},"excerpt":{"rendered":"<p>Originally published in MIT News, March 25, 2024 Researchers demonstrate a technique that can be used to probe a model to see what it knows about new subjects. Large language models, such as those that power popular artificial intelligence chatbots like ChatGPT, are incredibly complex. Even though these models are being used as tools in [&hellip;]<\/p>\n","protected":false},"author":78,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[11,48],"tags":[],"class_list":["post-13483","post","type-post","status-publish","format-standard","hentry","category-industry-news","category-left-hand"],"_links":{"self":[{"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/posts\/13483","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/users\/78"}],"replies":[{"embeddable":true,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/comments?post=13483"}],"version-history":[{"count":2,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/posts\/13483\/revisions"}],"predecessor-version":[{"id":13485,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/posts\/13483\/revisions\/13485"}],"wp:attachment":[{"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/media?parent=13483"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/categories?post=13483"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/tags?post=13483"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}