{"id":13190,"date":"2023-09-18T08:02:02","date_gmt":"2023-09-18T12:02:02","guid":{"rendered":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/?p=13190"},"modified":"2023-09-18T08:02:02","modified_gmt":"2023-09-18T12:02:02","slug":"medusa-simple-framework-for-accelerating-llm-generation-with-multiple-decoding-heads","status":"publish","type":"post","link":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/medusa-simple-framework-for-accelerating-llm-generation-with-multiple-decoding-heads\/13190\/","title":{"rendered":"Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads"},"content":{"rendered":"Originally published in together.ai, Sept 11, 2023. Large Language Models (LLMs) have changed the world. However, generating text with them can be slow and expensive. Methods such as speculative decoding have been proposed to speed up generation, but their complexity has left many in the open-source community hesitant to adopt them. That&#8217;s why we&#8217;re thrilled to unveil Medusa: a simpler, more user-friendly framework for accelerating LLM generation. Instead of using a separate draft model, as speculative decoding does, Medusa adds only a few extra decoding heads, following the idea of [Stern et al. 2018] with some other ingredients. <a href=\"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/medusa-simple-framework-for-accelerating-llm-generation-with-multiple-decoding-heads\/13190\/\" class=\"more-link\">(more&hellip;)<\/a>","protected":false},"excerpt":{"rendered":"<p>Originally published in together.ai, Sept 11, 2023. Large Language Models (LLMs) have changed the world. However, generating text with them can be slow and expensive. Methods such as speculative decoding have been proposed to speed up generation, but their complexity has left many in the open-source community hesitant to adopt them. 
That&#8217;s why we&#8217;re [&hellip;]<\/p>\n","protected":false},"author":72,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[11,48],"tags":[879,368,791,1324,1267,1268,243,1074],"class_list":["post-13190","post","type-post","status-publish","format-standard","hentry","category-industry-news","category-left-hand","tag-ai","tag-artificial-intelligence","tag-deep-learning","tag-deep-learning-analytics","tag-large-language-models","tag-llm","tag-machine-learning","tag-machine-learning-analytics"],"_links":{"self":[{"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/posts\/13190","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/users\/72"}],"replies":[{"embeddable":true,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/comments?post=13190"}],"version-history":[{"count":1,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/posts\/13190\/revisions"}],"predecessor-version":[{"id":13191,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/posts\/13190\/revisions\/13191"}],"wp:attachment":[{"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/media?parent=13190"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/categories?post=13190"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.predictiveanalyticsworld.c
om\/machinelearningtimes\/wp-json\/wp\/v2\/tags?post=13190"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}