Machine Learning Times


Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Originally published on together.ai, Sept 11, 2023. Large Language Models (LLMs) have changed the world, but generating text with them can be slow and expensive. Methods such as speculative decoding have been proposed to speed up generation, yet their complexity has left many in the open-source community hesitant to adopt them. That’s why we’re