{"id":13637,"date":"2024-11-08T04:03:08","date_gmt":"2024-11-08T09:03:08","guid":{"rendered":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/?p=13637"},"modified":"2024-11-08T07:25:41","modified_gmt":"2024-11-08T12:25:41","slug":"nvidia-improves-metas-llama-model-with-new-training-approach","status":"publish","type":"post","link":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/nvidia-improves-metas-llama-model-with-new-training-approach\/13637\/","title":{"rendered":"Nvidia improves Meta&#8217;s Llama model with new training approach"},"content":{"rendered":"Originally published in the-decoder.com, Oct 18, 2024. Nvidia has introduced a new large language model that outperforms others in alignment benchmarks. The company achieved this through a special training procedure combining evaluation and preference models. The new model, called Llama-3.1-Nemotron-70B-Instruct, is based on Meta&#8217;s open-source Llama 3.1 model. Nvidia optimized it to provide helpful answers to user queries by combining different training methods. However, the results only show that the answers align better with human preferences, not that the content is necessarily more accurate. In fact, the Nemotron variant performs slightly worse than the base model on the <a href=\"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/nvidia-improves-metas-llama-model-with-new-training-approach\/13637\/\" class=\"more-link\">(more&hellip;)<\/a>","protected":false},"excerpt":{"rendered":"<p>Originally published in the-decoder.com, Oct 18, 2024. Nvidia has introduced a new large language model that outperforms others in alignment benchmarks. The company achieved this through a special training procedure combining evaluation and preference models. The new model, called Llama-3.1-Nemotron-70B-Instruct, is based on Meta&#8217;s open-source Llama 3.1 model. Nvidia optimized it to provide helpful answers [&hellip;]<\/p>\n","protected":false},"author":13468,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[11,48],"tags":[],"class_list":["post-13637","post","type-post","status-publish","format-standard","hentry","category-industry-news","category-left-hand"],"_links":{"self":[{"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/posts\/13637","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/users\/13468"}],"replies":[{"embeddable":true,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/comments?post=13637"}],"version-history":[{"count":9,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/posts\/13637\/revisions"}],"predecessor-version":[{"id":13671,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/posts\/13637\/revisions\/13671"}],"wp:attachment":[{"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/media?parent=13637"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/categories?post=13637"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.predictiveanalyticsworld.com\/machinelearningtimes\/wp-json\/wp\/v2\/tags?post=13637"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}