Training Large Language Models in the Open

Leandro Von Werra | Friday, May 17, 2024 | Gurten Pavillon


This talk covers the main aspects and challenges of training large language models (LLMs), from gathering data at scale and optimizing training performance on large GPU clusters to running meaningful evaluations. In addition to the technical aspects, you will also get an overview of the governance of such an endeavour and how such models can be built and released to benefit all of society.


Joshua Starmer
Leandro Von Werra
Machine Learning Engineer at Hugging Face