The Machine Learning Engineering Open Book – An open collection of methodologies

An open collection of methodologies to help with successful training of large language models and multi-modal models.

This is technical material suitable for LLM/VLM training engineers and operators. That is, the content here contains lots of scripts and copy-n-paste commands to enable you to quickly address your needs.

This repo is an ongoing brain dump of my experiences training Large Language Models (LLMs) and VLMs; I acquired a lot of the know-how while training the open-source BLOOM-176B model in 2022 and the IDEFICS-80B multi-modal model in 2023. Currently, I’m working on developing/training open-source Retrieval Augmented models at Contextual.AI.

I’ve been compiling this information mostly for myself so that I can quickly find solutions that I have already researched and that have worked, but as usual I’m happy to share these with the wider ML community.

The Machine Learning Engineering Open Book on GitHub: https://github.com/stas00/ml-engineering?tab=readme-ov-file
Platform: Machine Learning
⭐️: 5.6K