A Hackers’ Guide to Language Models

In this deeply informative video, Jeremy Howard, co-founder of fast.ai and creator of the ULMFiT approach on which all modern language models (LMs) are based, takes you on a comprehensive journey through the fascinating landscape of LMs. Starting with the foundational concepts, Jeremy introduces the architecture and mechanics that make these AI systems tick. He then delves into critical evaluations of GPT-4, illuminates practical uses of language models in code writing and data analysis, and offers hands-on tips for working with the OpenAI API. The video also provides expert guidance on technical topics such as fine-tuning, decoding tokens, and running private instances of GPT models.

As we move further into the intricacies, Jeremy unpacks advanced strategies for model testing and optimization, utilizing tools like GPTQ and Hugging Face Transformers. He also explores the potential of specialized datasets like Orca and Platypus for fine-tuning and discusses cutting-edge trends in Retrieval Augmented Generation and information retrieval. Whether you’re new to the field or an established professional, this presentation offers a wealth of insights to help you navigate the ever-evolving world of language models.


  • 00:00:00 Introduction & Basic Ideas of Language Models
  • 00:18:05 Limitations & Capabilities of GPT-4
  • 00:31:28 AI Applications in Code Writing, Data Analysis & OCR
  • 00:38:50 Practical Tips on Using OpenAI API
  • 00:46:36 Creating a Code Interpreter with Function Calling
  • 00:51:57 Using Local Language Models & GPU Options
  • 00:59:33 Fine-Tuning Models & Decoding Tokens
  • 01:05:37 Testing & Optimizing Models
  • 01:10:32 Retrieval Augmented Generation
  • 01:20:08 Fine-Tuning Models
  • 01:26:00 Running Models on Macs
  • 01:27:42 Llama.cpp & Its Cross-Platform Abilities